Your supervisor has asked you to draft documents that the stakeholders can review prior to your presentation. You decide to clarify the purpose of the embedded assessment for proposed program improvements.
Draft a 525- to 700-word memo to the stakeholders in which you describe the need, intent, goals, and objectives of the evaluation plan you wish to see implemented.
Provide your statement of purpose. Include your vision, mission, and goals. Answer the following questions:
- What key questions need to be addressed?
- What evidence of accomplishment do you seek?
- Who are the stakeholders?
Provide 1 or 2 examples of the evaluation methods (described in Chapter 8 of the textbook; see the attached document below) that you would like to see incorporated.
- What is your rationale for selecting these?
- What are the financial and human resources required to strengthen the design of the evaluation?
- From which stakeholders can you acquire the most impactful guidance?
evaluation design will still yield useful information. Yet a strong logic model within a rigorous evaluation design will enable much stronger conclusions regarding program effectiveness and impact. As you have likely surmised, a weak logic model within a strong evaluation design provides little useful information, just as an unreadable treasure map within a sturdy home brings you no closer to the treasure. That said, in this section you will add strength and depth to your logic model by continuing to build upon the evaluation matrix you began in Chapter 7. Methods and tools will be identified or developed for each indicator on your logic model, addressing the question, How will you collect your data?
Although there are many evaluation methods, most are classified as qualitative, quantitative, or both. Qualitative methods rely primarily on noncategorical, free response, observational, or narrative descriptions of a program, collected through methods such as open-ended survey items, interviews, or observations. Quantitative methods, on the other hand, rely primarily on discrete categories, such as counts, numbers, and multiple-choice responses. Qualitative and quantitative methods reinforce each other in an evaluation, as qualitative data can help to describe, illuminate, and provide a depth of understanding to quantitative findings. For this reason, you may want to choose an evaluation design that includes a combination of qualitative and quantitative methods, commonly referred to as mixed methods. Some common evaluation methods are discussed below and include assessments and tests, surveys and questionnaires, interviews and focus groups, observations, existing data, portfolios, and case studies. Rubrics are also included as an evaluation tool that is often used to score, categorize, or code interviews, observations, portfolios, qualitative assessments, and case studies.
Qualitative methods: evaluation methods that rely on noncategorical data and free response, observational, or narrative descriptions.
Quantitative methods: evaluation methods that rely on categorical or numerical data.
Mixed methods: evaluation methods that rely on both quantitative and qualitative data.
Before delving into different methods, it is worth mentioning the ways in which the terms assessment and survey are sometimes used and misused. First, while the term “survey” is sometimes used synonymously with “evaluation,” evaluation does not mean survey. A survey is a tool that can be used in an evaluation, and perhaps one of the most common, but it is just one tool nonetheless.
Another terminology confusion is between “assessment” and “evaluation.” These too are often used interchangeably. However, many in the field of evaluation would argue that assessment has a quantitative connotation, while evaluation can be mixed method.
Similarly, the term “measurement” is often used synonymously with “assessment,” and measurement too has a quantitative connotation. I believe the confusion lies in the terms “assess,” “evaluate,” and “measure”; they are synonyms. So, it only makes sense that assessment and evaluation, and sometimes measurement, are used synonymously. And while there is nothing inherently wrong with using these terms interchangeably, it is a good idea to ask for clarification when the terms assessment and measurement are used. Some major funders use the term “assessment plan” to mean “evaluation plan,” but others may use the term assessment as an indication that they would like quantitative measurement. The takeaway from this is to communicate with stakeholders such that the evaluation (or assessment) you design meets their information needs and expectations.
8.2.1 Qualitative Methods
Qualitative methods focus on noncategorical, observational, or narrative data. Evaluation using qualitative methods is primarily inductive, in that data are collected and examined for patterns. These patterns are then used to make generalizations and formulate hypotheses based on these generalizations. Qualitative methods include interviews and focus groups, observations, some types of existing data, portfolios, and case studies. Each method is described in the following paragraphs.
Interviews and focus groups (qualitative) are typically conducted face-to-face or over the phone. We also conduct individual interviews using video conferencing software. Focus groups are group interviews and can also be conducted using video conferencing software, but I have found it is difficult to maintain the richness of discussion found in face-to-face focus groups when conducted using video. However, I have no doubt that as we become more skilled at facilitating group discussions among individuals in varied locations, video focus groups will become an invaluable mode of research. The list of interview and focus group questions is referred to as a protocol; an interview protocol can be created with questions to address your specific information needs. The interviewer can use follow-up questions and probes as necessary to clarify responses. However, interviews and focus groups take time to conduct and analyze. Due to their time-consuming nature, sample sizes are typically small and costs can be high. See Interviews in Qualitative Research (King, Horrocks, & Brooks, 2018) and Focus Groups (Krueger & Casey, 2014) for more information on designing and conducting interviews and focus groups.
Observations (usually qualitative but can be quantitative) can be used to collect information about people’s behavior, such as teachers’ classroom instruction or students’ active engagement. Observations can be scored using a rubric or through theme-based analyses, and multiple observations are necessary to ensure that findings are grounded. Because of this, observational techniques tend to be time-consuming and expensive, but they can provide an extremely rich description of program implementation. See the observation section of the Robert Wood Johnson Foundation’s Qualitative Research Guidelines Project (Cohen & Crabtree, 2006) for more information and a list of resources on using observation in research.
Existing data (usually quantitative but can be qualitative) are often overlooked but can be an excellent and readily available source of evaluation information. Using existing data such as school records (e.g., student grades, test scores, graduation rate, truancy data, and behavioral infractions), work samples, and lesson plans, as well as documentation regarding school or district policy and procedures, minimizes the data collection burden. However, despite the availability and convenience, you should critically examine the quality of existing data and whether they meet your evaluation needs.
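As a rough illustration of how existing records might be pulled into an evaluation, the sketch below loads a hypothetical export of school records and summarizes two indicators by cohort year. The file name and column names are assumptions made for the example; they are not part of the chapter, and any real data set would need the quality check described above.

```python
import pandas as pd

# Hypothetical export of school records; the file and the columns
# "cohort_year", "graduated", and "truancy_days" are illustrative only.
records = pd.read_csv("student_records.csv")

# Summarize two indicators that might appear on a logic model:
# graduation rate and average days truant, by cohort year.
summary = records.groupby("cohort_year").agg(
    graduation_rate=("graduated", "mean"),
    avg_truancy_days=("truancy_days", "mean"),
    n_students=("graduated", "size"),
)
print(summary.round(2))
```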
Portfolios (typically qualitative) are collections of work samples and can be used to examine the progress of the program’s participants throughout the program’s operation. Work samples from before (pre) and after (post) program implementation can be compared and scored using rubrics to measure growth. Portfolios can show tangible and powerful evidence of growth and can be used as concrete examples when reporting program results. However, scoring can be subjective and is highly dependent upon the strength of the rubric and the training of the portfolio scorers; in addition, the use of rubrics in research can be very resource intensive (Herman & Winters, 1994).
Case studies (mostly qualitative but can include quantitative data) are in-depth examinations of a person, group of people, or context. Case studies can include a combination of any of the methods reviewed above. Case studies look at the big picture and investigate the interrelationships among data. For instance, a case study of a school might include interviews with teachers and parents, observations in the classroom, student surveys, student work, and test scores. Combining many methods into a case study can provide a rich picture of how a program is used, where a program might be improved, and any variation in findings from using different methods. Using multiple, mixed methods in an evaluation allows for a deeper understanding of a program, as well as a more accurate picture of how a program operates and its successes. See Yin (2017) for more information on case study research.
8.2.2 Quantitative Methods
Quantitative methods focus on categorical or numerical data. Evaluation based on quantitative data is primarily deductive, in that it begins with a hypothesis and uses the data to make specific conclusions. Quantitative methods include assessments and tests, as well as surveys and questionnaires, and some types of existing data. Each method is described in the following paragraphs.
Assessments and tests (typically quantitative but can include qualitative items) are often used prior to program implementation (pre) and again at program completion (post), or at various times during program implementation, to assess program progress and results. Assessments are also referred to as tests or instruments. Results of assessments are usually objective, and multiple items can be used in combination to create a subscale, often providing a more reliable estimate than any single item (see Wright, 2007). If a program is intended to decrease depression or improve self-confidence, you will likely want to use an existing assessment that measures depression or self-confidence. If you want to measure knowledge of organizational policies, you may decide to create a test based on the policies specific to the organization.
However, before using assessment or test data, you should be sure that the assessment adequately addresses what the program intends to achieve. You would not want the success or failure of the program to be determined by an assessment that does not accurately measure the program’s outcomes.
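To make the pre/post idea concrete, here is a minimal sketch of how several assessment items might be averaged into a subscale and compared before and after a program. The respondents, item names, and scores are invented; the only point carried over from the chapter is that a multi-item subscale usually gives a more reliable estimate than any single item.

```python
import pandas as pd

# Hypothetical pre and post scores on three self-confidence items (1-5 scale).
pre = pd.DataFrame({"item1": [2, 3, 1], "item2": [3, 2, 2], "item3": [2, 3, 2]})
post = pd.DataFrame({"item1": [4, 4, 3], "item2": [4, 3, 4], "item3": [3, 4, 3]})

# Average the items into one subscale score per respondent.
pre_scale = pre.mean(axis=1)
post_scale = post.mean(axis=1)

# Simple pre/post comparison of the subscale.
print(f"Pre mean:  {pre_scale.mean():.2f}")
print(f"Post mean: {post_scale.mean():.2f}")
print(f"Mean gain: {(post_scale - pre_scale).mean():.2f}")
```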
The reliability and validity of an instrument are important considerations when selecting and using instruments such as assessments and tests (as well as surveys and questionnaires). Reliability is the consistency with which an instrument measures whatever it intends to measure. There are three common types of reliability: internal consistency reliability, test–retest reliability, and inter-rater reliability. See Figure 8.2 for a description of each type of reliability.
Reliability: the consistency with which an instrument measures something.
Validity is the accuracy with which an instrument measures a construct. The construct might be anxiety, aptitude, achievement, alcoholism, or self-confidence. There are four types of validity: content validity, construct validity, criterion-related validity, and consequential validity. See Figure 8.2 for more information on each type of validity.
Validity: the accuracy with which an instrument measures a construct.
Figure 8.2 Reliability and Validity
When choosing an assessment or creating your own instrument, you should investigate the technical qualities of reliability and validity to be sure the test is consistent in its measurement and to verify that it does indeed measure what you need to measure. Further, taking a subset of items from a validated instrument to create a new instrument does in fact create a new instrument, with untested reliability and validity. Results from an instrument that is not valid are, in turn, not valid. That is, using an instrument that has not been validated through the examination of reliability and validity can result in erroneous and costly decisions being made based upon those data.
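One way to make these technical checks concrete is to compute reliability statistics from pilot data. The sketch below uses two statistics commonly associated with the reliability types named in Figure 8.2, Cronbach's alpha for internal consistency and a Pearson correlation for test–retest reliability; the data are invented, and the choice of these particular statistics is my assumption rather than something the chapter prescribes.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency: rows are respondents, columns are items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented pilot responses: 5 respondents x 4 items on a 1-5 scale.
responses = np.array([
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 2],
    [4, 5, 4, 4],
])
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")

# Test-retest reliability: correlate total scores from two administrations
# (the second administration here is likewise invented).
time1 = responses.sum(axis=1)
time2 = time1 + np.array([0, 1, -1, 0, 1])
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest correlation: {r:.2f}")
```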
Surveys and questionnaires (typically quantitative but can include qualitative items) are often used to collect information from large numbers of respondents. They can be administered online, on paper, in person, or over the phone. In order for surveys to provide useful information, the questions must be worded clearly and succinctly. Survey items can be open-ended or closed-ended.
Open-ended survey items allow respondents to provide free-form responses to questions and are typically scored using a rubric. A rubric is a scoring guide used to categorize text-based or observational information based upon set criteria or elements of performance. See Figure 8.3 for more information on rubrics. Closed-ended items give the respondent a choice of responses, often on a scale from 1 to 4 or 1 to 5. Surveys can be quickly administered, are usually easy to analyze, and can be adapted to fit specific situations.
Rubric: a guideline that can be used objectively to examine subjective data.
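To illustrate what a rubric can look like as a scoring guide, the sketch below encodes a small, invented rubric for an open-ended survey item and averages one rater's criterion-level judgments into a single score. The criteria and performance levels are hypothetical and are not taken from the chapter or Figure 8.3.

```python
# A small, invented rubric: each criterion is scored on the same 1-3 scale.
RUBRIC = {
    "clarity":   {1: "unclear", 2: "partially clear", 3: "clear and specific"},
    "evidence":  {1: "no examples", 2: "one example", 3: "multiple examples"},
    "relevance": {1: "off topic", 2: "loosely related", 3: "directly addresses item"},
}

def score_response(ratings: dict) -> float:
    """Average a rater's criterion-level scores into one rubric score."""
    for criterion, level in ratings.items():
        if criterion not in RUBRIC or level not in RUBRIC[criterion]:
            raise ValueError(f"Invalid rating: {criterion}={level}")
    return sum(ratings.values()) / len(ratings)

# One rater's judgments for a single open-ended response.
print(f"{score_response({'clarity': 3, 'evidence': 2, 'relevance': 3}):.2f}")
```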
Building a survey in conjunction with other methods and tools can help you to understand your findings better. For instance, designing a survey to explore findings from observations or document reviews can enable you to compare findings among multiple sources. Validating your findings using multiple methods gives the evaluator more confidence regarding evaluation findings.
Figure 8.3 Scoring Rubrics
Using a previously administered survey can save you time, may give you something to compare your results to (if previous results are available), and may give you confidence that some of the potential problems have already been addressed. Two notes of caution, however, in using surveys that others have developed: (1) Be sure the instrument has been tested and demonstrated to be reliable and valid for the intended population, and (2) be sure the survey addresses your evaluation needs. It is tempting to use an already developed survey without thinking critically about whether it will truly answer your evaluation questions. Existing surveys may need to be adapted to fit your specific needs.
See Survey Research Methods (Fowler, 2013) for more information on designing, administering, and analyzing surveys.
8.2.3 Mixed Methods
Mixed-method studies combine both qualitative and quantitative methods. For example, an evaluation of a program intended to increase the retention of faculty from underrepresented groups in the STEM fields (science, technology, engineering, and math) might