Why are control and comparison groups important in impact evaluations?
By Matteo Vergani
Introduction
This module is intended to assist governmental and non-governmental stakeholders in building a shared culture of understanding around the impact evaluation of programs aiming to tackle hatred and violent extremism. For step-by-step guides to conducting evaluations in the field of countering violent extremism, we recommend the following resources.
They offer an important resource for practitioners who need a practical guide to conducting evaluations. However, they sometimes describe similar concepts in slightly different terms, which can confuse readers. They also take for granted key cultural assumptions about what stakeholders think of impact evaluations and about what it is realistic to expect from them in the field of tackling hatred and extremism. In this module, we provide a combination of theoretical and practical knowledge to complement the use of these resources.
In this module
In this module, we will summarise theoretical notions and arguments used in the literature to explain how to design impact evaluations of programs aiming at tackling hatred and violent extremism. We will discuss definitions and the usefulness of impact evaluations, and we will look at how to develop a basic program logic, how to design an impact evaluation and how to make sense of the data. The final section contains a small library of measurement tools, which is particularly important because we imagine it as a first step towards building cumulative knowledge about what works in different local contexts.
What are impact evaluations and why are they useful?
Evaluations are structured analyses of quantitative and/or qualitative data that use clear criteria and standards to answer questions about the effectiveness, efficiency or appropriateness of intervention programs (to learn more about what intervention programs are, check out this resource).
The term evaluation usually covers a broad range of activities, from internal reviews of individual activities to formal program evaluations and whole-of-government strategy evaluations. In this module, we look at impact evaluations, which examine whether programs achieved their aims and results and whether they brought about the desired change in the target populations.
Check out the following resources to get some ideas about the methods and tools that can help you in thinking about how to evaluate the impact of a program aiming at tackling hatred and violent extremism.
In the next video, you’ll find a discussion of the different types of evaluations, and an explanation about why impact evaluations are of vital importance.
Is our program reducing violent extremism and hatred?
How do we know if our program is reducing the overall levels of hatred and violent extremism? The reality is that it is very hard, often impossible, to measure trends in hatred and violent extremism at a given time and place because of gaps in how these phenomena are collected and measured. An alternative approach is to think about whether our program changed its own target population, which is the focus of the rest of this module.
The following video explains how to uncover the assumptions underpinning a program aiming at tackling hatred and violent extremism.
Drawing a program logic
Drawing a program logic means identifying the specific and measurable aims of your program. Program logics are graphical descriptions of a program's objectives and of how those objectives will be achieved. It is essential to know what the goals of the program are in order to assess whether the program has achieved them. This makes sense intuitively: if we do not know where we wanted to go, it is impossible to assess whether we have arrived at the right place and whether we have taken the best route to get there.
First, program logics need to specify outcomes: for example, an education program about Indigenous Australians might counter negative stereotypes to decrease levels of prejudice and racism. Decreasing levels of prejudice and racism would be the outcome of the program. These two elements (prejudice and racism) must be measurable, for example by collecting questionnaire data from the people who attend the education program and asking questions about their prejudice and racism towards Indigenous Australians.
Second, program logics have to detail the resources needed for the program. Where will the education program be conducted? In which physical spaces? How many staff members and what equipment will be needed? And how much funding is needed for these resources?
Third, program logics need to list all the activities included in the program. For example, you might plan to conduct five face-to-face workshops in a classroom, then you might want to show a movie about Indigenous Australians and discuss the movie with the director and, finally, you might want to organise a heritage walk and learn about a traditional land ceremony with an Indigenous guide. All these activities must be listed in the program logic.
Finally, it is very important to specify the target population of your program, especially if you will use your program logic to design the program’s impact evaluation. What is the age group? Are there any quotas based on gender or particular ethnicities? For example, you might want to target young people aged 18 to 25, 50% female and 50% Australian-born.
There are different opinions regarding the key components of program logics. We suggest exploring the four evaluation toolkits displayed at the beginning of this module and checking the different types of components they include in their program logics. Interestingly, you will notice that they are all different. In this module, we suggest a simplified version of a program logic with only four components: outcomes, resources, activities and target population. They will be the backbone of your program evaluation.
Once the components of the program logic have been identified, it is necessary to draw logical arrows between them and to connect them in a “chain of causality”. Each element within each component (for example, each activity, each resource, each group in the target population, each outcome) needs to be connected to another with logical arrows, as in the example below. The result will be a synthetic graphical description of how the program is supposed to work and of how its objectives will be achieved. Developing a program logic model is useful for many reasons: it helps to clearly communicate the project, it is a good management tool and it helps in replicating the programs. And, most importantly for us, it allows the evaluation of the impact of a program.
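To illustrate, here is a minimal sketch of a simplified program logic for the hypothetical education program about Indigenous Australians described above. A real program logic would also draw the arrows connecting each individual element in the chain of causality.

    Target population: young people aged 18 to 25, 50% female and 50% Australian-born
    Resources:         a classroom, staff members, equipment (e.g. for the movie screening), funding
    Activities:        five face-to-face workshops -> movie screening and discussion with the director -> heritage walk and traditional land ceremony with an Indigenous guide
    Outcomes:          decreased prejudice and racism towards Indigenous Australians, measured with pre/post questionnaires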
Please find here a few useful resources to guide you in developing a program logic, provided by:
Designing the evaluation: pre/post measurement and control groups
Evaluations should always consider, when possible, collecting data from program participants before the program starts and after it ends. This is very important because it allows you to detect whether your program caused any changes among the participants. This method is commonly referred to as a pre/post intervention impact evaluation. When possible, we also suggest collecting pre and post intervention data from a group that did not participate in the program but that is similar in composition to the participating group (also called the “intervention group”). This group of people not participating in the program is called the control (or comparison) group.
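As a minimal sketch of this logic, using entirely invented numbers, the Python snippet below shows why the control group matters: the program's estimated effect is the change among participants over and above the change observed in the comparison group.

    # All numbers below are hypothetical and for illustration only.
    # Scores are average prejudice levels on a 1-5 scale (higher = more prejudice).
    intervention_pre, intervention_post = 3.8, 3.1   # program participants
    control_pre, control_post = 3.7, 3.6             # comparison group

    change_intervention = intervention_post - intervention_pre   # -0.7
    change_control = control_post - control_pre                  # -0.1

    # Estimated program effect: participants' change minus the change that
    # happened anyway in the comparison group.
    estimated_effect = change_intervention - change_control      # -0.6
    print(f"Estimated effect on prejudice: {estimated_effect:.1f}")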
This recent report provides a great example of evaluations of community projects aimed at addressing hate crime in the UK. It describes how sensitive content was delivered and what evaluation methods and designs were used, and provides an overview of the results.
The next video explains why it’s so important to have control groups in impact evaluations.
Making sense of the data
Impact evaluations aim to capture whether the program achieved the desired change in the target population. Going back to the program logic, you need to have your outcomes clearly framed before you decide how you want to measure them. For example, if your objective is to reduce prejudice towards Indigenous Australians, you will have to measure levels of prejudice among your target population. How do you do that?
Broadly, we can identify two types of outcome measurement tools: qualitative and quantitative. Qualitative tools aim to understand why there was (or was not) a change in participants’ perceptions; interviews and focus groups are the most common qualitative tools. Quantitative tools aim to measure average levels of attitudes and behaviours in your target population. To do this, most researchers use Likert scales, which typically have 5, 7 or 10 points that allow respondents to express how much they agree or disagree with a particular statement.
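For example, here is a minimal sketch in Python of how a 5-point Likert scale might be scored. The items, the answers and the decision about which items are reverse-coded are purely hypothetical.

    # One hypothetical respondent's answers, from 1 (strongly disagree) to 5 (strongly agree).
    answers = {
        "item_1": 4,   # negatively worded statement
        "item_2": 2,   # positively worded statement, so it is reverse-coded
        "item_3": 5,   # negatively worded statement
    }
    reverse_coded = {"item_2"}

    # Reverse-code positively worded items so that higher scores always mean more
    # prejudice, then average across items to obtain the respondent's scale score.
    scored = [6 - value if item in reverse_coded else value for item, value in answers.items()]
    scale_score = sum(scored) / len(scored)   # (4 + 4 + 5) / 3 = 4.33
    print(f"Prejudice scale score: {scale_score:.2f}")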
You do not have to come up with your own statements: many researchers before you have already examined the best way to ask questions, and they have created lists of statements that can help you to measure prejudice towards out-groups, support for violence, alienation, meaninglessness, a lack of agency, resilience to violent extremism and many other important constructs. You can find a small list of these scales at the end of this module.
The next video has some tips for collecting data (especially quantitative data) for the purpose of assessing the impact of a program aiming at tackling hatred and violent extremism.
Once you have collected the data, you need to analyse them. To analyse interview and focus group data, the easiest way is to identify the recurring themes in what participants say. Themes are recurring sentences, or ideas, that are associated with a specific outcome measure. To analyse pre/post intervention quantitative data, the easiest way is to determine the levels of your outcome measure before and after the intervention and to check whether there was any change. The participants in your program and the comparison group must be assessed separately. It is always better to ask a researcher to analyse the data. Collaboration between universities and community organisations can open up great opportunities: researchers can apply their knowledge to real-world problems and use new data for teaching and research purposes (when possible), and community organisations can evaluate the impact of their programs with more precision.
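To illustrate the pre/post quantitative analysis described above, here is a minimal sketch in Python. The file name and column names are hypothetical, and the paired t-test is just one common way of checking whether the pre/post change among participants is larger than chance alone would produce.

    import pandas as pd
    from scipy import stats

    # Hypothetical file with one row per respondent: group, pre_score, post_score.
    df = pd.read_csv("evaluation_scores.csv")

    # Average outcome levels before and after the program, kept separate for the
    # participants ("intervention") and for the comparison ("control") group.
    print(df.groupby("group")[["pre_score", "post_score"]].mean())

    # Paired t-test on the participants' pre and post scores.
    participants = df[df["group"] == "intervention"]
    t_stat, p_value = stats.ttest_rel(participants["pre_score"], participants["post_score"])
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")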
Existing toolkits provide some guidance on analysing data: see, for example, Appendix B of RAND’s toolkit (p. 105), titled “Analyze Your Program’s Evaluation Data”.
A library of tools to measure outcomes related to hatred and violent extremism
In order to build cumulative knowledge about what works in preventing violent extremism, it is necessary to accumulate evidence about what does and does not work in specific local contexts. Policymakers and practitioners need this evidence to inform their decisions and the allocation of (often scarce) resources. However, the current lack of consistency in the measurement tools used to assess the impact of programs prevents us from accurately comparing outcomes and understanding what works, where and when.
It is crucial to establish a common set of resources to assess the impact of programs addressing hatred and violent extremism. These resources are meant to assess attributes that the literature considers risk factors for hatred and violent extremism and that are often the outcome measures of prevention programs.
Here, we provide a short list of measurement tools. We welcome the opportunity to expand this library with your suggestions.
Conspiratorial mindset
Critical thinking
Moral disengagement
Negative views towards out-groups
Psychological vulnerabilities
Religious fundamentalism
Resilience
Support for extremist views
Support for engagement in violent political action
Trust