Can AI chatbots support P/CVE evaluations?

Check out this new article by Irina van der Vet and Leena Malkki: Copilot in Service: Exploring the Potential of Large Language Model-Based Chatbots for Fostering Evaluation Culture in Preventing and Countering Violent Extremism.

This study explores whether large language model (LLM)-based chatbots, specifically Microsoft Copilot, can support professionals working in preventing and countering violent extremism (P/CVE). In particular, it examines whether chatbots can function as recommender systems that assist practitioners with evaluation processes, increase their knowledge of evidence-based evaluation, and provide practical guidance, all while acknowledging the inherent limitations of such technology.

The study examined Copilot’s responses to 50 pre-designed prompts related to evaluation in the P/CVE field. The qualitative analysis assessed three key criteria:

  • Accuracy and reliability – Whether Copilot’s responses were factually correct and reliable.
  • Relevance and integrity – Whether the responses addressed specific evaluation needs in the P/CVE context.
  • Readability and comprehensibility – Whether the responses were structured and accessible for practitioners with varying levels of expertise.

Copilot generated responses that were largely accurate, well-structured, and relevant to evaluation in P/CVE. The chatbot offered practical recommendations and step-by-step guidance, making it a useful tool for initiating and deepening knowledge of evidence-based evaluation. It was also able to offer encouragement and a degree of emotional support, which may help professionals overcome hesitations around evaluation. At the same time, the study identified limitations, including occasional bias, data security concerns, and a lack of source citations, which make professional oversight necessary when chatbot output informs decision-making. Scenario-based queries showed that while Copilot could provide useful general recommendations, it struggled to tailor responses to specific geographical and organisational contexts.

LLM-based chatbots have the potential to strengthen evaluation culture in P/CVE by offering accessible guidance and promoting knowledge-sharing. The article proposes that policymakers and organisations consider integrating AI chatbots into training and capacity-building initiatives, while ensuring professional oversight to mitigate risks related to bias and misinformation.

Further research is needed to enhance chatbot reliability, particularly by refining their ability to handle sensitive data and tailoring responses to specific practitioner needs. Ethical concerns, including data privacy and bias mitigation, should be addressed before integrating AI-based tools into critical decision-making processes in P/CVE.

This study highlights the growing role of AI-driven tools in professional practice, demonstrating both their potential benefits and the need for careful implementation in complex fields like P/CVE.