How to Implement Advanced RAG Using AI Search in Copilot Studio: A CRAG Tutorial

Having previously built a “Self-RAG” in Copilot Studio, this time I’ll implement “CRAG.”

CRAG

CRAG (Corrective Retrieval Augmented Generation) is a RAG technique proposed around February 2024 that offers the advantage of reducing hallucinations compared to traditional RAG (Retrieval Augmented Generation).
The key characteristic of CRAG is the use of a Retrieval Evaluator to assess the relevance between retrieved documents and the user’s question.

This retrieval evaluation classifies documents into three categories: “Correct (highly relevant),” “Incorrect (low relevance),” and “Ambiguous (difficult to determine),” and takes the following actions for each:

  • Correct (highly relevant): Generate responses using the documents
  • Incorrect (low relevance): Generate responses from web searches without using the retrieved documents
  • Ambiguous (difficult to determine): Generate responses using both the retrieved documents and web search results
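
In Copilot Studio, this three-way branch comes down to a condition on the evaluator’s relevance score. A minimal Power Fx sketch, assuming the highest score is held in a hypothetical variable Topic.MaxScore and using the 0.8 / 0.4 thresholds adopted later in this article:

If(
    Topic.MaxScore >= 0.8, "Correct",    // generate from the retrieved documents
    Topic.MaxScore >= 0.4, "Ambiguous",  // combine retrieved documents and web search
    "Incorrect"                          // fall back to web search only
)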

Building CRAG in Copilot Studio

Here’s the flow we’ll be implementing. We’ll use GPT-4o for the retrieval evaluation and SerpAPI for web searches.
Diagram showing the CRAG implementation flow with retrieval evaluation steps

Implementation

Since our goal is simply to demonstrate how to build CRAG in Copilot Studio, we won’t focus on optimizing accuracy (such as refining prompts).

SharePoint Search

First, we’ll search for documents from SharePoint.
Screenshot showing SharePoint search configuration in Copilot Studio
We’ll use a prompt action to convert the user’s question into search terms.
Screenshot showing prompt action configuration for converting questions to search terms
The conversion prompt is reused from the previous example.
*Note: “Excluded Keywords” remains here only because we reused the prompt from the previous (Self-RAG) implementation. It’s not actually necessary for this purpose.
Screenshot showing the conversion prompt configuration with excluded keywords field
Here are examples of user questions and their corresponding search queries. Use these examples to guide your transformation of the input question.
Example Input 1: "How to connect to a database with Python?"
Example Output: "Python database connection method"
Example Input 2: "What impact does climate change have on ecosystems?"
Example Output: "Climate change ecosystem impact"

Additionally, if certain words are specified as "excluded keywords," ensure that these words are NOT included in the generated search query.
For example:
Excluded Keywords: ["Python connection"]
Example Input: "How to connect to a database with Python?"
Example Output: "Database connection method"

Now, based on these examples and the excluded keywords, transform the following user question into an effective search query, avoiding the specified excluded keywords.
Input: {input}
Excluded Keywords: {excluded_keywords}

Using the generated search terms, we’ll search the SharePoint documents via AI Search.
Screenshot showing AI Search configuration to search SharePoint documents
Screenshot showing additional AI Search configuration settings

Document Relevance Evaluation

Next, we’ll score the relevance between the searched documents and the user’s question.
Diagram showing the document relevance evaluation process
First, run a Foreach loop over all the documents in the search results,
Screenshot showing Foreach loop configuration for processing search results
Within the loop, use a prompt action to determine the relevance score (0.0 to 1.0) for each document.
Screenshot showing prompt action configuration for relevance scoring inside the loop
Here’s what the prompt looks like.
Screenshot showing the relevance scoring prompt template
# Task 
Evaluate relevance between user question and retrieved document on a 0.0-1.0 scale

# Evaluation Criteria
## 1.0 - Document provides complete & direct answer
(Example: Contains specific numbers/dates/names matching query)
## 0.7-0.9 - Directly relevant but requires:
- Context synthesis OR
- Partial information extraction OR
- Terminology clarification
## 0.4-0.6 - Partial relevance through:
- Shared domain knowledge
- Indirect supporting evidence
- Related concepts without direct answer
## 0.1-0.3 - Barely relevant with only:
- Common keywords
- Generic domain overlap
- Indirect conceptual connections
## 0.0 - No semantic/contextual connection

# Output Format
- Relevance Score: [Strictly 0.0-1.0 numeric value only]

# Input Data
Query: {question}
Retrieved Document:
{document}

Finally, create an object array “Documents” by adding a new “Score” column to the retrieved documents.
Screenshot showing the creation of a Documents array with score column

Checking Relevance Scores

Once the relevance score evaluation is complete, get the maximum score and determine whether to perform a web search or generate an answer directly.
Diagram showing the relevance score checking process and decision flow
Use the Max function to get the maximum value of the “Score” column in the “Documents” array we created earlier,
Screenshot showing the Max function used to find the highest relevance score
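
This boils down to a single Power Fx expression; a sketch, assuming the table built earlier is stored in a topic variable named Documents:

// Highest relevance score across all retrieved documents
Max(Topic.Documents, Score)
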
And use a condition to branch subsequent processing.
Screenshot showing the conditional branching based on relevance score

Case 1: Maximum Score is 0.8 or Higher → Generate Response from Highly Relevant Documents

If the maximum score is 0.8 or higher, extract only the highly relevant documents (those with scores of 0.8 or higher) and generate the response.
Diagram showing the flow for generating responses from highly relevant documents
From the Documents object array, use the Filter function to extract documents with scores of 0.8 or higher, and assign them to the variable “k_in”,
Screenshot showing the Filter function to extract documents with scores of 0.8 or higher
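
As a sketch, the value assigned to “k_in” can be a single Filter expression (again assuming the table lives in Topic.Documents):

// Keep only the highly relevant documents
Filter(Topic.Documents, Score >= 0.8)
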
Then use this “k_in” as an argument when generating the response.
Screenshot showing k_in being used as an argument for response generation
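
If the response-generation prompt expects a single text input, one way to pass the table is to flatten it with Concat; a sketch, assuming each document’s text sits in a hypothetical column named Content:

// Join the filtered documents into one text block for the prompt input
Concat(Topic.k_in, Content, Char(10) & Char(10))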

Case 2: Maximum Score is Below 0.8 → Generate Response from Web Search and Moderately Relevant Documents

If the maximum score is below 0.8, we apply either the “Ambiguous” handling (documents scoring 0.4 or higher plus a web search) or the “Incorrect” handling (web search only).
Diagram showing the flow for generating responses when document relevance is low or ambiguous
First, extract documents with scores of 0.4 or higher and assign them to the variable “k_in”.
*Note: In the “Incorrect” case (Web search only), k_in will be empty.
Screenshot showing the extraction of documents with scores of 0.4 or higher
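
The formula mirrors Case 1 with the lower threshold; an empty result simply corresponds to the “Incorrect” path:

// Moderately relevant documents; may be an empty table in the Incorrect case
Filter(Topic.Documents, Score >= 0.4)
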
Next, perform a Web search (SerpAPI call) via Power Automate.
Screenshot showing Power Automate configuration for calling SerpAPI
SerpAPI is an API for running web searches. Here, we’ll use the “related_questions” and “organic_results” fields from its response as our RAG sources.
*Note: Scraping the sites obtained from web searches would increase accuracy, but we’ll skip that step here.
*Note: Ideally, we would regenerate search terms specifically for web searches, but we’ll skip that as well.
Screenshot showing SerpAPI response with related_question and organic_results
Use both k_in and the web search results as arguments for response generation.
Screenshot showing both k_in and web search results being used as arguments for response generation
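
If the prompt takes a single information input, one way to build it is to join the flattened documents with the text returned from the Power Automate flow; a sketch, assuming the flow output is stored in a hypothetical variable Topic.WebResults:

// Documents first, then the web search results, separated by blank lines
Concat(Topic.k_in, Content, Char(10) & Char(10)) & Char(10) & Char(10) & Topic.WebResults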

Answer Generation

Finally, use a prompt action to run the RAG generation with GPT-4o, passing the information gathered so far as arguments.

If the query cannot be answered based on the provided context, respond with 'The provided information is not sufficient to answer this question.'  
Otherwise, generate a complete and accurate answer, and as an annotation, output the names of the referenced files at the end.

# query
 {query}

# information
 {information}

Testing the Implementation

First, let’s try asking a question that even simple RAG (Naive RAG) could answer,
Screenshot showing a test question that simple RAG could answer
A document with a relevance score of 1.0 (the maximum) is found,
Screenshot showing a document with maximum relevance score of 1
And an answer is generated from this document.
Screenshot showing the answer generated from the highly relevant document
Next, let’s try asking a question that simple RAG couldn’t answer,
Screenshot showing a test question that simple RAG couldn't answer
Although the maximum score is 0, it still generates a correct answer.
Screenshot showing correct answer generated despite maximum relevance score of 0
This is because the web search results contain the correct information, demonstrating the benefit of CRAG in reducing hallucinations.
Screenshot showing web search results containing the correct information

So we’ve confirmed that CRAG improves answer accuracy compared to simple RAG.

Since this approach relies on web search, it may not improve answer accuracy for RAG systems over purely internal documents whose missing information can’t be found on the web, but it’s worth keeping in mind, as it could be useful in certain scenarios.
