Following our previous implementation of Naive RAG in Copilot Studio, this time we’ll explore “Self-RAG,” one of the Advanced RAG techniques.
Self-RAG
Self-RAG is a technique proposed in October 2023 to improve response quality and reduce hallucinations. It works in four steps:
- Determine whether information retrieval is necessary (if not, generate the response directly)
- If retrieval is needed, fetch multiple documents and evaluate their relevance to the question
- Generate responses based on the relevant documents
- Evaluate each response and synthesize the final answer
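As a rough illustration, the four steps can be sketched in Python. This is a minimal sketch, not the Copilot Studio topic itself: the function name `self_rag` and its callable parameters are hypothetical, and "synthesize the final answer" is simplified here to returning the first answer the critic accepts.

```python
from typing import Callable, List


def self_rag(
    question: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
    resolves: Callable[[str, str], bool],
) -> str:
    """Run one pass of the Self-RAG steps using caller-supplied functions."""
    # Step 1: skip retrieval when the model can answer on its own.
    if not needs_retrieval(question):
        return generate(question, [])
    # Step 2: fetch documents and keep only those judged relevant.
    relevant = [doc for doc in retrieve(question) if is_relevant(question, doc)]
    # Step 3: generate one candidate answer per relevant document.
    candidates = [generate(question, [doc]) for doc in relevant]
    # Step 4: keep the candidates the critic accepts (simplified synthesis:
    # return the first accepted one, or a fallback message if none pass).
    accepted = [answer for answer in candidates if resolves(question, answer)]
    return accepted[0] if accepted else "No supported answer found."
```

In a real system each callable would wrap an LLM prompt or a search call; passing them in as parameters keeps the control flow easy to test with simple stand-ins.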
Ideally, Self-RAG involves fine-tuning LLMs to create separate “critic” and “generator” models. However, since that level of customization isn’t feasible in our context, we’ll use GPT-4o for all of these roles.
Implementing Self-RAG in Copilot Studio
[Figure: Diagram showing Self-RAG implementation workflow in Copilot Studio]
Implementation
Since our main focus is on building Self-RAG in Copilot Studio, we’ll prioritize the implementation over optimizing accuracy (such as prompt engineering).
Trigger and Variable Declaration
[Figure: Screenshot showing trigger setup and variable declaration]
Determining Search Necessity
[Figure: Screenshot showing search necessity evaluation flow]
[Figure: Screenshot showing prompt action configuration]
[Figure: Screenshot showing prompt settings and model selection]
Your task is to determine whether a user's question requires external knowledge retrieval or not. Use the following criteria to make your decision:

### Decision Criteria:
1. **No retrieval required**:
   - If the question can be answered confidently using only your pre-existing knowledge, classify it as "no retrieval required."
   - Examples: Definitions, general knowledge, basic calculations, or simple reasoning tasks.
2. **Retrieval required**:
   - Classify the question as "retrieval required" if it meets any of the following conditions:
     - The question requires up-to-date information (e.g., recent events or news).
     - The question relates to specific domain knowledge (e.g., legal, medical, or technical details) that may not be fully covered by your internal knowledge.
     - The question explicitly references external resources (e.g., specific documents, websites, or datasets).
     - Your internal knowledge alone is insufficient to provide a comprehensive or accurate answer.

### Output Format:
Provide your answer in the following format:
- **"search_required": "yes"** (if retrieval is needed)
- **"search_required": "no"** (if retrieval is not needed)

Here is the user's question:
Question: {question}

Respond with the required output format only, without any additional explanation or context.
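The prompt above asks the model to reply with a line such as `"search_required": "yes"`. Outside Copilot Studio, that reply could be turned into a boolean as in the sketch below. The function name `parse_search_required` is hypothetical, and falling back to retrieval when the reply cannot be parsed is an assumption of this example, not something the article specifies.

```python
import re


def parse_search_required(raw: str, default: bool = True) -> bool:
    """Extract the yes/no decision from the classifier's reply.

    Tolerates surrounding markdown (e.g. bold markers) and missing quotes.
    Returns `default` when no decision can be found (assumed fallback).
    """
    match = re.search(r'search_required"?\s*:\s*"?(yes|no)', raw, re.IGNORECASE)
    if match is None:
        return default
    return match.group(1).lower() == "yes"
```

Defaulting to retrieval on an unparseable reply trades an extra search call for a lower chance of answering from the model's memory alone.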
[Figure: Screenshot showing direct response generation flow]
[Figure: Screenshot showing search query generation]
[Figure: Screenshot showing AI Search implementation]
[Figure: Screenshot showing Retrieval topic configuration]
Relevance Evaluation
[Figure: Screenshot showing relevance evaluation flow]
[Figure: Screenshot showing prompt configuration for relevance evaluation]
You are an evaluator tasked with determining the relevance of a retrieved document to a user question.
This assessment does not require overly strict criteria, but the goal is to exclude clearly irrelevant documents.

If the document directly answers the user question, provides supporting information, or includes keywords/semantic meaning clearly related to the question, grade it as relevant.
If the document is unrelated, off-topic, or too vague to establish a clear connection to the user question, grade it as not relevant.

Respond with a binary score:
Output yes if the document is relevant.
Output no if the document is irrelevant.

# user question : {question}
# documents : {docs}
[Figure: Screenshot showing handling of irrelevant document cases]
If relevance is confirmed, proceed to response generation.
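The relevance check can be sketched in Python as follows. Two assumptions apply: the grading prompt is called once per document (the topic shown may instead pass all documents at once through `{docs}`), and the names `parse_binary_grade`, `filter_relevant`, and the `grade` callable are hypothetical stand-ins for the prompt action.

```python
from typing import Callable, List


def parse_binary_grade(raw: str) -> bool:
    """Return True when the reply's first word is 'yes' (case-insensitive)."""
    words = raw.strip().lower().split()
    # Strip common trailing punctuation or quotes such as "Yes." or "'yes'".
    return bool(words) and words[0].strip(".,!\"'") == "yes"


def filter_relevant(
    question: str,
    docs: List[str],
    grade: Callable[[str, str], str],
) -> List[str]:
    """Keep only documents the grader marks as relevant.

    `grade` stands in for the relevance prompt: it takes (question, doc)
    and returns the model's raw yes/no reply.
    """
    return [doc for doc in docs if parse_binary_grade(grade(question, doc))]
```

If the filtered list comes back empty, the flow would take the irrelevant-document branch rather than proceeding to response generation.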
Response Generation and Answer Validation
[Figure: Screenshot showing response generation and validation flow]
[Figure: Screenshot showing response generation configuration]
[Figure: Screenshot showing response evaluation setup]
[Figure: Screenshot showing prompt configuration for response evaluation]
You are an evaluator tasked with assessing whether a generated answer appropriately addresses or resolves a user's question.

If the answer directly resolves the question, provides accurate and sufficient information, or effectively addresses the intent behind the question, grade it as yes.
If the answer is incomplete, vague, inaccurate, off-topic, or fails to address the intent of the user question, grade it as no.

Respond with a binary score:
Output yes if the answer resolves the question.
Output no if the answer does not resolve the question.

# User question : {question}
# LLM generation answer : {generation}
[Figure: Screenshot showing relevant response handling]
[Figure: Screenshot showing handling of irrelevant responses]
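One way to picture the validation step is a bounded retry loop: generate an answer, ask the evaluator whether it resolves the question, and try again if not. The sketch below assumes that a rejected answer triggers another search-and-generate attempt, up to a fixed limit. The text does not spell out how the topic handles this, and the names `answer_with_validation`, `search_and_generate`, `resolves`, and `max_attempts` are all hypothetical.

```python
from typing import Callable, Optional


def answer_with_validation(
    question: str,
    search_and_generate: Callable[[str, int], str],
    resolves: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first answer the evaluator accepts, or None if none pass.

    `search_and_generate` receives the question and the attempt number
    (starting at 1) so it can vary the search query between attempts.
    `resolves` stands in for the answer-evaluation prompt.
    """
    for attempt in range(1, max_attempts + 1):
        answer = search_and_generate(question, attempt)
        if resolves(question, answer):
            return answer
    # Every attempt was rejected; the caller decides on a fallback message.
    return None
```

Capping the number of attempts keeps a persistently rejected question from looping indefinitely.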
This completes the topic implementation.
Optional: Integration with Conversational Boosting
[Figure: Screenshot showing Conversational boosting integration]
Testing Results
[Figure: Screenshot showing successful response to previously answered question]
[Figure: Screenshot showing multiple search iterations]
[Figure: Screenshot showing successful response generation]
[Figure: Screenshot showing successful handling of non-search questions]
These results suggest that the implementation improves answer accuracy. In the next article, I’d like to experiment with CRAG and other advanced techniques.