Building a Basic RAG System with Copilot Studio AI Search: A Step-by-Step Guide

Building upon our previous work with AI Search, I implemented a Naive RAG (Retrieval-Augmented Generation) system in Copilot Studio.


Naive RAG

Naive RAG is the most fundamental approach to combining information retrieval with generative models. It follows a simple two-step process:

  1. Retrieve relevant information from external databases or documents based on user queries
  2. Generate responses using a language model (such as GPT) based on the retrieved information
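The two steps above can be sketched in a few lines of Python. This is a minimal illustration of the retrieve-then-generate loop, assuming a toy term-overlap scorer and a stubbed-out `generate()`; it is not how AI Search or Copilot Studio work internally.

```python
# Minimal sketch of the two-step Naive RAG loop described above.
# The term-overlap scorer and the generate() stub are illustrative
# stand-ins, not the actual AI Search ranking or LLM call.
import re

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Step 1: rank documents by naive term overlap with the query."""
    return sorted(
        documents,
        key=lambda doc: len(_tokens(query) & _tokens(doc)),
        reverse=True,
    )[:top_k]

def generate(query: str, context: list[str]) -> str:
    """Step 2: placeholder for the LLM call that combines the
    query with the retrieved context to produce an answer."""
    return f"Answer to '{query}', grounded in: {' | '.join(context)}"

docs = [
    "Python connects to databases via DB-API drivers.",
    "Climate change alters ecosystem dynamics.",
]
hits = retrieve("How to connect to a database using Python?", docs, top_k=1)
print(generate("How to connect to a database using Python?", hits))
```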

While Advanced RAG techniques and Agentic Workflow/Agentic Design Patterns are becoming increasingly popular, I decided to start with the basic Naive RAG implementation as a learning exercise.

Tip: Official Recommendation for Using Knowledge Feature

The official AI Search documentation specifically recommends using Copilot Studio’s built-in Knowledge feature when implementing RAG with SharePoint as a data source.

If you need to create a custom Copilot / RAG (Retrieval Augmented Generation) application to chat with SharePoint data, the recommended approach is to use Microsoft Copilot Studio instead of this preview feature.
Excerpt from official documentation

While I'm building a custom implementation for this experiment, in a production environment a practical approach is to try the Knowledge feature first and develop a custom RAG solution only if its accuracy doesn't meet your requirements.

Implementation

Here’s an overview of the system architecture:
In SharePoint Online (SPO), we’ve stored pre-chunked, relatively recent information extracted from the Wiki.
Note: We use recent information because older data wouldn't make for a meaningful test: the LLM could answer those queries from its own training knowledge.
Screenshot showing SharePoint data structure and chunking
For details on data preparation and AI Search integration, please refer to:
How to Configure SharePoint Document Libraries in Azure AI Search (Preview)

The implementation follows these steps:

  1. Create topics and generate search keywords
  2. Perform search (AI Search call)
  3. Generate responses and send messages
  4. Optional: Invoke from Conversational boosting

Step 1: Create Topics and Generate Search Keywords

First, create a new topic and implement search keyword generation using a “Prompt Action.”
Screenshot showing topic creation and prompt action setup
The prompt uses few-shot learning with JSON output formatting to make the response easily usable in subsequent nodes.
Note: Since our main goal here is to implement a basic Naive RAG system, we’re not focusing on optimizing accuracy at this stage.
Screenshot showing prompt configuration with few-shot examples

Here are examples of user questions and their corresponding search queries.
Use these examples to guide your transformation of the input question.

Example Input 1: "How to connect to a database using Python?"
Example Output: "Python database connection method" 
Example Input 2: "What are the effects of climate change on ecosystems?"
Example Output: "climate change ecosystem impact"
Example Input 3: "What are the ethical challenges of AI?"
Example Output: "AI ethics challenges" 

Now, based on these examples, transform the following user question into an effective search query.

Input: {input} 
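The few-shot prompt above can also be assembled and its reply parsed programmatically. The sketch below assumes the model's JSON reply has the shape `{"query": "..."}`, matching the `Topic.queryOutput.structuredOutput.query` reference used in the next step; the exact reply shape depends on how you configure the prompt action's output.

```python
# Sketch: assemble the few-shot prompt above and parse a JSON reply.
# The {"query": ...} reply shape is an assumption matching the
# structuredOutput.query reference used later in this article.
import json

EXAMPLES = [
    ("How to connect to a database using Python?", "Python database connection method"),
    ("What are the effects of climate change on ecosystems?", "climate change ecosystem impact"),
    ("What are the ethical challenges of AI?", "AI ethics challenges"),
]

def build_prompt(question: str) -> str:
    lines = ["Here are examples of user questions and their corresponding search queries."]
    for i, (q, out) in enumerate(EXAMPLES, start=1):
        lines.append(f'Example Input {i}: "{q}"')
        lines.append(f'Example Output: "{out}"')
    lines.append("Transform the following user question into an effective search query.")
    lines.append(f"Input: {question}")
    return "\n".join(lines)

def parse_reply(reply: str) -> str:
    """Extract the search query from the model's JSON-formatted reply."""
    return json.loads(reply)["query"]

print(parse_reply('{"query": "Copilot Studio RAG setup"}'))
```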

Step 2: Search (AI Search Call)

Next, we'll call AI Search directly via an HTTP request, without using Power Automate as an intermediary.
Screenshot showing AI Search HTTP request configuration

// POST to the following URL
https://[AI-Search-Resource-Name].search.windows.net/indexes/[Index-Name]/docs/search?api-version=2024-11-01-preview
The headers include the API key and Content-Type.
Note: This is a sample implementation, so we’re handling the API key in a simplified manner.
Screenshot showing header configuration with API key
For the body, select “Raw Content” and create the request body using the JSON function.
Note: We’ll generate responses from the top 3 search results.
Screenshot showing JSON body configuration

JSON({
    search:Topic.queryOutput.structuredOutput.query, 
    count:true, 
    top:Topic.Top // Variable "Top" is preset to 3 
})
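For reference, the same call can be reproduced outside Copilot Studio with nothing but the standard library. The resource name, index name, and API key below are placeholders; the body fields and `api-version` match the request built above.

```python
# Sketch of the same AI Search call using only the Python standard
# library. "my-search", "wiki-index", and the API key are placeholders.
import json
import urllib.request

def build_search_request(resource: str, index: str, api_key: str,
                         query: str, top: int = 3) -> urllib.request.Request:
    """Build the POST request for the AI Search query API."""
    url = (f"https://{resource}.search.windows.net/indexes/{index}"
           f"/docs/search?api-version=2024-11-01-preview")
    body = json.dumps({"search": query, "count": True, "top": top}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )

req = build_search_request("my-search", "wiki-index", "<api-key>", "AI ethics challenges")
# urllib.request.urlopen(req) would execute the call.
```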
Finally, define the response schema. This makes it easier to use the response in subsequent nodes.
Note: The schema definition varies depending on your AI Search index; it's easiest to generate it from a sample response.

Screenshot showing response schema definition

kind: Record
properties:
  '@odata.context': String
  '@odata.count': Number
  value:
    type:
      kind: Table
      properties:
        '@search.score': Number
        content: String
        id: String
        metadata_spo_item_content_type: String
        metadata_spo_item_last_modified: String
        metadata_spo_item_name: String
        metadata_spo_item_path: String
        metadata_spo_item_size: Number
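To see how this schema maps onto a raw response, here is a sketch that pulls the fields we use downstream (`metadata_spo_item_name` and `content`) out of a hand-written sample response; the sample values are invented for illustration.

```python
# Sketch: read the fields defined by the schema above from a raw
# AI Search response. The sample response values are invented.
import json

sample_response = json.dumps({
    "@odata.count": 2,
    "value": [
        {"@search.score": 1.8, "content": "chunk A", "metadata_spo_item_name": "a.md"},
        {"@search.score": 1.2, "content": "chunk B", "metadata_spo_item_name": "b.md"},
    ],
})

def top_hits(raw: str, top: int = 3) -> list[dict]:
    """Return the highest-scoring hits, keeping only the fields
    passed on to the response-generation step."""
    hits = sorted(json.loads(raw)["value"],
                  key=lambda h: h["@search.score"], reverse=True)
    return [{"fileName": h["metadata_spo_item_name"], "content": h["content"]}
            for h in hits[:top]]

print(top_hits(sample_response))
```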

Step 3: Generate Response and Send Message

Pass the “user question” and “search results” to a “Prompt Action” to generate the response.
Screenshot showing prompt action configuration for response generation
// Pass the "file name" and "file content" as search results
JSON(
    ForAll(
        Topic.search_result.value,
        {
            fileName: ThisRecord.metadata_spo_item_name,
            content: ThisRecord.content
        }
    )
)
Here’s the content of the prompt action. We’ll use JSON output format here as well for easier handling of the response.
Screenshot showing prompt configuration with JSON output format
If the query cannot be answered based on the provided context, respond with 'Unable to answer this question with the provided information.'
Otherwise, generate a complete and accurate answer, and as an annotation, output the names of the referenced files at the end.
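Putting the pieces together, the final prompt combines the user question, the retrieved context, and the fallback instruction quoted above. The layout below (context first, then question and instructions) is one reasonable arrangement, not the exact prompt-action template.

```python
# Sketch: assemble the answer-generation prompt from the question,
# the retrieved hits, and the fallback instruction quoted above.
FALLBACK = "Unable to answer this question with the provided information."

def build_answer_prompt(question: str, results: list[dict]) -> str:
    context = "\n".join(f"[{r['fileName']}]\n{r['content']}" for r in results)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        f"If the query cannot be answered based on the provided context, "
        f"respond with '{FALLBACK}' Otherwise, generate a complete and "
        f"accurate answer, and as an annotation, output the names of the "
        f"referenced files at the end."
    )

prompt = build_answer_prompt(
    "How do I configure the indexer?",
    [{"fileName": "indexer.md", "content": "Set the schedule in the portal."}],
)
print(prompt)
```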

Step 4 (Optional): Invoke from Conversational Boosting

Finally, since I plan to experiment with various RAG methods in the future, I’ll set up Conversational boosting to handle user messages.

When a message is received, let users select their preferred RAG method:
Screenshot showing RAG method selection interface
Then redirect based on their selection:
Screenshot showing redirection flow based on RAG method selection

This completes the implementation.

Testing the Implementation

When we test the system with sample questions, it provides accurate responses.
Screenshot showing successful RAG response
Note: When asking GPT-4o the same questions directly, without retrieval, it sometimes provides inaccurate information (close, but not quite right).
Screenshot showing 4o's response with inaccuracies
However, due to this being a basic implementation, there are many questions the system cannot answer.
Screenshot showing system limitations with certain queries

In future articles, I’ll explore various Advanced RAG implementations to improve the system’s performance.
