[2025 Update] Build a Multimodal Copilot Studio Agent to Analyze Conversation Attachments (Images/PDFs)

2025 update: Copilot Studio can now handle files using standard capabilities.
Previously, this required a complex workaround using Power Automate, but today it can be completed by enabling a single setting.
This article explains the latest “standard implementation.”

*The legacy method (Copilot Studio → Power Automate → AI Builder) is archived at the end of this article for reference.

This article continues from my previous one and shows how to build a multimodal agent in Copilot Studio that can read files attached to a conversation.

Previous article:


https://ippu-biz.com/development/powerplatform/copilot-studio/topic-attachments/

Goal

I want to create an agent that can read and analyze images or PDFs attached in a conversation, like this:
Copilot Studio conversation showing an attached file (image/PDF) to be analyzed

Prerequisite: Confirm settings

First, confirm that file upload is enabled.
Open [Settings]:

Copilot Studio Settings screen

In the [Generative AI] tab, make sure [Upload documents] is turned on:

Generative AI settings showing Upload documents enabled

Build (Standard method)

Create a new topic and configure the trigger so it starts when the user asks to analyze an attached image (example below).

Topic trigger configuration

Next, go to [Add a tool] and choose [New prompt].

Add a tool > New prompt

From [Add content], add [Image or document]:

Prompt builder: add Image or document input

Give the argument a name, add your prompt instructions, then click [Save].

Prompt builder: set argument name and prompt, then save

In the topic, set the prompt input using the following formula:

Setting the prompt input with a Power Fx formula

First(System.Activity.Attachments).Content
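
The formula above assumes the message actually carries a file. As a minimal sketch, a Condition node placed before the prompt could check for that first. This guard is my own addition, not part of the original steps, and where to place it is an assumption:

```powerfx
// Condition: true only when the current message has at least one attachment.
// If this is false, skip the prompt and ask the user to attach a file instead.
!IsEmpty(System.Activity.Attachments)
```

Without a guard like this, First(...) has no row to return when the user triggers the topic without attaching anything, so the prompt would receive no file.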

Store the output in a variable:

Assigning the prompt output to a variable

Finally, use [Send a message] and set the message to the .text property of the variable used above.

Send a message using the variable text output

Test

Attach an image and ask “Analyze it.” The agent will analyze the image.

Test run: agent analyzing an attached image

This example focuses on images, but the same approach can also analyze PDFs.
If you want to analyze Excel, Word, or PowerPoint, you’ll need additional services such as Azure AI Document Intelligence.
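
If you want to reject unsupported files before calling the prompt, one option is to check the MIME type of the first attachment. The sketch below assumes each attachment record exposes a ContentType column, and the list of types simply mirrors the PNG/JPG/JPEG/PDF formats noted in the legacy section, so verify both in your environment:

```powerfx
// Assumption: ContentType holds the MIME type of the attached file.
// True for the image and PDF types this article deals with.
First(System.Activity.Attachments).ContentType in ["image/png", "image/jpeg", "application/pdf"]
```

Used in a Condition node, this lets the agent reply with a short "unsupported file type" message for Excel, Word, or PowerPoint files instead of sending them to the prompt.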

Bonus: [Legacy Method Archive] Copilot Studio → Power Automate → AI Builder

The content below is archived and outdated.

At the time, attachments were not available in System.Attachments, and prompt actions couldn’t accept file inputs directly.

Build (Legacy)

Just like the previous article, this example only targets the first attached file.

Note: If you want to handle multiple files, treat them as a Table type or use Foreach.
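
As a rough illustration of the Table approach, the formula below reads the second file only when it exists. It assumes attachments are read from the System.Activity.Attachments table used in the standard method above. Treat it as a sketch rather than a tested part of the legacy build:

```powerfx
// Index is 1-based and raises an error for a missing row,
// so check the row count before reading the second attachment.
If(
    CountRows(System.Activity.Attachments) >= 2,
    Index(System.Activity.Attachments, 2).Content
)
```

When the condition is false, If returns blank, so the downstream step should handle an empty value.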

Create a prompt action (Legacy)

Create a new prompt from [AI hub] > [Prompts]:
AI hub navigation to create a new prompt
Prompt creation screen
Add an appropriate prompt and add an “Image or document” input.

Note: Prompt accuracy was not validated in the original legacy example.
Prompt configuration with image/document input

This completes the creation of the prompt action.

Build Power Automate flow (Legacy)

At the time of writing, Copilot Studio couldn’t pass “File” inputs to prompt actions, so the workaround was to route through Power Automate.

Reference:

https://learn.microsoft.com/ja-jp/ai-builder/add-inputs-prompt#limitations

Build a flow that takes the user’s message and the data string, then passes them to AI Builder.
Convert the data string to binary using base64ToBinary.
Power Automate flow converting base64 to binary and sending to AI Builder
Set the received message as the return value to complete the flow.
Power Automate return value configuration

Build in Copilot Studio (Legacy)

Start the conversation using “Conversation boosting” (legacy name).

Note: This example assumes a single-turn conversation. Multi-turn scenarios may require additional testing.

When the conversation begins, get the number of attachments:
Get number of attachments
If attachments exist, redirect to a dedicated topic:
Redirect to dedicated topic when attachments exist
In the redirected topic, retrieve the contentUrl as explained in the previous article:
Retrieve contentUrl from attachment
Extract the data portion using Split and Index:
Split/Index to extract base64 data
Add the Power Automate flow, and send the user’s message (Activity.Text) and the extracted data:
Call Power Automate flow with message and file data
Display the message returned from Power Automate to complete the legacy implementation:
Display response returned from Power Automate
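
The counting and Split/Index steps above can be sketched in Power Fx roughly as follows. Topic.contentUrl is a hypothetical variable name. I am also assuming that the contentUrl is a data URL of the form "data:image/png;base64,...." and that attachments are read from System.Activity.Attachments:

```powerfx
// Number of attachments on the incoming message (used to decide whether to redirect)
CountRows(System.Activity.Attachments)

// Split the data URL on the comma and take the second piece, which is the base64 payload.
// Split returns a single-column table whose column is named Value.
Index(Split(Topic.contentUrl, ","), 2).Value
```

The extracted string is what the flow later converts with base64ToBinary, so it must not include the "data:...;base64," prefix.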

Legacy testing

Sending an image worked (the agent identified it as a cat):
Legacy test result: image classified as a cat
(As of 2025/04/18) AI Builder file input supported PNG/JPG/JPEG/PDF, so I created a sample PDF:
Sample PDF for legacy testing
When I asked questions about it, it returned a reasonable answer:
Legacy test result: agent answering questions about the PDF

If you want to read Excel or PowerPoint files, you’ll need to integrate additional services such as Azure AI Document Intelligence.
