Blueinfy's blog: [Case Study] Threat Simulation of AI Agents in Microsoft Copilot Studio

Executive Summary

Blueinfy performed a focused, time-bound security review of Microsoft Copilot Studio and its implementation at ACME to assess the potential risks introduced by AI agents.

The objective of the engagement was to evaluate how AI agents both legitimate and malicious could be misused, intentionally or unintentionally, to

Access sensitive enterprise data
Expose user specific information
Perform unauthorized actions
Enable data exfiltration

The assessment combined configuration review with hands on threat simulation, where custom agents were built to replicate realistic attack scenarios as well as instructions were passed to exploit legitimate agents. The results demonstrated that even with platform level controls enabled, significant risks can persist due to configuration gaps, excessive permissions, and agent behavior manipulation.

The environments given for testing had pre-configured policies and controls applied prior to the assessment. The scope included -

AI agent configuration within Microsoft Copilot Studio
Data access patterns through agents
Connector usage and restrictions
Guardrails and safety configurations
Threat simulation using custom-built agents

Assessment Methodology

Blueinfy adopted a structured methodology combining configuration validation and adversarial testing.

1. AI Configuration Review

A focused configuration review was conducted to evaluate AI-specific settings.

Areas Reviewed:
• AI agent configuration settings
• Connector policies and restrictions
• Data access configurations
• Prompt safety and guardrails
• Logging and monitoring capabilities

Objective:
• Identify risky configurations
• Recommend controls to reduce exposure
• Highlight configurations requiring governance before enablement

2. Agent Threat Simulation

Instead of testing existing agents, Blueinfy created custom agents within the allowed policy boundaries to simulate real world attack scenarios. This approach ensured:

No disruption to production agents
Realistic exploitation within permitted configurations
Validation of platform controls under adversarial conditions

Threat Simulation Approach

Agents were built using only approved connectors and policies within the environment. Two categories of agents were designed:

1. Misuse of Legitimate Agents

Agents behaving as intended but manipulated via inputs
Exploiting trust in user prompts

2. Malicious Agent Design

Agents intentionally designed to bypass safeguards
Leveraging allowed configurations to simulate abuse

Key Attack Scenarios Tested

Blueinfy executed multiple scenarios to evaluate risk exposure:

Prompt Injection and Instruction Override - Manipulating agent behavior using crafted inputs to override system instructions and cause unintended data access
Data Exfiltration via Allowed Channels - Extracting sensitive data through email connectors, API responses and structured outputs
Cross-Agent Interaction Risks - Simulating agent-to-agent communication and demonstrating potential lateral movement
Rouge Agents – Malicious agents built with system instructions to exfiltrate data, phish users for credentials and send application/user data to unintended servers
MCP Exposure – If MCP server and tools are accessible without correct authentication and authorization mechanisms

Key Observations

The assessment revealed several important findings:

Misconfigurations Create Hidden Risks – Users of the agents are completely unaware of the data risks since once published/shared, end users have limited visibility in the agent configuration. We were able to send emails of agent users including email attachments and employee feedback responses etc. to our third-party servers.
Agents Can Be Manipulated Through Inputs – Based on the guardrails, prompt injection enabled behavior override and agents could be influenced to exfiltrate data without changing configuration. This created a scenario where legitimate agents to summarize users emails, posting summary to Teams channels etc. could be exploited to share that summary data to third-party servers via malicious instructions received in email/submitted forms.
Unauthenticated MCP Exposure – MCP Tools connect the LLM's to organization data sources like databases, knowledge sources etc. With this engagement, we were able to use the MCP tools without authentication and gain full access to client sensitive data like contract financials.
Platform Controls Are Not Sufficient Alone - While Microsoft Copilot Studio provides robust built-in controls, their effectiveness depends heavily on configuration and usage. Sadly, they do not work without being configured per your needs.

Conclusion

Blueinfy’s assessment demonstrated that while platforms like Microsoft Copilot Studio do provide strong foundational controls, they must be complemented with:

Proper configuration
Risk-aware governance
Adversarial testing
Monitoring and logging

By moving from assumption based security to evidence driven validation, ACME established a stronger foundation for secure AI adoption. Blueinfy team worked with ACME to create a robust agent threat simulation and security review process to protect against such risks with scaling agents in parallel. Please read this blog for the three-tier risk methodology for an agent review process.

Article by Hemil Shah and Rishita Sarabhai