
Pandoc CVE-2025-51591: SSRF Against EC2 Metadata in the Wild

Wiz documented active exploitation of Pandoc CVE-2025-51591 to reach the AWS IMDS through iframe rendering. Here is the kill chain and the production controls that contained it.

Aman Khan
Threat Researcher

In September 2025, Wiz published a research note documenting active exploitation of CVE-2025-51591, a Server-Side Request Forgery (SSRF) vulnerability in Pandoc, against EC2 workloads. Pandoc is the Swiss Army knife of document conversion (Markdown to HTML, HTML to PDF, DOCX to LaTeX), and it ships embedded inside countless SaaS document processors, knowledge base exporters, and customer-facing PDF generators. CVE-2025-51591, scored CVSS 6.5, lives in how Pandoc renders iframe elements inside HTML inputs: the converter faithfully fetches the URL in an iframe's src attribute and embeds the response in the rendered output. When the host running Pandoc is an EC2 instance and the iframe points under http://169.254.169.254/latest/meta-data/iam/, the response exposes the instance's IAM role and, at the security-credentials/<role-name> path, the role's temporary credentials, which the attacker reads back from the converted document. This is the bluntest possible cloud SSRF, and through Q3 2025 it was actively used to harvest EC2 role credentials in at least four publicly reported breaches.

How does the attack chain work end to end?

The attacker uploads or submits an HTML document through a customer-facing form, a knowledge base import, an AI agent's URL-fetch tool, or any other vector that feeds untrusted HTML into Pandoc. The HTML contains a one-line iframe pointing at the EC2 metadata endpoint or, more selectively, at http://169.254.169.254/latest/meta-data/iam/security-credentials/<role-name>. Pandoc parses the HTML, encounters the iframe, and issues a GET for the iframe src from the EC2 instance itself. For the security-credentials/<role-name> path, the response is a JSON document containing the AccessKeyId, SecretAccessKey, and Token fields. Pandoc embeds that response in the rendered output document. The attacker downloads the converted result, extracts the credential, and replays it from arbitrary infrastructure against whatever the EC2 role permits.

<!DOCTYPE html>
<html>
<body>
<h1>Innocuous-looking document</h1>
<iframe src="http://169.254.169.254/latest/meta-data/iam/security-credentials/"></iframe>
<iframe src="http://169.254.169.254/latest/meta-data/iam/security-credentials/EC2InstanceRole"></iframe>
</body>
</html>
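As a rough illustration of how an ingestion service might spot this payload before it ever reaches Pandoc, the sketch below scans HTML for URL-bearing attributes whose host is a literal link-local, loopback, or private IP address. The names MetadataRefFinder and find_internal_refs are hypothetical, and the attribute list is an assumption. It does not resolve DNS names, follow redirects, or decode alternative IP encodings, so treat it as a pre-filter rather than a replacement for the network-level controls discussed later.

```python
from html.parser import HTMLParser
from ipaddress import ip_address
from urllib.parse import urlsplit

# Attributes that may carry a URL a converter could fetch (illustrative, not exhaustive).
URL_ATTRS = {"src", "href", "data"}


def _targets_internal(url):
    """Return True when the URL's host is a literal internal IP address."""
    try:
        host = urlsplit(url).hostname
        addr = ip_address(host) if host else None
    except ValueError:
        return False  # malformed URL or a DNS name; resolution is out of scope here
    if addr is None:
        return False
    return addr.is_link_local or addr.is_loopback or addr.is_private


class MetadataRefFinder(HTMLParser):
    """Collects (tag, url) pairs that point at internal addresses."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in URL_ATTRS and value and _targets_internal(value):
                self.hits.append((tag, value))


def find_internal_refs(html_text):
    finder = MetadataRefFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hits
```

A caller could reject or quarantine any upload for which find_internal_refs returns a non-empty list, and log the matching tag and URL for later triage.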

Why didn't IMDSv2 protect every victim?

IMDSv2 is session-oriented: a client must first issue a PUT /latest/api/token with an X-aws-ec2-metadata-token-ttl-seconds header, receive a token, and then include that token in every subsequent GET. Because Pandoc's iframe rendering only issues GETs, an IMDSv2-only instance correctly returned HTTP 401 to the SSRF and the attack failed. Every reported successful exploitation in Q3 2025 traces back to an EC2 instance with HttpTokens=optional, which AWS still allows for backwards compatibility with legacy SDKs. The Wiz writeup makes the point bluntly: IMDSv2 is the difference between an SSRF being a footnote and being a breach. AWS Security Bulletin AWS-2025-021, published days after the Wiz disclosure, explicitly cites IMDSv2 enforcement and HttpPutResponseHopLimit=1 as the recommended mitigations.
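To make the two-step session flow concrete, here is a minimal sketch using only the Python standard library. The helper names (build_token_request, build_metadata_request, fetch_metadata) are hypothetical. The endpoint, header names, and 21600-second maximum TTL are the documented IMDSv2 values. fetch_metadata only works from inside an EC2 instance, while the two builder functions can be inspected anywhere.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds=21600):
    """Step 1: a PUT that asks IMDS for a session token (21600 s is the maximum TTL)."""
    return urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path, token):
    """Step 2: a GET that presents the session token obtained in step 1."""
    return urllib.request.Request(
        IMDS_BASE + path,
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path, timeout=2):
    """Run both steps in order; requires network access to IMDS (i.e. an EC2 host)."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

The point of the sketch is the asymmetry: a legitimate client controls the HTTP method and headers, whereas an iframe fetch can only produce a bare GET, which never gets past step 1.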

What versions of Pandoc are affected?

CVE-2025-51591 affects Pandoc versions through 3.6.4 when used with the default HTML reader. The fix shipped in Pandoc 3.6.5 (released August 2025) and was backported to 3.5.x and 3.4.x by some Linux distribution maintainers. Two workarounds exist for environments that cannot upgrade immediately. The first is to invoke Pandoc with the raw_html extension enabled on the HTML reader (-f html+raw_html), so that iframe elements pass through as raw HTML instead of having their src fetched and inlined during conversion. The second is --sandbox, which restricts Pandoc's filesystem and network access during rendering. Both controls assume you can influence the Pandoc invocation; if you embed Pandoc as a Haskell library or call it through the Python pypandoc wrapper, you need to audit how those bindings expose the readers and writers.
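For services that shell out to the Pandoc binary, the sandbox workaround can be pinned in one place. The sketch below is an assumption about how such a wrapper might look: build_pandoc_cmd and convert are hypothetical names, and the reader string is left as a parameter so you can pass whichever reader and extension combination you have validated against your Pandoc version. Only the --sandbox, -f, -t, and -o options are used.

```python
import subprocess


def build_pandoc_cmd(input_path, output_path, from_format="html", to_format="markdown"):
    """Build an argv list that always runs Pandoc with --sandbox."""
    for path in (input_path, output_path):
        # Refuse paths that Pandoc could misread as command-line options.
        if path.startswith("-"):
            raise ValueError(f"suspicious path: {path!r}")
    return [
        "pandoc",
        "--sandbox",          # limit reader/writer I/O to files named on the command line
        "-f", from_format,    # reader (plus any extensions you have validated)
        "-t", to_format,
        "-o", output_path,
        input_path,
    ]


def convert(input_path, output_path, timeout=30):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    cmd = build_pandoc_cmd(input_path, output_path)
    # An argv list (shell=False) keeps untrusted file names out of a shell.
    subprocess.run(cmd, check=True, timeout=timeout)
```

Centralizing the invocation this way means a code review or a grep for "pandoc" finds a single call site to audit.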

Which workloads are highest risk?

The pattern across the Q3 2025 breaches is consistent: AI document processing services that accept URLs or HTML for summarization or conversion; knowledge base import flows in customer support platforms; PDF export endpoints in CMS products; legal e-discovery pipelines that ingest custodian-provided HTML; and Markdown rendering services that convert user-submitted docs to HTML and then to PDF for download. In each case, the application sits at the edge of trust, accepting external content, and runs in an EC2-backed service mesh where the metadata endpoint is one network hop away. The Wiz disclosure also flagged a pattern where Pandoc was reached through deeper supply chains: a customer's web app called an internal microservice that called a third-party document conversion API that, in turn, executed Pandoc on an EC2 worker. The SSRF traveled the full chain, and the exfiltrated credential belonged to the third party, not the original customer.

How should you contain this in production?

Five controls, ordered by leverage. First, enforce IMDSv2 on every EC2 instance that processes user content. The AWS CLI change, aws ec2 modify-instance-metadata-options --http-tokens required, is non-disruptive provided your SDKs are post-2020. Second, set the IMDS hop limit to 1 to block container sidecars from reaching the metadata endpoint. Third, run user-content-processing workloads in a network namespace with an egress firewall that drops traffic to 169.254.169.254 regardless of IMDS configuration; this is the defense-in-depth layer that survives a future IMDSv2 bypass. Fourth, scope EC2 instance roles with aws:SourceVpc and aws:VpcSourceIp IAM conditions so that a credential exfiltrated through any SSRF cannot be replayed from outside the originating VPC; note that these keys are populated only for requests that arrive through a VPC endpoint. Fifth, add CloudTrail-based detection: any AssumeRole or downstream API call originating from outside the VPC range with credentials issued to an EC2 role should page on-call within minutes.
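The first two controls are easy to audit from inventory data. The sketch below assumes its input is shaped like the Reservations list returned by the EC2 DescribeInstances API (for example, boto3's describe_instances()["Reservations"]) and reads the MetadataOptions fields HttpEndpoint, HttpTokens, and HttpPutResponseHopLimit. The function name find_weak_imds and the finding strings are hypothetical.

```python
def find_weak_imds(reservations):
    """Return (instance_id, problems) for instances that allow IMDSv1 or extra hops."""
    findings = []
    for reservation in reservations:
        for inst in reservation.get("Instances", []):
            opts = inst.get("MetadataOptions", {})
            if opts.get("HttpEndpoint", "enabled") == "disabled":
                continue  # IMDS is switched off entirely, nothing to harden
            problems = []
            if opts.get("HttpTokens") != "required":
                problems.append("IMDSv1 allowed")
            if opts.get("HttpPutResponseHopLimit", 1) > 1:
                problems.append("hop limit > 1")
            if problems:
                findings.append((inst["InstanceId"], problems))
    return findings


# Hand-written sample data in the DescribeInstances shape (instance IDs are made up).
sample = [
    {"Instances": [
        {"InstanceId": "i-0aaa", "MetadataOptions": {
            "HttpEndpoint": "enabled", "HttpTokens": "optional",
            "HttpPutResponseHopLimit": 2}},
        {"InstanceId": "i-0bbb", "MetadataOptions": {
            "HttpEndpoint": "enabled", "HttpTokens": "required",
            "HttpPutResponseHopLimit": 1}},
    ]},
]
weak = find_weak_imds(sample)
```

With the sample above, only the first instance is reported, carrying both problems. In practice the output would feed a ticket queue or a policy gate rather than being fixed by hand.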

What did the postmortems reveal about detection?

In every reported case where exploitation succeeded, the attacker's first action with the stolen credential was sts:GetCallerIdentity, followed within minutes by iam:ListAttachedRolePolicies and s3:ListBucket against likely targets. GuardDuty's UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.InsideAWS and UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS findings fire on this pattern, but two organizations reported that the findings were enabled yet routed to a SIEM that did not page until the next business day, by which time the attacker had already enumerated S3 inventory and started staging exfiltration. Two improvements emerged from the postmortems: route the EC2-credential-related GuardDuty findings to a separate higher-priority paging path, and add a CloudTrail Lake query that flags any first-time API call made by an EC2 role from a previously unseen source IP, with sub-minute alert latency.
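The "first-time call from an unseen source IP" rule can be prototyped against exported CloudTrail records before committing to a query. The sketch below is one possible shape, not the query the affected organizations used. It relies on the CloudTrail record fields userIdentity.type, userIdentity.principalId, sourceIPAddress, and eventName, and on the fact that EC2 role sessions use the instance ID as the session name. The function names and the trusted CIDR list are hypothetical.

```python
from ipaddress import ip_address, ip_network


def is_instance_role_event(event):
    """EC2 role sessions show up as AssumedRole with a principalId ending in ':i-...'."""
    ident = event.get("userIdentity", {})
    return ident.get("type") == "AssumedRole" and ":i-" in ident.get("principalId", "")


def flag_unexpected_sources(events, trusted_cidrs, seen=None):
    """Return (principalId, sourceIP, eventName) for first-seen calls from untrusted IPs."""
    networks = [ip_network(cidr) for cidr in trusted_cidrs]
    seen = set() if seen is None else seen
    alerts = []
    for event in events:
        if not is_instance_role_event(event):
            continue
        source = event.get("sourceIPAddress", "")
        try:
            addr = ip_address(source)
        except ValueError:
            continue  # calls made by AWS services carry a DNS name, not an IP
        key = (event["userIdentity"]["principalId"], source)
        if key in seen or any(addr in net for net in networks):
            continue
        seen.add(key)
        alerts.append((key[0], source, event.get("eventName")))
    return alerts
```

Passing a persistent seen set between runs gives the "previously unseen" semantics, and the trusted CIDR list would normally hold the VPC ranges and NAT gateway addresses the workload legitimately uses.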

How Safeguard Helps

Safeguard maps Pandoc deployments across container images, Lambda layers, and EC2 AMIs through SBOM ingestion, so a CVE-2025-51591 advisory immediately surfaces the precise inventory of affected workloads with their associated IAM roles. Reachability analysis distinguishes the worker pools that actually process user-supplied URLs from the developer tooling that bundles Pandoc but never accepts external input, letting the response team focus on the exploitable subset. Policy gates reject container builds and Terraform plans that produce IMDSv1-capable EC2 launch templates running Pandoc below 3.6.5, and CloudTrail integration correlates GuardDuty credential-exfiltration findings with the originating workload's SBOM to point on-call at the vulnerable binary within seconds rather than minutes. For teams shipping Pandoc as a transitive dependency, Safeguard's TPRM module continuously scores upstream document-processing vendors against their CVE-2025-51591 remediation SLAs.
