How XML External Entity Injection (XXE) Impacts Customers

The addition of XXE (XML External Entity Injection) as a category in the OWASP Top 10 in 2017 reflected how frequently this type of vulnerability was being found across many environments. Although the attack has been possible for years, high-profile findings in major web applications, such as Facebook’s third-party career service and PayPal’s Ektron CMS, brought this vulnerability much-needed attention.

Attackers use XXE to exploit poorly configured XML processors, which in many cases allow external entity references within XML documents by default. By uploading XML documents, or by manipulating vulnerable code and third-party dependencies, attackers can abuse external entities for attacks such as remote code execution, disclosure of sensitive information, access to SMB file shares, Server-Side Request Forgery (SSRF), data extraction, internal system and port scans, and denial-of-service.

Today, we cover not only what XML is, but also why XXE remains relevant in 2025 and beyond.

What is XML?

Extensible Markup Language (XML) was originally designed for large-scale electronic publishing but has since become a popular way for applications to exchange data with each other. As a result, XML is an extremely popular data format that is implemented in many types of web applications, services, and documents.

A shared format like XML allows two systems running different technologies to communicate and exchange data. To interpret XML data, an application needs an XML parser (also called an XML processor) that understands the format and can either convert the data to another format or simply output the result.

The example below shows a typical XML document, in this case describing books, that a web application can accept as input, parse, and output. The root element of this document is “bookstore”, which contains child elements called “book”. Each “book” contains the sub-child elements “author”, “title”, and “publish_date”. If this document is sent in a request or uploaded to a web application that is configured to accept and parse XML input, the parsed output displays the sub-child elements.

Request of XML File:

xxe-code1.png

Response:

XXE2.png

Figure 1: Response from XML Request
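
As a rough illustration of the parsing step described above, the following Python sketch parses a bookstore document and prints the sub-child elements of each book. It is not the PHP application from the screenshots: the element values, the `BOOKSTORE_XML` constant, and the `list_books` helper are all hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookstore document. The author, title, and date values are
# placeholders, not the data shown in the article's screenshots.
BOOKSTORE_XML = """<?xml version="1.0"?>
<bookstore>
  <book>
    <author>Jane Doe</author>
    <title>Example Title</title>
    <publish_date>2020-01-01</publish_date>
  </book>
</bookstore>"""


def list_books(xml_text):
    """Parse the document and return one dict per <book> element."""
    root = ET.fromstring(xml_text)  # the root element is <bookstore>
    return [
        {child.tag: child.text for child in book}
        for book in root.findall("book")
    ]


books = list_books(BOOKSTORE_XML)
for book in books:
    print(book["author"], book["title"], book["publish_date"])
```

The parser here only reads element names and text, which is the harmless baseline that the later attack examples build on.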

XXE's Impact in 2025

Even though the mechanics of XXE attacks have been understood for years, the modern web ecosystem has evolved, and so have the risks.

In 2025, XXE remains a persistent and relevant threat, largely due to three ongoing factors:

1. Legacy and Hybrid Environments Persist

Many organizations still rely on legacy applications, monolithic systems, or XML-based integrations between internal and external systems. While newer APIs favor JSON and REST, SOAP and XML data formats remain common in financial services, healthcare, and government sectors, where modernization is slow due to regulatory and operational complexity.

These environments continue to use XML parsers and libraries that can be misconfigured or left unpatched, leaving them vulnerable to XXE injection.

2. Cloud and Microservice Interconnections Increase Attack Surface

In today’s cloud-native and containerized infrastructures, microservices communicate extensively via APIs that often parse data from multiple sources. Even a single insecure parser or third-party dependency can expose internal services.

Threat actors in 2025 increasingly exploit XXE flaws to perform:

  • Server-Side Request Forgery (SSRF) against internal cloud metadata endpoints

  • Lateral movement between containers or virtual machines

  • Data exfiltration from private buckets or databases

These tactics are often chained with other vulnerabilities, including path traversal and deserialization flaws, to achieve deeper compromise.

3. AI and Automation Tools Amplify Exploitation

The rise of AI-assisted vulnerability scanning and exploitation frameworks has made it easier for attackers to automatically detect XXE vulnerabilities at scale. Tools leveraging large language models (LLMs) and autonomous scanning pipelines can now rapidly identify unsafe XML parsing behaviors across public and private codebases.

This automation has increased the frequency and precision of exploitation attempts, even in environments where XXE would have previously gone unnoticed.

XXE Risks Remain Serious in 2025

A successful XXE attack in 2025 can still lead to:

  • Disclosure of sensitive data from local or remote systems

  • Remote code execution (RCE) through malicious entity payloads

  • Access to SMB file shares and internal network resources

  • Server-Side Request Forgery (SSRF) and cloud metadata exposure

  • Denial-of-service (DoS) conditions due to recursive entity expansion in the parser

In complex enterprise systems, an XXE vulnerability may also serve as a pivot point: a foothold that enables further exploitation of downstream services or supply chain dependencies.

What Are XML External Entities?

XML documents can contain “entities” that are defined within the DOCTYPE declaration and can reference remote external systems or local content on the server hosting the web application and XML parser. When the web application parses the XML document, the parser replaces each reference to the “entity” with the value that is specified.

An XML Schema Definition (XSD, newer) or a Document Type Definition (DTD, legacy) is used to validate XML documents by declaring what type of document will be defined, so the parser knows how to process it. The issue is that even though DTDs are the older, legacy way of defining a document's type before it is processed, they are still very commonly supported by applications and can be used to trigger XXE.

Consider the example XML document below, which uses a DTD, along with its annotated sections. When the parser encounters the entity reference “&xxe;”, it replaces the reference with the value defined for the entity “xxe”.

XML Document Type Definition (DTD)

XXE3.png

Figure 2: XML Document Type Definition

Request of XML File:

xxe-code2.png

Response:

XXE5.png

Figure 3: Response from XML Request
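
The entity substitution described above can be sketched in Python. This is only an illustration under stated assumptions: the entity name `xxe` follows the article, while the replacement text and the `DOC_WITH_ENTITY` constant are hypothetical. The entity here is an internal one (its value is written directly in the DTD), so nothing outside the document is accessed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity name "xxe" follows the article, and its
# replacement text is a placeholder chosen for this sketch.
DOC_WITH_ENTITY = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ENTITY xxe "Entity value defined in the DTD">
]>
<bookstore>
  <book>
    <title>&xxe;</title>
  </book>
</bookstore>"""

root = ET.fromstring(DOC_WITH_ENTITY)

# The parser has already replaced "&xxe;" with the declared value.
title = root.findtext("book/title")
print(title)
```

The same substitution mechanism is what becomes dangerous once the entity's value is read from an external location instead of being written inline.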

XXE Attack Scenario

Attackers can abuse XML external entities to turn the parser's external-access functionality against the application. Consider the following malicious XXE example, which leverages the “SYSTEM” identifier to access local content on the system hosting the PHP application and its XML parser.

Declaring an entity with the “SYSTEM” identifier instructs the parser to read the entity's value from the URI that follows. In many cases, an attacker can leverage this parser misconfiguration to turn the parser into a proxy server, executing Server-Side Request Forgery (SSRF) attacks to reach further into the internal network or to connect to external public servers from behind the firewall.

XXE-2.jpg

Figure 4: XXE Attack Flow
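
To make the role of the SYSTEM identifier more concrete, the sketch below uses Python's `xml.parsers.expat` module to list the external entities a document declares, together with the URI each one points at, without fetching anything. The internal URL `http://intranet.example/status`, the `SSRF_STYLE_DOC` constant, and the `list_external_entities` helper are hypothetical and exist only for this example.

```python
import xml.parsers.expat

# Hypothetical document whose entity points at an internal-only URL, the
# kind of target an SSRF-style XXE request would aim for.
SSRF_STYLE_DOC = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ENTITY xxe SYSTEM "http://intranet.example/status">
]>
<bookstore>
  <book>
    <title>&xxe;</title>
  </book>
</bookstore>"""


def list_external_entities(xml_text):
    """Return (entity name, system identifier) pairs declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Entities declared with SYSTEM carry a system identifier and no
        # inline value; internal entities have system_id set to None.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No external entity handler is installed, so nothing is fetched.
    parser.Parse(xml_text, True)
    return found


print(list_external_entities(SSRF_STYLE_DOC))
```

A vulnerable parser would go one step further and actually request the listed URI from inside the network, which is what gives the attacker the proxy-like behavior described above.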

The attack scenario continues the bookstore theme: a simple PHP application accepts book input from users containing the author, title, and publish_date. This information is sent to the application's website in a POST request.

An attacker can combine an XML entity definition with the SYSTEM identifier to send maliciously crafted XML that appears harmless to the firewall and the application, because the functionality of these services is not being attacked directly. In the example below, the external entity “xxe” displays the contents of file:///etc/passwd, a rudimentary attack similar to local file inclusion (LFI).

Request of XML File:

xxe-code3.png

Response:

XXE-POC8.png

Figure 5: XML Attack POC
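
A request of the kind described above might be shaped like the payload in the following sketch. The surrounding element values and the `try_parse` helper are hypothetical; only the `xxe` entity and the `file:///etc/passwd` URI come from the article. The payload is fed to Python's `xml.etree.ElementTree`, which does not resolve external entities and raises a `ParseError` when one is referenced, so running this reads no file. The vulnerable PHP parser in the article's scenario would instead substitute the file's contents.

```python
import xml.etree.ElementTree as ET

# Payload shape for the local-file scenario. Element values other than the
# entity reference are placeholders.
XXE_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<bookstore>
  <book>
    <author>&xxe;</author>
    <title>Example Title</title>
    <publish_date>2020-01-01</publish_date>
  </book>
</bookstore>"""


def try_parse(xml_text):
    """Report whether ElementTree accepts or rejects the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ElementTree treats the unresolved external entity as an error.
        return f"rejected: {exc}"
    return "parsed"


result = try_parse(XXE_PAYLOAD)
print(result)
```

The contrast is the point of the sketch: the same bytes are either an error or a file disclosure depending entirely on how the receiving parser is configured.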

The attacker is not confined to accessing local files on the exploited machine. They can also mount an attack similar to remote file inclusion (RFI), retrieving files remotely over HTTP.

This type of attack can also be used to circumvent firewalls and gain access to other internal systems within the intranet that would normally not be available to the attacker. Depending on the XML parser, it may be possible to read files from other systems on the local network through HTTP requests, even though those systems sit entirely behind the protection of external firewalls. In the example below, the file malicious.txt contains the content “Remote file accessed via http!!” and is located on the external website http://myattacksite.

Contents of malicious.txt

XXE-POC9.png

Figure 6: XXE File Inclusion

Request of XML File:

xxe-code4.png

Response:

XXE-POC11.png

Figure 7: XXE File Inclusion POC

In situations where PHP on the targeted web server has the “expect” module enabled, the severity increases because XXE can then lead to remote code execution via PHP. In some cases, this may also provide the ability to conduct port scanning of the internal network for further lateral movement and reconnaissance of the organization's infrastructure.

Request of XML File:

xxe-code5.png

Response:

XXE-POC13.png

Figure 8: XXE Code Execution POC
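
The three scenarios above differ mainly in the URI scheme placed after the SYSTEM identifier. The following sketch maps a system identifier to the class of impact discussed in this article. The `IMPACT_BY_SCHEME` table and the `classify_system_identifier` helper are hypothetical, the mapping is a simplification rather than a complete taxonomy, and `expect://id` is an assumed example command for PHP's expect wrapper, not the request shown in the screenshot.

```python
from urllib.parse import urlsplit

# Simplified, illustrative mapping from URI scheme to the impact described
# in the article. Real parsers may support further schemes.
IMPACT_BY_SCHEME = {
    "file": "local file disclosure",
    "http": "remote file inclusion or SSRF",
    "https": "remote file inclusion or SSRF",
    "expect": "command execution (PHP expect wrapper)",
}


def classify_system_identifier(uri):
    """Return the impact class for an entity's system identifier."""
    scheme = urlsplit(uri).scheme.lower()
    return IMPACT_BY_SCHEME.get(scheme, "unclassified")


for uri in (
    "file:///etc/passwd",
    "http://myattacksite/malicious.txt",
    "expect://id",  # assumed example command, not from the article
):
    print(uri, "->", classify_system_identifier(uri))
```

A table like this can help when triaging findings, but it should not be used as a filter: blocking individual schemes is easy to get wrong, which is why the remediation advice focuses on disabling DTDs and external entities in the parser.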

XXE Attack Remediation

XXE attacks can be a major risk to any organization and can result in severe consequences. The core vulnerability is that the XML parser processes untrusted data sent by any user, and that data may be malicious. Validating only the data within the system identifier in the DTD may not be easy, or even possible. The other main issue is that many XML parsers are vulnerable to XXE in their default configuration, because external entity resolution stays enabled unless it is explicitly turned off.

Therefore, the best solution is to configure the XML processor to use a local static DTD and to disallow any declared DTD included in the XML document. The simplest and safest way to prevent XXE attacks is to disable Document Type Definitions (DTDs) altogether, especially if they are not essential to the application’s functionality. Detailed guidance on how to disable XXE processing, or otherwise defend against XXE attacks, is presented in the OWASP XML External Entity (XXE) Prevention Cheat Sheet.

  • Avoid allowing application functionality that parses XML documents

  • Implement input validation that prevents malicious data from being supplied in the SYSTEM identifier of an entity within the Document Type Definition (DTD)

  • Configure the XML parser to not validate and process any declarations within the DTD

  • Configure the XML parser to not resolve external entities within the DTD
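
One way to approximate the "disable DTDs altogether" advice in Python is sketched below, using only the standard library. It runs a first pass with `xml.parsers.expat` that raises as soon as a DOCTYPE declaration starts, and only then hands the document to `xml.etree.ElementTree`. The `DTDForbidden` exception and the `parse_without_dtd` helper are hypothetical names for this example, and the approach parses the input twice. Treat it as an illustration of the idea rather than a hardened implementation; a maintained package such as defusedxml may be a better fit for production code.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat when a DOCTYPE declaration begins, before any
    # entity declared inside it can be used.
    raise DTDForbidden(f"DOCTYPE declaration not allowed: {name!r}")


def parse_without_dtd(xml_text):
    """Parse XML, rejecting any document that carries a DTD."""
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    checker.Parse(xml_text, True)  # raises DTDForbidden on any DOCTYPE
    return ET.fromstring(xml_text)


safe = parse_without_dtd("<bookstore><book><title>ok</title></book></bookstore>")
print(safe.findtext("book/title"))

try:
    parse_without_dtd(
        '<!DOCTYPE bookstore [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        "<bookstore>&xxe;</bookstore>"
    )
except DTDForbidden as exc:
    print("blocked:", exc)
```

Rejecting the DOCTYPE outright covers external entities, internal entity expansion, and parameter entities in one rule, which is why it is usually simpler to reason about than filtering individual entity declarations.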

Conclusion

The Packetlabs team is composed of highly trained and experienced ethical hackers who excel at discovering, exploiting, and chaining together vulnerabilities that are often overlooked.
