PJ Fanning created CAUSEWAY-3835:
------------------------------------
Summary: suggested improvements to _DocumentFactories.java
Key: CAUSEWAY-3835
URL: https://issues.apache.org/jira/browse/CAUSEWAY-3835
Project: Causeway
Issue Type: Task
Components: Tooling
Reporter: PJ Fanning
Assignee: Andi Huber
https://github.com/apache/causeway/blob/982de018229db2a097080ade53ccfbb4cceffd12/commons/src/main/java/org/apache/causeway/commons/internal/codec/_DocumentFactories.java
1. In `public Document parseDocument(final @Nullable String xml)`, you can
avoid the getBytes call. It wastes memory and may bake in an incorrect
assumption about the char encoding: not all XML originates as UTF-8, and if you
already have it in String format, you don't need to convert it back to bytes
(forcing the XML parser to turn it back into chars).
```
try (var reader = new StringReader(xml)) {
    // InputSource accepts a java.io.Reader, so no byte conversion is needed
    var doc = documentBuilder.parse(new InputSource(reader));
    return doc;
}
```
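For reference, a self-contained sketch of the Reader-based approach in point 1. The class name `StringXmlParsing` and the method `parse` are hypothetical and not taken from `_DocumentFactories.java`; the factory is created with default settings here, which is an assumption and not how the Causeway code configures it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
final class StringXmlParsing {

    // Parses XML that is already held as a String, without going through bytes.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        // Default factory settings; a real implementation would likely harden these.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (StringReader reader = new StringReader(xml)) {
            // With a character stream, the parser reads chars directly and
            // disregards any encoding named in the XML declaration.
            return builder.parse(new InputSource(reader));
        }
    }

    public static void main(String[] args) throws Exception {
        // The declaration claims ISO-8859-1, but the String is already decoded.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>caf\u00e9</word>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The `main` method uses a document whose declaration names a non-UTF-8 encoding, which is the case where `xml.getBytes(UTF_8)` followed by byte-based parsing could decode the text incorrectly.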
2. TransformerFactory is susceptible to XML external entity (XXE) attacks
https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
That page suggests setting:
```
TransformerFactory tf = TransformerFactory.newInstance();
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
```
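As a rough sketch of how the two attributes from point 2 could be applied in one place, the snippet below builds a hardened factory and runs an identity transform with it. The names `HardenedTransformerSketch`, `newHardenedFactory` and `identityTransform` are hypothetical. It assumes the JDK's built-in transformer implementation; other implementations on the classpath may not recognise these attributes and can throw `IllegalArgumentException` from `setAttribute`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, for illustration only.
final class HardenedTransformerSketch {

    // Creates a factory with external DTD and stylesheet access disabled,
    // following the OWASP cheat sheet settings quoted above.
    static TransformerFactory newHardenedFactory() {
        TransformerFactory tf = TransformerFactory.newInstance();
        // An empty string means no protocol is allowed for external access.
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // Copies the input XML to a String using an identity transform.
    static String identityTransform(String xml) throws TransformerException {
        Transformer transformer = newHardenedFactory().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(identityTransform("<a><b>x</b></a>"));
    }
}
```

Keeping the hardening in a single factory method means every transformer created through it picks up the same restrictions.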