PJ Fanning created CAUSEWAY-3835:
------------------------------------

             Summary: suggested improvements to _DocumentFactories.java
                 Key: CAUSEWAY-3835
                 URL: https://issues.apache.org/jira/browse/CAUSEWAY-3835
             Project: Causeway
          Issue Type: Task
          Components: Tooling
            Reporter: PJ Fanning
            Assignee: Andi Huber

https://github.com/apache/causeway/blob/982de018229db2a097080ade53ccfbb4cceffd12/commons/src/main/java/org/apache/causeway/commons/internal/codec/_DocumentFactories.java

1. In `public Document parseDocument(final @Nullable String xml)`, you can avoid the `getBytes` call. It wastes memory and may rest on an incorrect assumption about the character encoding: not all XML originates as UTF-8, and if you already have the XML as a String, you don't need to convert it back to bytes (which forces the XML parser to turn it back into chars).

```
try(var sr = new StringReader(xml)) {
  var doc = documentBuilder.parse(new InputSource(sr));
  return doc;
}
```

2. TransformerFactory is susceptible to XML external entity (XXE) attacks unless it is hardened:
https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html

That page suggests setting:

```
TransformerFactory tf = TransformerFactory.newInstance();
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
```

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
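
For illustration, the two suggestions can be combined into one self-contained sketch. This is not the actual `_DocumentFactories` code: the class name `XmlSketch` and the helper names `hardenedTransformerFactory` and `serialize` are hypothetical, and it assumes the JDK's built-in JAXP implementation (other `TransformerFactory` implementations may reject these attributes with an `IllegalArgumentException`).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Causeway.
final class XmlSketch {

    // Suggestion 1: parse from chars directly via a Reader, so no
    // String -> bytes -> chars round trip and no guess about the encoding.
    static Document parseDocument(String xml) throws Exception {
        DocumentBuilder documentBuilder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (StringReader reader = new StringReader(xml)) {
            return documentBuilder.parse(new InputSource(reader));
        }
    }

    // Suggestion 2: forbid access to external DTDs and stylesheets,
    // as suggested by the OWASP cheat sheet referenced above.
    static TransformerFactory hardenedTransformerFactory() {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // Writes the document back out as a String using the hardened factory.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = hardenedTransformerFactory().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseDocument("<greeting lang=\"en\">hello</greeting>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(serialize(doc));
    }
}
```

Both attributes are set on the factory before any `Transformer` is created from it, so every transformer it produces carries the restriction.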