
PJ Fanning updated CAUSEWAY-3835:

1. In `public Document parseDocument(final @Nullable String xml)`, you can 
avoid the getBytes call, which wastes memory and may bake in an incorrect 
assumption about the char encoding. Not all XML originates as UTF-8, and if you 
already have it in String format, you don't need to convert it back to bytes 
(forcing the XML parser to turn it back into chars).

        // StringReader (not StringWriter) wraps an existing String as a Reader
        try(var sr = new StringReader(xml)) {
            var doc = documentBuilder.parse(new InputSource(sr));
            return doc;
        }
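
As a rough, self-contained sketch of the idea in point 1 (the class name 
XmlStringParsing and the helper parse are made up for illustration and are not 
the actual _DocumentFactories code): when the parser is given a character 
stream, it reads the chars directly and disregards the encoding named in the 
XML declaration, so no charset has to be guessed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XmlStringParsing {

    // Hypothetical helper: parses XML that is already held as a String.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        var documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The Reader hands chars straight to the parser, so there is no
        // String -> bytes -> chars round trip and no charset to get wrong.
        try (var reader = new StringReader(xml)) {
            return documentBuilder.parse(new InputSource(reader));
        }
    }

    public static void main(String[] args) throws Exception {
        // The declaration claims ISO-8859-1, but the data is already chars.
        var xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        var doc = parse(xml);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Had the same String been passed as xml.getBytes(StandardCharsets.UTF_8), the 
parser would have decoded those bytes using the declared ISO-8859-1, and the 
non-ASCII character would likely come out garbled.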


2. TransformerFactory is susceptible to XML attacks


That page suggests setting:

        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
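
One possible way to package those settings is sketched below. The class name 
SecureTransformerFactory and the method newSecureFactory are hypothetical, and 
the extra FEATURE_SECURE_PROCESSING call is an assumption on my part, not 
something taken from the referenced page. Note that setAttribute may throw 
IllegalArgumentException on TransformerFactory implementations that do not 
recognise these JAXP 1.5 properties.

```java
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

public class SecureTransformerFactory {

    // Hypothetical helper returning a factory with external access disabled.
    static TransformerFactory newSecureFactory() throws TransformerConfigurationException {
        TransformerFactory tf = TransformerFactory.newInstance();
        // Assumption: also enable the general secure-processing feature.
        tf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // An empty string means no protocol is allowed for external DTDs...
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        // ...or for external stylesheets (xsl:import, xsl:include, document()).
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    public static void main(String[] args) throws Exception {
        var transformer = newSecureFactory().newTransformer();
        System.out.println(transformer != null);
    }
}
```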

> suggested improvements to _DocumentFactories.java
> -------------------------------------------------
>                 Key: CAUSEWAY-3835
>                 URL: https://issues.apache.org/jira/browse/CAUSEWAY-3835
>             Project: Causeway
>          Issue Type: Task
>          Components: Tooling
>            Reporter: PJ Fanning
>            Assignee: Andi Huber
>            Priority: Major
> https://github.com/apache/causeway/blob/982de018229db2a097080ade53ccfbb4cceffd12/commons/src/main/java/org/apache/causeway/commons/internal/codec/_DocumentFactories.java
> 1. In `public Document parseDocument(final @Nullable String xml)`, you can 
> avoid the getBytes call, which wastes memory and may bake in an incorrect 
> assumption about the char encoding. Not all XML originates as UTF-8, and if 
> you already have it in String format, you don't need to convert it back to 
> bytes (forcing the XML parser to turn it back into chars).
> ```
>         // StringReader (not StringWriter) wraps an existing String as a Reader
>         try(var sr = new StringReader(xml)) {
>             var doc = documentBuilder.parse(new InputSource(sr));
>             return doc;
>         }
> ```

This message was sent by Atlassian Jira
