Hi Joost…

Hmm, unfortunately that is not what happens in practice. Any literal
newlines/returns in attribute values are turned into spaces when the XML is
parsed (as far as I know this is the specified behavior: attribute-value
normalization), so "\r\n" comes back as a single space. This is also what
happens here with clojure.data.xml:
(prn (-> (xml/emit-str (xml/element :foo {:bar "Baz\r\nquux"}))
         (xml/parse-str)))
;; => #xml/element{:tag :foo, :attrs {:bar "Baz quux"}}
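
To make the difference concrete, here is a minimal sketch that compares a literal CR LF with character references in an attribute value. It uses the JDK DOM parser through interop rather than clojure.data.xml, and `escape-attr` and `parse-bar` are hypothetical helper names made up for this example, not part of any library:

```clojure
(ns attr-newlines.sketch
  (:require [clojure.string :as str])
  (:import (java.io ByteArrayInputStream)
           (javax.xml.parsers DocumentBuilderFactory)))

(defn escape-attr
  "Hypothetical helper: escapes markup characters and writes tab, LF and CR
  as character references, so attribute-value normalization leaves them alone."
  [s]
  (str/escape s {\&       "&amp;"
                 \<       "&lt;"
                 \>       "&gt;"
                 \"       "&quot;"
                 \tab     "&#9;"
                 \newline "&#10;"
                 \return  "&#13;"}))

(defn parse-bar
  "Parses the XML string with the JDK DOM parser and returns the value of
  the bar attribute on the root element."
  [xml]
  (let [builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))
        doc     (.parse builder (ByteArrayInputStream. (.getBytes xml "UTF-8")))]
    (.getAttribute (.getDocumentElement doc) "bar")))

(def raw "Baz\r\nquux")

;; Literal CR LF inside the attribute: the parser normalizes it to one space.
(def literal-result
  (parse-bar (str "<foo bar=\"" raw "\"/>")))

;; CR LF written as character references: the value survives the round trip.
(def escaped-result
  (parse-bar (str "<foo bar=\"" (escape-attr raw) "\"/>")))
```

Note that the output of `escape-attr` cannot simply be handed to xml/element, because emit-str would escape the ampersands again (as shown in the original post). The sketch only applies where the attribute text is written out by hand, or where the writer can be made to emit character references for these characters.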

Ciao

…Jochen


On Thursday, November 9, 2017 at 11:56:18 AM UTC+1, Joost wrote:
>
> Hi Jochen
>
> Since newlines and crs are allowed in attribute values, you don't need to 
> escape them. The correctly escaped version and the unescaped version of the 
> XML are exactly equivalent.
>
> Joost.
>
> On Thursday, November 9, 2017 at 9:48:07 AM UTC+1, Jochen wrote:
>>
>> Hi…
>>
>> I am currently facing an unexpected problem with newline and return 
>> characters in XML attribute values.
>>
>> I first found it in good old clojure.xml and thought I could fix it with 
>> clojure.data.xml (tried 0.0.8 and 0.2.0-alpha3), but it did not help.
>>
>> When I read in some external XML with escaped cr/lf (&#13;&#10;) in an 
>> attribute value, it is parsed correctly, but when I write it out again the 
>> \r\n appears in the output unescaped, breaking the attribute value on the 
>> next read.
>>
>> Here is some real stuff showing the issue:
>>   ;; cr lf are not escaped:
>>   (prn (xml/emit-str (xml/element :foo {:bar "Baz\r\nquux"})))
>>   ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz\r\nquux\"></foo>"
>>
>>   ;; if we escape manually, the ampersand is escaped:
>>   (prn (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"})))
>>   ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&amp;#13;&amp;#10;quux\"></foo>"
>>   
>>   ;; we can fix it after the fact with some string replacer, but this
>>   ;; feels really hacky:
>>   (prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
>>            (str/replace "&amp;#" "&#")))
>>   ;; => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo bar=\"Baz&#13;&#10;quux\"></foo>"
>>
>>   ;; although it is then reparsed correctly: 
>>   (prn (-> (xml/emit-str (xml/element :foo {:bar "Baz&#13;&#10;quux"}))
>>            (str/replace "&amp;#" "&#")
>>            (xml/parse-str)))
>>   ;; => #clojure.data.xml.Element{:tag :foo, :attrs {:bar "Baz\r\nquux"}, :content ()}
>>
>> Looking into the emit code, it seems like this is a Java XMLStreamWriter 
>> issue?!
>>
>> Any idea how to fix this in a clean way?
>>
>> Ciao
>>
>> …Jochen
>>
>>
>  

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en