I've got an issue where the clojure.xml/parse and /emit functions are
not symmetric with respect to how attributes are read and written.
The parser decodes HTML entities (e.g. &amp; -> &), but the emitter
does not re-encode them:

user> (require ['clojure.xml :as 'xml])
nil
user> (xml/emit (xml/parse (org.xml.sax.InputSource.
                               (java.io.StringReader.
                                "<?xml version='1.0' encoding='UTF-8'?><whatever name='Stuff &amp; Things' />"))))

<?xml version='1.0' encoding='UTF-8'?>
<whatever name='Stuff & Things'/>
nil

As the decoding seems to be done in the SAX parser, I suppose the
easiest way to handle this issue is to re-encode the attributes before
they are written.  I wrote this (not saying it's the best way by any
means):

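(defn encode-html-entities
  "Escapes the characters that are special in XML attribute values."
  [s]
  ;; "&" must be replaced first, otherwise the ampersands introduced by
  ;; the later replacements would themselves be escaped.
  (loop [ret s
         [[from to] & more] [["&" "&amp;"]
                             ["'" "&apos;"]
                             ["\"" "&quot;"]
                             ["<" "&lt;"]
                             [">" "&gt;"]]]
    (if (nil? from)
      ret
      ;; .replace substitutes literally; .replaceAll would treat
      ;; `from` as a regular expression.
      (recur (.replace ret from to)
             more))))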

This could be put in clojure.xml/emit-element around the call to (val
attr).
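
A rough sketch of what that change might look like. The function names
emit-element-escaped and escape-attr are made up for illustration; this is
modelled on the general shape of emit-element (print the tag, then each
attribute, then the content), not copied from the clojure.xml source, and
escape-attr is just a compact stand-in for encode-html-entities above:

```clojure
;; Hypothetical helper: escapes the five predefined XML entities.
;; "&" comes first so ampersands added by later steps are not re-escaped.
(defn escape-attr [s]
  (reduce (fn [acc [from to]] (.replace acc from to))
          (str s)
          [["&" "&amp;"] ["'" "&apos;"] ["\"" "&quot;"]
           ["<" "&lt;"] [">" "&gt;"]]))

;; Hypothetical stand-in for clojure.xml/emit-element. The only point of
;; interest is that each attribute value passes through escape-attr
;; before it is printed.
(defn emit-element-escaped [e]
  (if (string? e)
    (println e)
    (do
      (print (str "<" (name (:tag e))))
      (doseq [[k v] (:attrs e)]
        (print (str " " (name k) "='" (escape-attr v) "'")))
      (if (:content e)
        (do
          (println ">")
          (doseq [c (:content e)]
            (emit-element-escaped c))
          (println (str "</" (name (:tag e)) ">")))
        (println "/>")))))

(emit-element-escaped {:tag :whatever :attrs {:name "Stuff & Things"}})
;; prints: <whatever name='Stuff &amp; Things'/>
```

Text content is printed unescaped in this sketch; the same helper could
presumably be applied there as well.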

Thoughts?
-Wayne