After further investigation, I modified my copy of the xml parsing code to
wrap either an InputStream or a Reader with an org.xml.sax.InputSource (and
changed the type hint):

(defn ^:private sax-parse-fn
  [xml-input content-handler]
  ; Normalize the input: wrap an InputStream or Reader in an InputSource,
  ; pass an InputSource through unchanged, and reject anything else.
  (let [input-source (cond
                       (or (instance? InputStream xml-input)
                           (instance? Reader xml-input))
                       (org.xml.sax.InputSource. xml-input)

                       (instance? org.xml.sax.InputSource xml-input)
                       xml-input

                       :else (throw (ex-info (str "sax-parse-fn: xml-input must be one of "
                                                  "InputStream, Reader, or org.xml.sax.InputSource")
                                             {:type  (type xml-input)
                                              :class (class xml-input)})))]
    (it-> (SAXParserFactory/newInstance)
          (doto it
            (.setValidating false)
            (.setFeature "http://xml.org/sax/features/external-general-entities" false)
            (.setFeature "http://xml.org/sax/features/external-parameter-entities" false))
          (.newSAXParser it)
          (doto it
            (.setProperty "http://xml.org/sax/properties/lexical-handler" content-handler))
          (.parse it
                  ^org.xml.sax.InputSource input-source
                  ^org.xml.sax.helpers.DefaultHandler content-handler))))

(s/defn parse ; #todo fix docstring
  ([xml-input] (parse xml-input sax-parse-fn))
  ([xml-input parse-fn]
   (let [result-atom     (atom (xml-zip {:type :document :content nil}))
         content-handler (handler result-atom)]
     (parse-fn xml-input content-handler)
     ; #todo document logic vvv using xkcd & plain xml example
     (let [parsed-data (it-> @result-atom
                             (first it)
                             (:content it)
                             (drop-if #(= :dtd (:type %)) it)
                             (drop-if string? it)
                             (only it))]
       parsed-data))))

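
For anyone who wants to try the wrapping idea in isolation, here is a minimal sketch that uses only JDK classes and none of the helper macros above. The names ->input-source and element-names are made up for this illustration and are not part of any library; the handler just collects element names so there is something to look at.

```clojure
(import '(java.io ByteArrayInputStream InputStream Reader StringReader)
        '(javax.xml.parsers SAXParserFactory)
        '(org.xml.sax InputSource)
        '(org.xml.sax.helpers DefaultHandler))

(defn ->input-source
  "Hypothetical helper: same dispatch as sax-parse-fn, reduced to the wrapping step."
  [x]
  (cond
    (instance? InputStream x) (InputSource. ^InputStream x)
    (instance? Reader x)      (InputSource. ^Reader x)
    (instance? InputSource x) x
    :else (throw (ex-info "unsupported xml input" {:class (class x)}))))

(defn element-names
  "Hypothetical demo: parses x and returns the element names in document order."
  [x]
  (let [names   (atom [])
        ; proxy overrides DefaultHandler.startElement(uri, localName, qName, attributes)
        handler (proxy [DefaultHandler] []
                  (startElement [uri local-name q-name attrs]
                    (swap! names conj q-name)))]
    ; The type hints select the (InputSource, DefaultHandler) overload without reflection.
    (.parse (.newSAXParser (SAXParserFactory/newInstance))
            ^InputSource (->input-source x)
            ^DefaultHandler handler)
    @names))

; A Reader and an InputStream both go through the same InputSource overload.
(element-names (StringReader. "<a><b/><c/></a>"))
(element-names (ByteArrayInputStream. (.getBytes "<a><b/></a>" "UTF-8")))
```

Funnelling everything through InputSource means a single type-hinted call to parse covers all three input kinds, at the cost of dropping the String (URI) and File overloads unless they are added to the cond.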
On Sat, Mar 2, 2019 at 4:11 PM Matching Socks <[email protected]> wrote:
> Does this need adjusting in clojure.xml too? The code looks pretty
> similar:
>
> (defn startparse-sax [s ch]
> (.. SAXParserFactory (newInstance) (newSAXParser) (parse s ch)))
>
> The reflection on "parse" is convenient. There are multiple
> SAXParser.parse methods with unique capabilities. The String method allows
> you not to know how the data is encoded. To open an InputStream, you have
> to know the encoding. It's pretty hard to open an XML instance correctly!
> And the InputSource method allows you to parse from a Reader, e.g., a
> StringReader.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>