On Wed, Aug 12, 2009 at 16:22, Meikel Brandmeyer<m...@kotka.de> wrote:
>
> Hi Stephen,
>
> On Aug 12, 3:57 pm, "Stephen C. Gilardi" <squee...@mac.com> wrote:
>> > I have to parse some XML files with c.x/parse. However the files
>> > contain UTF-8 characters, which end up as '?' after being parsed by
>> > c.x/parse. Is there some possibility to correctly parse the files? I
>> > suspect there is some settings somewhere in my Clojure/JVM/System
>> > which makes the whole thing fail, but I have no clue how to find out
>> > where to look...
>
>> Does this help get you going:
>>
>> http://groups.google.com/group/clojure/msg/0f6dc9ec66b852fe
>
> Thanks for the tip. Unfortunately, it doesn't help. Now everything is
> completely chopped to pieces.
>
>> More generally, you should also be able to specify the encoding by
>> arranging for an InputStreamReader with a properly specified
>> "charset" (like "UTF8") to wrap your input byte source.
>
> I tried, but c.x/parse only accepts an InputStream. I didn't find
> a way to set the charset and that one...

You shouldn't have to. XML is funny that way: an InputStream is a
stream of *bytes*, not characters, and the XML parser works out the
character encoding itself. It will parse the bytes as UTF-8 if it
doesn't find a <?xml ... ?> declaration specifying some other encoding
(or a byte order mark indicating UTF-16). So in your case it should
"just work", unless the files you believe to be UTF-8 aren't actually
UTF-8.

// Ben
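
One way to check the behaviour described above at the REPL is to feed
clojure.xml/parse raw bytes directly. This is only a minimal sketch: the
sample text, the <greeting> element and the parse-bytes helper are made
up for illustration and are not from the original files.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Hypothetical sample text with non-ASCII characters, written with \u
;; escapes so the encoding of this source file cannot interfere.
(def sample-text "Gr\u00fc\u00dfe")

(defn parse-bytes
  "Hypothetical helper: encodes xml-string with the named charset and
  hands the raw bytes to clojure.xml/parse, as if read from a file."
  [^String xml-string ^String charset]
  (xml/parse (ByteArrayInputStream. (.getBytes xml-string charset))))

;; Case 1: no encoding declaration, bytes are UTF-8.
(def utf8-result
  (parse-bytes (str "<?xml version=\"1.0\"?>"
                    "<greeting>" sample-text "</greeting>")
               "UTF-8"))

;; Case 2: bytes are ISO-8859-1 and the declaration says so.
(def latin1-result
  (parse-bytes (str "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                    "<greeting>" sample-text "</greeting>")
               "ISO-8859-1"))

;; Each comparison checks that the text survived parsing unchanged.
(= sample-text (first (:content utf8-result)))
(= sample-text (first (:content latin1-result)))
```

If both comparisons hold on your setup but the real files still come
out as '?', that would point at the files' actual encoding, or possibly
at how the parsed strings are printed, rather than at c.x/parse.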