* Shane Dempsey (shdempse) wrote:
> I am using libxml2 and the xmlTextReader to parse the xml content below.
>
>Libxml somehow interprets the content contained in the xml node and uses 
>that information to encode the parsed content resulting in the insertion 
>of the Â character. Is there a way to stop the libxml2 from interpreting 
>this i.e. charset=iso-8859-15?
>
>XML to process :
>==============
><SPAN style="FONT-STYLE: normal; FONT-FAMILY: Segoe UI; COLOR: #1a1a1a; 
>FONT-SIZE: 10pt; FONT-WEIGHT: normal; TEXT-DECORATION: none">&nbsp;meta 
>http-equiv="content-type" content="text/html; charset=iso-8859-15" 
>/</SPAN>
>
>Processed XML
>=============
><span>Â meta http-equiv=&quot;content-type&quot; 
>content=&quot;text/html; charset=iso-8859-15&quot; /</span>

Your XML document is not well-formed: `&nbsp;` is not one of the
pre-defined named entities and there is no document type declaration
that could define it. So you are probably not showing us the whole
input, or at least not telling us exactly how you are processing it.
Anyway, `nbsp` is usually defined as U+00A0, a non-breaking space. The
UTF-8 encoding of that character is the two bytes 0xC2 0xA0, and when
those bytes are incorrectly interpreted as ISO-8859-x (in ISO-8859-1
and ISO-8859-15 the 0xC2 byte is `Â`) the result will look similar to
the string you say is being inserted.
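
A minimal sketch of both points, in Python with only the standard
library rather than libxml2, and assuming a trimmed-down stand-in for
your fragment. The names `FRAGMENT`, `WITH_DOCTYPE`, `is_well_formed`
and `mojibake` are made up for illustration and are not part of any
libxml2 API. It checks that the fragment is rejected without a
declaration for `nbsp`, that an internal subset defining it makes the
document parse, and what U+00A0 looks like once its UTF-8 bytes are
decoded with the wrong charset.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the fragment in the question.
FRAGMENT = '<SPAN>&nbsp;meta</SPAN>'

# Same fragment, preceded by an internal subset that defines nbsp as U+00A0.
WITH_DOCTYPE = '<!DOCTYPE SPAN [<!ENTITY nbsp "&#160;">]>' + FRAGMENT


def is_well_formed(doc):
    """Return True if the standard-library parser accepts doc."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def mojibake(text, wrong_encoding="iso-8859-15"):
    """Encode text as UTF-8, then decode those bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_encoding)


if __name__ == "__main__":
    print(is_well_formed(FRAGMENT))        # nbsp is undefined here
    print(is_well_formed(WITH_DOCTYPE))    # nbsp is declared here
    text = ET.fromstring(WITH_DOCTYPE).text
    print(ascii(text))                     # escaped view of the parsed text
    print(ascii(mojibake(text)))           # escaped view after mis-decoding
```

If something in your pipeline writes the parsed text out as UTF-8 and
a later step reads it as ISO-8859-15, the second `ascii()` line shows
the kind of string you would end up with.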
-- 
Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
