XML expects the encoding to be declared in the prolog of the document itself; if the declaration is absent, the standard specifies UTF-8. So when you use an XML parser to parse an HTML page, it disregards any encoding given in the HTTP headers, treats the contents as an XML document with a missing prolog, and tries to parse it as UTF-8.
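To make the failure concrete, here is a minimal sketch in Python rather than Pharo, purely as a stand-in: the sample markup is invented, and Python's `xml.etree.ElementTree` is only assumed to behave like the XML parser discussed here when handed raw bytes with no encoding declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page body: Latin-1 bytes, no XML declaration in the prolog.
body = "<p>café</p>".encode("iso-8859-1")

try:
    # Given raw bytes and no declaration, the parser assumes UTF-8.
    ET.fromstring(body)
    parsed = True
except ET.ParseError as err:
    # The byte 0xE9 ("é" in Latin-1) is not a valid UTF-8 sequence here.
    parsed = False
    print("parse failed:", err)
```

The parser never sees the HTTP headers, so it has no way to know the bytes were Latin-1.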
When you use ZnUrl getContents, however, it respects the charset given in the HTTP Content-Type header, which correctly identifies the contents as ISO-8859-1, and lets you read them correctly into an internal string. When you subsequently parse that internal string, the XML parser doesn't attempt any conversion, and therefore it works.

Cheers,
Henry
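P.S. A rough sketch of the decode-first approach, again in Python as a stand-in rather than Pharo: the `header_charset` value is an assumption standing in for whatever the HTTP response actually declares, and the sample markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body as received on the wire (Latin-1 bytes).
body = "<p>café</p>".encode("iso-8859-1")

# Assumed charset, standing in for the one taken from the HTTP header.
header_charset = "iso-8859-1"

# Decode first, so the parser receives an already-decoded string.
text = body.decode(header_charset)
root = ET.fromstring(text)
print(root.text)
```

Since the string is already decoded, the parser has no byte-level encoding left to guess.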