Henry,

Thanks for the explanations. It's a bit clearer now. I'm still not sure how ZnUrl>>retrieveContents manages to decode correctly in this case; I'm sure I recall Sven saying it didn't (and in his view shouldn't) look at the HTTP declarations in the header. There is also the mystery of how the string reader in the XML-Parser package (XMLURI>>get) manages the same trick, given that it is presumably what XMLHTMLParser>>parseURL: uses, and that fails.
However, these are all second-order problems. It all begins because the Corriere web site does strange things with encoding, including using a UTF-8 character in a page declared as ISO-8859-1, as Paul pointed out. In any case, reading the page as a string and then parsing it solves my problem, so I shall stick to that as a standard procedure. Most importantly, I don't think there is any indication of a problem in the XML package for Monty to worry about.

Thanks again,
Peter

--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
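
As a footnote on the encoding point above, here is a minimal sketch of what a UTF-8 character inside an ISO-8859-1 page does to a decoder. It is in Python only because it is quick to run outside the image. The words and bytes are made up for illustration and are not taken from the actual Corriere page.

```python
# Illustrative bytes only: NOT taken from the actual Corriere page.
latin1_text = "città".encode("iso-8859-1")   # one byte (0xE0) for the accented letter
stray_utf8 = "é".encode("utf-8")             # two bytes (0xC3 0xA9) for the accented letter
page_bytes = latin1_text + b" " + stray_utf8

# Decoding with the declared charset never fails, but the stray
# UTF-8 character comes out as two separate Latin-1 characters.
as_latin1 = page_bytes.decode("iso-8859-1")
print(as_latin1)

# Strict UTF-8 decoding rejects the page because of the lone Latin-1 byte.
try:
    page_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)
```

Neither charset decodes such a page cleanly, which may be why the different fetch routes behave differently depending on how strict their decoders are.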