Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

Sven Van Caekenberghe Thu, 28 Jul 2016 14:30:26 -0700

In my older work image, the following just works:

XMLDOMParser parse:
    ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl retrieveContents).


But I guess that is because my (older) XML parser version either ignores the 
declared encoding or is more lenient about invalid input.

You could try editing the incoming file, or have a look at #decodesCharacters:, 
for example:

(XMLDOMParser on:
    ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl retrieveContents) readStream)
        decodesCharacters: false;
        parseDocument.

But I am no expert in the deeper aspects of XML Support.
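
To illustrate why a strict parser may reject the file, here is a minimal 
sketch. It is not taken from the file itself: it assumes the non-breaking 
space (U+00A0) arrives as the single byte 160, as it would in Latin-1, and 
that the Zinc convenience methods #utf8Encoded and #utf8Decoded and the 
ZnInvalidUTF8 error class are present in your image.

```smalltalk
"Properly encoded as UTF-8, U+00A0 takes two bytes."
(String with: (Character value: 160)) utf8Encoded.
"expected: #[194 160]"

"The two-byte form decodes back to a one-character string."
#[194 160] utf8Decoded.

"A lone byte 160 is a UTF-8 continuation byte with no lead byte,
 so a strict decoder should signal an error (assumed here to be ZnInvalidUTF8)."
[ #[160] utf8Decoded ]
    on: ZnInvalidUTF8
    do: [ :error | error messageText ].
```

If the last expression answers an error message rather than a string, the 
input is simply not valid UTF-8, whatever its XML declaration says.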

> On 28 Jul 2016, at 22:29, Sean P. DeNigris <s...@clipperadams.com> wrote:
> 
> Sven Van Caekenberghe-2 wrote
>> Your XML file is not UTF-8 encoded, it is plain Unicode. At least the way
>> it is served from the URL you gave.
>> ..
>> You see ?
> 
> Unfortunately, no! ha ha. I didn't generate the file and I took its
> assertion that it was UTF-8 at face value. How do I properly feed the file
> into XMLParser?
> 
> 
> 
> -----
> Cheers,
> Sean
