Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sven Van Caekenberghe
m: "Sven Van Caekenberghe"
>> To: "Any question about pharo is welcome"
>> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>>
>> In my older work image, the following just works:
>>
>> XMLDOMParser parse:
>> (

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty
Also, #parseURL:/#onURL: will use WebClient on Squeak (unless Zinc is present, of course).

> Sent: Thursday, July 28, 2016 at 6:15 PM
> From: monty
> To: pharo-users@lists.pharo.org
> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> Good for finding one

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty
their own XML-aware encoding on top of it.

> Sent: Thursday, July 28, 2016 at 5:29 PM
> From: "Sven Van Caekenberghe"
> To: "Any question about pharo is welcome"
> Subject: Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”
>
> In my olde

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sean P. DeNigris
monty-3 wrote
> You're double decoding

And in public, no less! Thanks. It works now with #parseFileNamed:.

Minus side: half a day wasted. Plus side: I wrote a compatibility layer for Magritte-XMLBinding so that #fromXmlNode: accepts SoupTags.

Cheers,
Sean

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty
e). So it gets decoded twice, and the decoded value of the char causes the error. I'll consider changing the heuristic to make it less eager to decode.

> Sent: Thursday, July 28, 2016 at 4:05 PM
> From: "Sean P. DeNigris"
> To: pharo-users@lists.pharo.org
> Subject: Re: [Ph
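The double decoding described in this message can be reproduced outside Pharo. The sketch below is in Python, not Smalltalk, and it is not the XMLParser heuristic itself: the function name decode_twice and the Latin-1 step that turns already-decoded characters back into bytes are assumptions made only to model "decoded twice".

```python
def decode_twice(raw: bytes) -> str:
    """Decode UTF-8 bytes, then wrongly decode the result a second time."""
    # First pass: the legitimate bytes -> characters step.
    first = raw.decode("utf-8")
    # Assumed model of the mistake: each decoded character is treated as
    # one byte with the same numeric value (U+00A0 becomes the byte 0xA0).
    as_bytes = first.encode("latin-1")
    # Second pass: a lone 0xA0 is a continuation byte without a lead byte,
    # so the UTF-8 decoder rejects it.
    return as_bytes.decode("utf-8")


def fails_on_second_pass(raw: bytes) -> bool:
    """Return True when the second decode raises a UTF-8 error."""
    try:
        decode_twice(raw)
        return False
    except UnicodeDecodeError:
        return True
```

Pure ASCII input survives both passes unchanged, which is probably why this kind of bug only shows up once a character such as U+00A0 appears in the data.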

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sven Van Caekenberghe
In my older work image, the following just works:

  XMLDOMParser parse: ('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl retrieveContents).

But I guess that is because my (older) XML parser version ignores the encoding, or is more lenient. You could try to edit the incoming file,

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sean P. DeNigris
Sven Van Caekenberghe-2 wrote
> Your XML file is not UTF-8 encoded, it is plain Unicode. At least the way
> it is served from the URL you gave.
> ..
> You see ?

Unfortunately, no! ha ha. I didn't generate the file and I took its assertion that it was UTF-8 at face value. How do I properly feed th

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sven Van Caekenberghe
Sean,

Your XML file is not UTF-8 encoded, it is plain Unicode. At least the way it is served from the URL you gave.

  (('http://forum.world.st/file/n4908531/illegal-UTF-sms.xml' asUrl retrieveContents) at: 72) = 160 asCharacter. "true"

Like you said,

  160 asCharacter asString utf8Encoded.
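For readers following along without a Pharo image, the relationship between code point 160 and its UTF-8 form can be checked in any language. This is a minimal Python sketch, not Pharo code; the helper name is_valid_utf8 is hypothetical and only there for illustration.

```python
# U+00A0 (NO-BREAK SPACE) is code point 160, the character compared
# against in the expression above.
nbsp = chr(160)

# As UTF-8, that single character occupies two bytes: 0xC2 0xA0.
encoded = nbsp.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two-byte sequence is valid UTF-8; the bare byte 160 is not.
two_byte_form_ok = is_valid_utf8(encoded)
lone_byte_ok = is_valid_utf8(bytes([160]))
```

So a string that already holds the character with value 160 is decoded text, and feeding that value to a UTF-8 decoder as if it were a raw byte would be rejected.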

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread Sean P. DeNigris
monty-3 wrote
> Just to be sure, I manually recreated your file (with the great Bless hex
> editor) and parsed it with no issue.

Thanks!

monty-3 wrote
> Please post your code and attach the actual source as a file separately.

The code is merely:

  messageLog := FileLocator home / 'illegal-UTF-s

Re: [Pharo-users] XMLParser Claims U+00A0 is “Invalid UTF-8”

2016-07-28 Thread monty
Just to be sure, I manually recreated your file (with the great Bless hex editor) and parsed it with no issue.

Please post your code and attach the actual source as a file separately.

> Sent: Thursday, July 28, 2016 at 3:12 PM
> From: "Sean P. DeNigris"
> To: pharo-users@lists.pharo.org
> Subje