It would probably help if you gave a real example: a REST call that returns something (presumably JSON or XML) containing a problematic URL.
FWIW, the following do also work:

('https://en.wikipedia.org/wiki/' , 'Česká republika' urlEncoded) asUrl.

('https://en.wikipedia.org/wiki/' , 'Česká republika' urlEncoded) asUrl retrieveContents.

> On 10 Sep 2018, at 14:16, Petr Fischer via Pharo-users
> <pharo-users@lists.pharo.org> wrote:
>
> From: Petr Fischer <petr.fisc...@me.com>
> Subject: Re: [Pharo-users] ZnURL and parsing URL with diacritics
> Date: 10 September 2018 at 14:16:53 GMT+2
> To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
>
> OK. Thanks for examples. But in my case, the bad URL (with diacritics) comes
> directly from the Zomato.com REST API (they probably do not read specs), so
> I'll end up with a few "hacks" with strings.
>
> pf
>
>> Hi,
>>
>>> On 10 Sep 2018, at 12:53, PBKResearch <pe...@pbkresearch.co.uk> wrote:
>>>
>>> Hi Petr
>>>
>>> I have used #urlEncoded in the past, with success, to deal with German
>>> umlauts. The secret is to urlEncode just the part containing the
>>> diacritics. If you encode the whole url, the slashes are encoded, and this
>>> confuses Zinc, which segments the url before decoding.
>>>
>>> So I would expect you to be able to read your file with:
>>>
>>> ZnEasy get: 'http://domain.com/','ěščýž.html' urlEncoded.
>>>
>>> However, this also fails with ‘ASCII character expected’, and I can’t
>>> understand why. The debug trace has too many levels for me to understand.
>>> Zinc is evidently getting in a mess trying to decode the urlEncoded string,
>>> but if we try:
>>>
>>> 'ěščýž.html' urlEncoded urlDecoded
>>>
>>> as a separate operation, it works OK.
>>>
>>> I think only Sven can explain this for you.
>>
>> The external representation of a URL with special characters is not the same
>> as what an address bar or browser search field accepts. The latter is quite
>> intelligent and accepts much broader input.
>>
>> ZnUrl parses the official external representation according to the spec.
>>
>> Internally, ZnUrl represents all components as resolved strings. The
>> solution is to construct difficult/special URLs by hand.
>>
>> Here is an example: let's say we want to access the English Wikipedia page
>> of the Czech Republic (the country) using its native name 'Česká republika'
>> (which is not only non-ASCII, but non-Latin1 as well, so it needs a
>> WideString and UTF-8 encoding).
>>
>> Here is one way to construct such a string.
>>
>> ZnUrl new
>>   scheme: #http;
>>   host: 'en.wikipedia.org';
>>   addPathSegment: 'wiki';
>>   addPathSegment: 'Česká republika';
>>   yourself.
>>
>> Which gives a URL with the following external representation:
>>
>> http://en.wikipedia.org/wiki/%C4%8Cesk%C3%A1%20republika
>>
>> This can be parsed without problems.
>>
>> 'http://en.wikipedia.org/wiki/%C4%8Cesk%C3%A1%20republika' asUrl.
>>
>> You can send #retrieveContents to a URL to actually fetch it.
>>
>> ZnUrl new
>>   scheme: #http;
>>   host: 'en.wikipedia.org';
>>   addPathSegment: 'wiki';
>>   addPathSegment: 'Česká republika';
>>   retrieveContents.
>>
>> Or you could use the url in a ZnClient object.
>>
>> BTW, there are many ways to construct URLs, I would maybe do the following.
>>
>> 'https://en.wikipedia.org/wiki' asUrl
>>   addPathSegment: 'Česká republika';
>>   yourself.
>>
>> Or something like
>>
>> ZnClient new
>>   url: 'https://en.wikipedia.org/wiki';
>>   addPathSegment: 'Česká republika';
>>   get.
>>
>> HTH,
>>
>> Sven
>>
>>> HTH
>>>
>>> Peter Kenny
>>>
>>> From: Pharo-users <pharo-users-boun...@lists.pharo.org> On Behalf Of Petr Fischer via Pharo-users
>>> Sent: 10 September 2018 10:07
>>> To: pharo-users@lists.pharo.org
>>> Cc: Petr Fischer <petr.fisc...@me.com>
>>> Subject: [Pharo-users] ZnURL and parsing URL with diacritics
>>>
>>> Hello,
>>>
>>> when I try to parse this URL asUrl, error "ZnCharacterEncodingError: ASCII
>>> character expected" occurs:
>>>
>>> 'http://domain.com/ěščýž.html' asUrl.
>>>
>>> this also does not work:
>>>
>>> ZnEasy get: 'http://domain.com/ěščýž.html'
>>>
>>> How to solve this? In the web browser, URL with diacritics is OK.
>>>
>>> I tried also this:
>>>
>>> ZnEasy get: 'http://domain.com/ěščýž.html' urlEncoded.
>>>
>>> but this cripples the whole URL.
>>>
>>> Thanks! Petr Fischer
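
For the case mentioned in the thread where a malformed URL arrives as a ready-made string from a third-party API, here is a minimal, untested sketch of one possible string "hack". It assumes the string is otherwise well formed and that only stray non-ASCII characters need escaping. The variable names are illustrative; #urlEncoded and #asUrl are the messages already used in this thread.

```smalltalk
"Sketch only: percent-encode just the non-ASCII characters of a URL string,
 leaving the scheme, host, slashes and any existing %XX escapes untouched.
 'raw' and 'fixed' are hypothetical names, not part of Zinc."
| raw fixed |
raw := 'http://domain.com/ěščýž.html'.
fixed := String streamContents: [ :out |
	raw do: [ :each |
		each asInteger > 127
			ifTrue: [ out nextPutAll: each asString urlEncoded ]
			ifFalse: [ out nextPut: each ] ] ].
"The result contains only ASCII, so it can be handed to the parser."
fixed asUrl.
```

This does not handle other characters that are illegal in a URL, such as unescaped spaces. When the individual path segments are known, building the URL with #addPathSegment: as shown in the quoted reply is probably the more robust route.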