--- Begin Message ---
OK. Thanks for examples. But in my case, the bad URL (with diacritics) comes
directly from the Zomato.com REST API (they probably do not read specs), so
I'll end up with a few "hacks" with strings.

pf

> Hi,
>
> > On 10 Sep 2018, at 12:53, PBKResearch <pe...@pbkresearch.co.uk> wrote:
> >
> > Hi Petr
> >
> > I have used #urlEncoded in the past, with success, to deal with German
> > umlauts. The secret is to urlEncode just the part containing the
> > diacritics. If you encode the whole url, the slashes are encoded, and this
> > confuses Zinc, which segments the url before decoding.
> >
> > So I would expect you to be able to read your file with:
> >
> > ZnEasy get: 'http://domain.com/', 'ěščýž.html' urlEncoded.
> >
> > However, this also fails with ‘ASCII character expected’, and I can’t
> > understand why. The debug trace has too many levels for me to understand.
> > Zinc is evidently getting in a mess trying to decode the urlEncoded string,
> > but if we try:
> >
> > 'ěščýž.html' urlEncoded urlDecoded
> >
> > as a separate operation, it works OK.
> >
> > I think only Sven can explain this for you.
>
> The external representation of a URL with special characters is not the same
> as what an address bar or browser search field accepts. The latter is quite
> intelligent and accepts much broader input.
>
> ZnUrl parses the official external representation according to the spec.
>
> Internally, ZnUrl represents all components as resolved strings. The solution
> is to construct difficult/special URLs by hand.
>
> Here is an example: let's say we want to access the English Wikipedia page of
> the Czech Republic (the country) using its native name 'Česká republika'
> (which is not only non-ASCII, but non-Latin1 as well, so it needs a
> WideString and UTF-8 encoding).
>
> Here is one way to construct such a string.
>
>   ZnUrl new
>     scheme: #http;
>     host: 'en.wikipedia.org';
>     addPathSegment: 'wiki';
>     addPathSegment: 'Česká republika';
>     yourself.
>
> Which gives a URL with the following external representation:
>
>   http://en.wikipedia.org/wiki/%C4%8Cesk%C3%A1%20republika
>
> This can be parsed without problems.
>
>   'http://en.wikipedia.org/wiki/%C4%8Cesk%C3%A1%20republika' asUrl.
>
> You can send #retrieveContents to a URL to actually fetch it.
>
>   ZnUrl new
>     scheme: #http;
>     host: 'en.wikipedia.org';
>     addPathSegment: 'wiki';
>     addPathSegment: 'Česká republika';
>     retrieveContents.
>
> Or you could use the url in a ZnClient object.
>
> BTW, there are many ways to construct URLs, I would maybe do the following.
>
>   'https://en.wikipedia.org/wiki' asUrl addPathSegment: 'Česká republika';
>     yourself.
>
> Or something like
>
>   ZnClient new
>     url: 'https://en.wikipedia.org/wiki';
>     addPathSegment: 'Česká republika';
>     get.
>
> HTH,
>
> Sven
>
> > HTH
> >
> > Peter Kenny
> >
> >
> > From: Pharo-users <pharo-users-boun...@lists.pharo.org> On Behalf Of Petr
> > Fischer via Pharo-users
> > Sent: 10 September 2018 10:07
> > To: pharo-users@lists.pharo.org
> > Cc: Petr Fischer <petr.fisc...@me.com>
> > Subject: [Pharo-users] ZnURL and parsing URL with diacritics
> >
> > Hello,
> >
> > when I try to parse this URL asUrl, error "ZnCharacterEncodingError: ASCII
> > character expected" occurs:
> >
> > 'http://domain.com/ěščýž.html' asUrl.
> >
> > this also does not work:
> >
> > ZnEasy get: 'http://domain.com/ěščýž.html'
> >
> > How to solve this? In the web browser, URL with diacritics is OK.
> >
> > I tried also this:
> >
> > ZnEasy get: 'http://domain.com/ěščýž.html' urlEncoded.
> >
> > but this cripples the whole URL.
> >
> > Thanks! Petr Fischer
--- End Message ---
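For a raw URL string that already arrives with diacritics in it (as in the REST API case described in the message), one possible string "hack" is to split off the offending part and hand it to ZnUrl as a path segment, in the spirit of the examples in the thread. The following is only a rough sketch, not a tested recipe: it assumes that the scheme and host are plain ASCII, that only the last path segment contains non-ASCII characters, and that there is no query or fragment. The variable names and the sample string are illustrative.

```smalltalk
"Sketch only: assumes scheme and host are plain ASCII and that only
 the last path segment of the incoming string needs encoding."
| raw url |
raw := 'http://domain.com/ěščýž.html'.   "stand-in for the string returned by the API"

"Parse everything up to the last slash as an ordinary URL"
url := (raw copyUpToLast: $/) asUrl.

"Add the rest as an unencoded path segment; ZnUrl percent-encodes it on output"
url addPathSegment: (raw copyAfterLast: $/).

url printString.
```

The resulting ZnUrl could then be fetched with #retrieveContents or passed to a ZnClient, as described in the thread. URLs with diacritics in several segments, or in the query part, would need a more careful split than this.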
Re: [Pharo-users] ZnURL and parsing URL with diacritics
Petr Fischer via Pharo-users Mon, 10 Sep 2018 05:18:14 -0700