Hi Sven,

the distinction between external and internal representations was the part I hadn't understood - but it really makes sense now.

Your approach with adding pathSegments works perfectly! I'm working with ZnClient though and reusing it for different requests. So I added the following two (three) methods to make my life easier:

ZnClient>>#addPathSegments: pathSegments
"Modify the receiver's path by adding the elements of pathSegments at the end"

        pathSegments do: [ :each | self addPathSegment: each ]

ZnClient>>#resetPath
        self path: ''

I didn't want to overwrite #path: because I only need this for some special edge case.

ZnClient>>#webdavPath: path
        self
                resetPath;
                addPathSegments: ($/ split: path)

BTW: The whole Zinc framework is a real pleasure to work with. Once I got used to thinking in terms of objects and not only strings I didn't want to look back :-)

CU,

Udo



On 03/12/14 00:02, Sven Van Caekenberghe wrote:
Hi Udo,

With a URL/URI there are two representations: the external one (the way they 
are written) and the internal one (what is really meant). ZnUrl follows this 
distinction.

When you say #asUrl (or #asZnUrl) you are actually parsing an external string 
representation. When doing so, percent decoding is done by ZnPercentEncoder. 
This class is strict, in that it does not allow non-safe, non-ascii characters 
in its input. AFAIK this is correct, but I can imagine a less strict 
interpretation (like the URL input box of a browser would allow). If you have a 
reading of the specs that says otherwise I would be very interested.

To save you from doing the encoding yourself, you have to construct the URL 
from its parts explicitly, like this:

ZnUrl new
   scheme: #http;
   host: 'myhost';
   addPathSegments: #('path' 'with' 'umlaut' 'äöü.txt');
   yourself.

  => http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt

Class comments and unit tests should help.
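
The external/internal distinction can be seen by looking at such a URL from both sides. What follows is only a minimal sketch, assuming a Pharo image with Zinc loaded and assuming that ZnUrl>>#pathSegments and ZnPercentEncoder>>#encode: behave as in recent Zinc versions (neither selector is shown elsewhere in this thread):

```smalltalk
| url |
"Build the URL from its parts; the segments are plain, unencoded strings."
url := ZnUrl new
   scheme: #http;
   host: 'myhost';
   addPathSegments: #('path' 'with' 'umlaut' 'äöü.txt');
   yourself.

"Internal representation: the last segment is kept as the readable string."
url pathSegments last.

"External representation: percent encoding is applied when the URL is printed."
url printString.

"Assumption: the encoder can also be applied directly to a single segment."
ZnPercentEncoder new encode: 'äöü.txt'.
```

Evaluating each expression with "Print it" in a Playground should show where the encoding actually happens, without any manual escaping of the segments.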

There is also this draft:

   http://stfx.eu/EnterprisePharo/Zinc-Encoding-Meta/

HTH,

Sven

PS: Incidentally, this does work

   'http://myhost/path/with/umlaut/äöü.txt' asFileReference asUrl.

because #asFileReference works differently.

On 02 Dec 2014, at 23:32, Udo Schneider <udo.schnei...@homeaddress.de> wrote:

All,

What's the expected behavior with non-ASCII characters in URLs? Let's say I want to access a 
file named "äöü.txt" - My assumption was that Zinc takes care of the UTF-8 -> 7bit 
(ASCII) -> Escape encoding. But there is either something I don't understand or some manual 
steps I'm missing.

The "straightforward" way doesn't work:
'http://myhost/path/with/umlaut/äöü.txt' asUrl.
"ZnCharacterEncodingError: ASCII character expected"

Although the actual encoding seems to be able to handle it (ignoring the 
escaped slashes for the moment):
'http://myhost/path/with/umlaut/äöü.txt' urlEncoded.
"'http%3A%2F%2Fmyhost%2Fpath%2Fwith%2Fumlaut%2F%C3%A4%C3%B6%C3%BC.txt'"

Creating a URL from already escaped characters works as well:
'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt' asUrl.
"http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt"

As does the decoding of such a URL:
'http://myhost/path/with/umlaut/%C3%A4%C3%B6%C3%BC.txt' urlDecoded.
"'http://myhost/path/with/umlaut/äöü.txt'"

At the moment I'm manually encoding UTF-8 characters in path segments before 
trying to build the URL. But is this the correct way?

Best Regards,

Udo








