On Fri, Feb 27, 2009 at 2:35 PM, Shawn Erickson <[email protected]> wrote:
> On Fri, Feb 27, 2009 at 2:15 PM, Martin Wierschin <[email protected]> wrote:
>
>> On 2009.02.27, at 5:58 AM, Michael Ash wrote:
>>
>>> HFS+ only accepts non-UTF-8 by URL-encoding (!) the non-UTF-8 bytes
>>
>> Wow, that's pretty horrific.
>
> It also isn't really correct. HFS+ doesn't use UTF-8; it uses and
> stores Unicode (fully decomposed and in canonical order).

Mostly decomposed :) (i.e. it doesn't use NFD, but it uses something
pretty close to NFD)

> http://developer.apple.com/technotes/tn/tn1150.html#HFSPlusNames
>
> I don't think URL encoding ever comes into play in HFS+ or in the POSIX
> APIs that take UTF-8 (decomposed) paths

The POSIX APIs take UTF-8 regardless of whether it is composed or
decomposed. That is, on HFS+ both of these lines open the same
file:

    fopen("\xC3\xA9","w"); //é, composed
    fopen("e\xCC\x81","w"); //é, decomposed
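If you want to check this on your own volume, something like the following should do. It is only a sketch: same_file() and composed_matches_decomposed() are names I made up for illustration, and the comparison relies on stat() reporting the same device and inode for both spellings. On a filesystem that doesn't normalize names, the second spelling simply won't exist and you'll get -1.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths resolve to the same file,
   0 if they resolve to different files, -1 if either cannot be stat'ed. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Hypothetical helper: creates a file under the composed spelling of "é"
   in the current directory, then looks it up under the decomposed spelling.
   Returns whatever same_file() reports, or -1 if the file can't be created. */
int composed_matches_decomposed(void)
{
    const char *composed   = "\xC3\xA9";   /* U+00E9 */
    const char *decomposed = "e\xCC\x81";  /* U+0065 U+0301 */
    FILE *f = fopen(composed, "w");
    int result;

    if (f == NULL)
        return -1;
    fclose(f);
    result = same_file(composed, decomposed);
    remove(composed);                      /* clean up the test file */
    return result;
}
```

Call composed_matches_decomposed() from main() in a scratch directory and print the result; I'd expect 1 on an HFS+ volume, but treat that as something to verify rather than a guarantee.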

>... not sure what Michael is
> talking about.

On Leopard, bytes that aren't valid UTF-8 are indeed percent-escaped:

[c...@ccox-macbook:~/temp]% ls
a.out   test.c
[c...@ccox-macbook:~/temp]% cat test.c
#include <stdio.h>

int main() {
    fopen("\"\xFF\"","w");
    return 0;
}
[c...@ccox-macbook:~/temp]% cc test.c && ./a.out
[c...@ccox-macbook:~/temp]% ls
"%FF"   a.out   test.c
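To make the visible behavior concrete, here is a rough user-space sketch that produces the same kind of name as the listing above. To be clear, this is not the kernel's code: escape_invalid_utf8() and utf8_seq_len() are hypothetical names, and I'm assuming that each byte that isn't part of a well-formed UTF-8 sequence becomes %XX while everything else passes through untouched. How the real implementation treats edge cases (a literal '%', for instance) is not something this sketch tries to answer.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns the length (1-4) of the well-formed UTF-8 sequence at s,
   or 0 if the bytes there are not valid UTF-8. n = bytes remaining. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned char b = s[0];
    unsigned long cp;
    size_t len, i;

    if (b < 0x80) return 1;
    else if (b >= 0xC2 && b <= 0xDF) { len = 2; cp = b & 0x1F; }
    else if (b >= 0xE0 && b <= 0xEF) { len = 3; cp = b & 0x0F; }
    else if (b >= 0xF0 && b <= 0xF4) { len = 4; cp = b & 0x07; }
    else return 0;               /* stray continuation, C0/C1, F5..FF */

    if (n < len) return 0;       /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (len == 3 && cp < 0x800) return 0;                      /* overlong */
    if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return 0; /* range */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;                /* surrogate */
    return len;
}

/* Copies in to out, replacing each invalid byte with %XX.
   Returns 0 on success, -1 if out (of size outsz) is too small. */
int escape_invalid_utf8(const char *in, char *out, size_t outsz)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = strlen(in), i = 0, o = 0;

    while (i < n) {
        size_t len = utf8_seq_len(s + i, n - i);
        if (len == 0) {
            if (o + 3 >= outsz) return -1;
            snprintf(out + o, 4, "%%%02X", (unsigned)s[i]);
            o += 3;
            i += 1;
        } else {
            if (o + len >= outsz) return -1;
            memcpy(out + o, s + i, len);
            o += len;
            i += len;
        }
    }
    if (o >= outsz) return -1;
    out[o] = '\0';
    return 0;
}
```

Feeding it the name from test.c (a quote, the byte 0xFF, a quote) gives the same string that ls printed, and a valid sequence such as "\xC3\xA9" comes back unchanged.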

-- 
Clark S. Cox III
[email protected]
_______________________________________________

Cocoa-dev mailing list ([email protected])
