On Aug 27, 2012, at 12:43 PM, David Duncan wrote:

> On Aug 27, 2012, at 9:50 AM, Sean McBride <s...@rogue-research.com> wrote:
> 
>> On Sat, 25 Aug 2012 17:58:39 +0200, Uli Kusterer said:
>> 
>>>>>    const UInt8 *cpath = (const UInt8 *)[path cStringUsingEncoding:NSUTF8StringEncoding];
>>>> 
>>>> -UTF8String is shorter.
>>> 
>>> Both of these are wrong, though. You should *always* use
>>> -fileSystemRepresentation when you need a C-string representation of a
>>> path. Otherwise you might get decomposed characters that don't match the
>>> actual way the characters are stored on disk, and will create a second
>>> file with an almost-indistinguishable name.
>> 
>> Could you provide an example filename where UTF8String and 
>> fileSystemRepresentation give something different?  I'd like to run it 
>> through QA...
> 
> 
> The primary difference is that the fileSystemRepresentation uses a particular 
> form of Unicode composition (if I recall correctly, Decomposed form D). As 
> such if you had a UTF-8 encoded Ä (Latin Capital Letter A with Diaeresis) vs 
> A¨ (Latin Capital Letter A with Combining Diaeresis) then you could get 
> different results (and I believe the former would be an invalid file name).
> 
> Note: this is primarily from memory so this exact example may or may not 
> fail, but should give you a framework for finding ones that do.

For reference, from 
<https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPInternational/Articles/FileEncodings.html#//apple_ref/doc/uid/20002137-SW1>:

> All BSD system functions expect their string parameters to be in UTF-8 
> encoding and nothing else. Code that calls BSD system routines should ensure 
> that the contents of all const *char parameters are in canonical UTF-8 
> encoding. In a canonical UTF-8 string, all decomposable characters are 
> decomposed; for example, é (0x00E9) is represented as e (0x0065) + ´ 
> (0x0301). To put things into a canonical UTF-8 encoding, use the “file-system 
> representation” interfaces defined in Cocoa (including Core Foundation).
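
To make the byte-level difference concrete, here is a minimal sketch in Python 
(not Cocoa). It assumes that standard Unicode NFD is a close enough stand-in 
for what -fileSystemRepresentation produces; HFS+ uses its own variant of 
decomposition, so the two can differ for some characters. The helper name 
utf8_forms is made up for this illustration.

```python
import unicodedata

def utf8_forms(name):
    """Return (bytes as given, bytes after NFD decomposition) for a name.

    Hypothetical helper: standard NFD only approximates the file-system
    representation used on HFS+.
    """
    as_given = name.encode("utf-8")
    decomposed = unicodedata.normalize("NFD", name).encode("utf-8")
    return as_given, decomposed

# U+00E9 (precomposed e with acute), the example from the quoted document.
e_given, e_decomposed = utf8_forms("\u00e9")
print(e_given)       # b'\xc3\xa9'
print(e_decomposed)  # b'e\xcc\x81'  (e followed by U+0301)

# U+00C4 (precomposed A with diaeresis), the example from earlier in the thread.
a_given, a_decomposed = utf8_forms("\u00c4")
print(a_given)       # b'\xc3\x84'
print(a_decomposed)  # b'A\xcc\x88'  (A followed by U+0308)
```

A precomposed name like either of these would be a reasonable candidate to 
hand to QA, since its -UTF8String bytes should differ from its decomposed form.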

Also, of note, from 
<http://developer.apple.com/library/mac/qa/qa1173/_index.html#//apple_ref/doc/uid/DTS10001705-CH1-SECCOMPATIBILITYNOTES>,
 which is targeted at those implementing file systems for Mac OS X:

> In theory the techniques described above can cause compatibility problems for 
> applications. For example, if an application creates a file using a 
> precomposed name and then iterates through the directory looking for that 
> file using a simple binary string comparison, it won't find the file. In 
> practice this is rarely a problem.

If your code does this sort of thing, you should use -[NSString compare:] (or 
-compare:options:... without the NSLiteralSearch option) to compare strings.
Don't use -isEqual: or -isEqualToString:, because those do a literal comparison,
which treats precomposed and decomposed forms of the same character as unequal.
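
The same pitfall can be sketched outside Cocoa. The Python below is only an 
analogue, not what -compare: actually does internally: it contrasts a literal 
comparison with one that normalizes both names first. The function name 
same_name and the file names are invented for the example.

```python
import unicodedata

def same_name(a, b):
    """Compare two names after normalizing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

created = "\u00c4.txt"   # name the application asked for (precomposed)
listed = "A\u0308.txt"   # name a directory listing might hand back (decomposed)

print(created == listed)           # False: literal comparison misses the file
print(same_name(created, listed))  # True: equivalent once both are normalized
```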

Regards,
Ken

