On Aug 27, 2012, at 12:43 PM, David Duncan wrote:

> On Aug 27, 2012, at 9:50 AM, Sean McBride <s...@rogue-research.com> wrote:
>
>> On Sat, 25 Aug 2012 17:58:39 +0200, Uli Kusterer said:
>>
>>>>> const UInt8 *cpath = (const UInt8 *)[path cStringUsingEncoding:NSUTF8StringEncoding];
>>>>
>>>> -UTF8String is shorter.
>>>
>>> Both of these are wrong, though. You should *always* use
>>> -fileSystemRepresentation when you need a C-string representation of a
>>> path. Otherwise you might get decomposed characters that don't match the
>>> actual way the characters are stored on disk, and will create a second
>>> file with an almost-indistinguishable name.
>>
>> Could you provide an example filename where UTF8String and
>> fileSystemRepresentation give something different? I'd like to run it
>> through QA...
>
> The primary difference is that the fileSystemRepresentation uses a particular
> form of Unicode composition (if I recall correctly, Decomposed form D). As
> such if you had a UTF-8 encoded Ä (Latin Capital Letter A with Diaeresis) vs
> A¨ (Latin Capital Letter A with Combining Diaeresis) then you could get
> different results (and I believe the former would be an invalid file name).
>
> Note: this is primarily from memory so this exact example may or may not
> fail, but should give you a framework for finding ones that do.
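
To make David's example concrete, here is a minimal sketch that prints the UTF-8 bytes of the two spellings of Ä. It is written in Python purely for illustration; it is not Cocoa code. It uses standard Unicode NFD from the unicodedata module as a stand-in for the decomposition that -fileSystemRepresentation performs. HFS+ uses its own variant of decomposition, so treat that as an assumption rather than an exact model. The helper names utf8_hex and same_name are made up for this example.

```python
import unicodedata

# U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS (precomposed form)
precomposed = "\u00C4"
# "A" followed by U+0308 COMBINING DIAERESIS (decomposed form)
decomposed = "A\u0308"


def utf8_hex(s):
    """Return the UTF-8 encoding of s as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in s.encode("utf-8"))


def same_name(a, b):
    """Compare two names after canonical decomposition (standard NFD).

    Assumption: NFD approximates the file-system form; it is not
    guaranteed to match it for every character.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


print(utf8_hex(precomposed))                # c3 84
print(utf8_hex(decomposed))                 # 41 cc 88
print(precomposed == decomposed)            # False: literal comparison
print(same_name(precomposed, decomposed))   # True: canonically equivalent
```

The two strings render identically but differ byte for byte, which is why a path built from the precomposed form can fail a simple binary comparison against a name read back from the directory.
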

For reference, from
<https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPInternational/Articles/FileEncodings.html#//apple_ref/doc/uid/20002137-SW1>:

> All BSD system functions expect their string parameters to be in UTF-8
> encoding and nothing else. Code that calls BSD system routines should ensure
> that the contents of all const *char parameters are in canonical UTF-8
> encoding. In a canonical UTF-8 string, all decomposable characters are
> decomposed; for example, é (0x00E9) is represented as e (0x0065) + ´
> (0x0301). To put things into a canonical UTF-8 encoding, use the “file-system
> representation” interfaces defined in Cocoa (including Core Foundation).

Also of note, from
<http://developer.apple.com/library/mac/qa/qa1173/_index.html#//apple_ref/doc/uid/DTS10001705-CH1-SECCOMPATIBILITYNOTES>,
which is targeted at those implementing file systems for Mac OS X:

> In theory the techniques described above can cause compatibility problems for
> applications. For example, if an application creates a file using a
> precomposed name and then iterates through the directory looking for that
> file using a simple binary string comparison, it won't find the file. In
> practice this is rarely a problem.

If your code does this sort of thing, you should use -[NSString compare:] (or
-compare:options:... without the NSLiteralSearch option) to compare strings.
Don't use -isEqual: or -isEqualToString:, because they do a literal comparison,
which treats precomposed and decomposed forms of the same character as unequal.

Regards,
Ken