On Sep 28, 2008, at 1:17 PM, David Niemeijer wrote:

Michael,

On 28 sep 2008, at 14:41, Michael Gardner wrote:
Upon further investigation, I may be wrong. I based my assertion upon Apple's NSString documentation ("Returns the number of Unicode characters in the receiver"), and upon some quick tests I ran. But this reply made me look into the issue in greater depth.

I redid my tests more thoroughly, and it does appear that -length returns the number of 16-bit words (code units), not the number of Unicode characters (code points), in the string. If this is true, I would call it a bug either in the code or in the documentation, which David should submit to Apple.

I think the docs are clear. In the discussion section for "length" it says: "The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear."

But composed character sequences aren't the problem; surrogate pairs are. Composed character sequences can be taken care of by using either -precomposedStringWithCanonicalMapping or -precomposedStringWithCompatibilityMapping. In my opinion, -length should take surrogate pairs into account, which is what the docs seem to imply.
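
To make the code unit versus code point distinction concrete, here is a minimal sketch in Python rather than Cocoa, so it says nothing about how NSString is actually implemented. The helper name utf16_code_units is made up for this illustration. It assumes Python 3, where len() on a str counts code points, and it uses NFC normalization as a stand-in for what -precomposedStringWithCanonicalMapping does.

```python
import unicodedata


def utf16_code_units(s: str) -> int:
    """Count the 16-bit code units needed to encode s as UTF-16.

    Hypothetical helper for illustration only; each code unit is 2 bytes,
    and "utf-16-le" is used so that no byte-order mark is prepended.
    """
    return len(s.encode("utf-16-le")) // 2


# A character outside the Basic Multilingual Plane needs a surrogate pair.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
print(len(clef), utf16_code_units(clef))  # 1 code point, 2 code units

# A composed character sequence: "e" followed by COMBINING ACUTE ACCENT.
decomposed = "e\u0301"
print(len(decomposed), utf16_code_units(decomposed))  # 2 code points, 2 code units

# Canonical precomposition (NFC) folds the sequence into U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
print(len(composed), utf16_code_units(composed))  # 1 code point, 1 code unit

# Precomposition does nothing for the surrogate pair case.
print(utf16_code_units(unicodedata.normalize("NFC", clef)))  # still 2
```

The last line is the point of my objection: normalizing can shrink a composed character sequence, but a character outside the BMP still occupies two code units however it is normalized.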

-Michael