Michael,

On 28 sep 2008, at 14:41, Michael Gardner wrote:
Upon further investigation, I may be wrong. I based my assertion upon Apple's NSString documentation ("Returns the number of Unicode characters in the receiver"), and upon some quick tests I ran. But this reply made me look into the issue in greater depth.

I re-did my tests more thoroughly, and it does appear that -length returns the number of 16-bit words (code units), not the number of Unicode characters (code points), in the string. If this is true, I would call it a bug either in the code or in the documentation, which David should submit to Apple.

I think the docs are clear. In the discussion section for "length" it says: "The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear."

I did file a bug (ID 6253075) as you suggested, because I think there should be a simple API for this.

I apologize for the apparent misinformation in my previous, hasty reply.

Well, I made an error too. I suggested that on 10.5 the CFStringTokenizer could be used, but only now noticed that it only supports larger units (words and up). Thus there is no easy API to count the number of characters in a way that treats surrogate pairs or other "long" Unicode characters as a single character.
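
As a rough illustration of what counting by hand over the 16-bit units could look like, here is a minimal sketch in plain C. The function name count_code_points is hypothetical and not part of any Apple API. It only folds surrogate pairs into one character; composed character sequences (a base character plus combining marks) would still be counted as several characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Apple API: counts Unicode code points in a
   buffer of UTF-16 code units. A high surrogate immediately followed by a
   low surrogate counts as one code point; an unpaired surrogate is counted
   as one on its own. Combining sequences are NOT merged. */
static size_t count_code_points(const uint16_t *units, size_t length)
{
    size_t count = 0;
    size_t i = 0;

    while (i < length) {
        uint16_t u = units[i];

        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < length &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            i += 2;   /* well-formed surrogate pair: two units, one code point */
        } else {
            i += 1;   /* BMP character or unpaired surrogate */
        }
        count++;
    }
    return count;
}
```

For the four units 0x0041, 0xD834, 0xDD1E, 0x0042 (an "A", one character outside the BMP, and a "B") this counts 3, where a count of 16-bit units gives 4.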

In the meantime, David, perhaps you can find a library that can work with UTF-8 strings. What are you using the length values for?

I need to be able to display the number of characters to the user in a way that makes sense to them. If they see 3, I should report 3. I also need it to cut off certain input at a given number of "real" characters, and it should not generate results that only make sense for a language like English, where every 16-bit unit equals a single character.

Using some kind of UTF-8 library may be possible, but that would require constantly converting between UTF-16 and UTF-8, which is not efficient for a program that has to do a lot of these kinds of calculations.
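
The cut-off case can in principle also stay in UTF-16 and avoid the conversion. A minimal sketch in plain C, under the same assumptions as any hand-rolled approach: the name truncate_to_code_points is hypothetical, it only guarantees that a surrogate pair is never split, and it does not keep combining marks together with their base character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Apple API: returns how many UTF-16 code units
   to keep so that the result holds at most max_points code points and never
   ends in the middle of a surrogate pair. */
static size_t truncate_to_code_points(const uint16_t *units, size_t length,
                                      size_t max_points)
{
    size_t i = 0;        /* index into the code units */
    size_t points = 0;   /* code points accepted so far */

    while (i < length && points < max_points) {
        int is_pair = units[i] >= 0xD800 && units[i] <= 0xDBFF &&
                      i + 1 < length &&
                      units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF;

        i += is_pair ? 2 : 1;   /* step over the whole pair, or one unit */
        points++;
    }
    return i;
}
```

The returned value is a length in 16-bit units, so it could be used directly as the cut position in a UTF-16 buffer.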

david.
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
