Re: NSString's handling of Unicode extension B (and C) characters

2009-11-07 Thread John Engelhart
On Sat, Nov 7, 2009 at 11:01 AM, Alastair Houghton <alast...@alastairs-place.net> wrote:
> On 7 Nov 2009, at 14:17, Ryan Homer wrote:
> > On 2009-11-06, at 12:42 PM, Clark Cox wrote:
>>
>> Is "ü" a single character, or two characters?
>>>
>>
>> When you define a string using ü, isn't it stored

Re: NSString's handling of Unicode extension B (and C) characters

2009-11-07 Thread Clark Cox
On Sat, Nov 7, 2009 at 6:17 AM, Ryan Homer wrote:
> [SOLVED]
>
> On 2009-11-06, at 12:42 PM, Clark Cox wrote:
>
>> On Fri, Nov 6, 2009 at 5:22 AM, Ryan Homer wrote:
>>>
>>> On 2009-11-05, at 1:42 PM, Clark Cox wrote:
>>>
>>> Yes. I am importing characters from a text file and need to process them

Re: NSString's handling of Unicode extension B (and C) characters

2009-11-07 Thread Alastair Houghton
On 7 Nov 2009, at 14:17, Ryan Homer wrote: On 2009-11-06, at 12:42 PM, Clark Cox wrote: Is "ü" a single character, or two characters? When you define a string using ü, isn't it stored internally as one UTF-16 code unit (not sure if I'm using the notation correctly), represented as U+00FC

Re: NSString's handling of Unicode extension B (and C) characters

2009-11-07 Thread Ryan Homer
[SOLVED] On 2009-11-06, at 12:42 PM, Clark Cox wrote: On Fri, Nov 6, 2009 at 5:22 AM, Ryan Homer wrote: On 2009-11-05, at 1:42 PM, Clark Cox wrote: Yes. I am importing characters from a text file and need to process them in a certain way. A word may have an alternate form which is denoted

Re: NSString's handling of Unicode extension B (and C) characters

2009-11-05 Thread Douglas Davidson
On Nov 5, 2009, at 10:42 AM, Clark Cox wrote: You don't even have to involve characters outside of the basic multilingual plane for this to be an issue. Take, for example, the string "müssen" (i.e. the verb "must" in German). There are two ways of representing this string, one of which will hav
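
The two representations of "müssen" mentioned in this message can be illustrated with a small sketch. It is written in Python rather than Objective-C purely for convenience, and the helper `utf16_units` is a hypothetical name introduced here, not something from the thread. It counts UTF-16 code units, which is the unit NSString's `-length` reports.

```python
import unicodedata

# Two spellings of the same word; both render as "müssen".
precomposed = "m\u00fcssen"   # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "mu\u0308ssen"   # "u" followed by U+0308 COMBINING DIAERESIS


def utf16_units(text: str) -> int:
    """Count UTF-16 code units (hypothetical helper for illustration)."""
    return len(text.encode("utf-16-le")) // 2


print(utf16_units(precomposed))   # 6
print(utf16_units(decomposed))    # 7
print(precomposed == decomposed)  # False: different code point sequences

# Normalizing to NFC composes "u" + U+0308 into U+00FC.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# A CJK Extension B character lies outside the BMP, so it needs a surrogate pair.
print(utf16_units("\U00020000"))  # 2
```

So a length difference can come from combining sequences as well as from surrogate pairs, and comparing strings reliably usually means normalizing both sides first.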

Re: NSString's handling of Unicode extension B (and C) characters

2009-11-05 Thread Clark Cox
On Thu, Nov 5, 2009 at 8:04 AM, Ryan Homer wrote:
> Actually,
>
> That was a bad example since \u only allows up to 4 digits, so the string
> was in fact a length of 3 characters, the character '5' being the 3rd.
> However, the issue still seems to exist.
>
> I have the actual characters in a text