OK, you are right. Copy+paste didn't preserve the compatibility character (U+FA0A arrived here as U+898B). It does look like a bug of sorts, or at least something a Unicode expert should explain.
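One plausible explanation for the copy+paste difference, offered as a guess rather than a diagnosis: U+FA0A is a CJK compatibility ideograph whose canonical decomposition is U+898B, so any step that applies Unicode normalization replaces it. Whether the pasteboard or the mail client actually normalizes is an assumption. The sketch below is in Python rather than Objective-C only so that it runs anywhere; `code_points` is a hypothetical helper that mimics the `%#06x` dump in the quoted method.

```python
import unicodedata


def code_points(s):
    """Return each code point of s as a hex string, e.g. '0x89c1'."""
    return [f"{ord(ch):#06x}" for ch in s]


# The search target from the thread, written with escapes so that
# no editor or mail client can silently alter it.
original = "\u89c1=\ufa0a\u898b"

# NFC applies canonical decomposition first; U+FA0A maps to U+898B
# and is never recomposed, so the compatibility character is lost.
normalized = unicodedata.normalize("NFC", original)

print(code_points(original))
print(code_points(normalized))
```

If the two printed lists differ in the third position, a normalizing copy step would be enough to turn the failing test case into one that returns NSNotFound.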
On Mon, Dec 9, 2013 at 3:20 AM, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:

> On 9 Dec 2013, at 16:00, Stephen J. Butler <stephen.but...@gmail.com> wrote:
>
> > I don't get the same result. 10.9.0, Xcode 5.0.2. I created an empty command line utility, copied the code, and I get NSNotFound.
> >
> > 2013-12-09 02:50:19.822 Test[73850:303] main "见≠見" (3 shorts) occurs in "见=見見" (4 shorts) at {9223372036854775807, 0}
>
> Copying might invoke another bug.
> Better check the characters, like:
>
> - (void)printString: (NSString *)line
> {
>     NSLog(@"%s \"%@\" has characters:",__FUNCTION__, line);
>
>     [ line enumerateSubstringsInRange: NSMakeRange( 0, [ line length ] )
>                               options: NSStringEnumerationByComposedCharacterSequences
>                            usingBlock: ^(NSString *currChar, NSRange currCharRange, NSRange enclosingRange, BOOL *stop)
>         {
>             (void)enclosingRange;
>             (void)stop;
>
>             #ifdef __LITTLE_ENDIAN__
>             NSStringEncoding encoding = NSUTF32LittleEndianStringEncoding;
>             #else
>             NSStringEncoding encoding = NSUTF32BigEndianStringEncoding;
>             #endif
>             NSData *data = [ currChar dataUsingEncoding: encoding ];
>
>             NSUInteger nbrBytes = [ data length ];
>             NSUInteger nbrChars = nbrBytes / sizeof(unsigned int);
>
>             if ( nbrChars * sizeof(unsigned int) != nbrBytes ) // error
>             {
>                 NSLog(@"%s Error: strange nbr of bytes %lu",__FUNCTION__, nbrBytes);
>                 return;
>             };
>
>             unsigned int codePoint[nbrChars];
>             [ data getBytes: &codePoint length: nbrBytes ];
>
>             NSMutableString *s = [ NSMutableString stringWithFormat: @"%@ = ",
>                                     NSStringFromRange(currCharRange)
>                                  ];
>             for( NSUInteger i = 0; i < nbrChars; i++ )
>             {
>                 [ s appendFormat: @"%#06x ", codePoint[i] ];
>             };
>
>             [ s appendFormat: @"= \"%@\"", currChar ];
>
>             fprintf(stderr, "%s\n", [ s UTF8String]);
>         }
>     ];
> }
>
> and check for:
> "见=見見" has characters:
> {0, 1} = 0x89c1 = "见"
> {1, 1} = 0x003d = "="
> {2, 1} = 0xfa0a = "見"
> {3, 1} = 0x898b = "見"
> "见≠見" has characters:
> {0, 1} = 0x89c1 = "见"
> {1, 1} = 0x2260 = "≠"
> {2, 1} = 0x898b = "見"
>
> > On Mon, Dec 9, 2013 at 2:43 AM, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
> >
> > On 9 Dec 2013, at 15:05, Quincey Morris <quinceymor...@rivergatesoftware.com> wrote:
> >
> > > On Dec 8, 2013, at 23:46 , Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
> > >
> > >> NSString *b = @"见≠見"; // 0x89c1 0x2260 0x898b
> > >
> > > So what are the results with:
> > >
> > >> NSString *b = @"见”;
> > >> NSString *b = @"≠”;
> > >> NSString *b = @"見”;
> > > ?
> > >
> > > Does specifying an explicit locale make any difference?
> >
> > Explicit specifying en_US (as probably the best tested and debugged) makes no difference.