On 9 Dec 2013, at 14:51, Igor Elland <igor.ell...@me.com> wrote:

> Are you taking into account that 见, ≠, and 見 are composed character sequences,
> not individual unichars?
> 

This method:

- (void)printString: (NSString *)line
{
        NSLog(@"%s \"%@\" has characters:", __FUNCTION__, line);

        [ line enumerateSubstringsInRange: NSMakeRange( 0, [ line length ] )
                options: NSStringEnumerationByComposedCharacterSequences
                usingBlock: ^(NSString *currChar, NSRange currCharRange, NSRange enclosingRange, BOOL *stop)
                {
                        (void)enclosingRange;
                        (void)stop;
                        unichar u = [ currChar characterAtIndex: 0 ];
                        NSString *s = [ NSString stringWithFormat: @"%@ = %#06x = \"%@\"",
                                        NSStringFromRange(currCharRange), u, currChar ];
                        fprintf(stderr, "%s\n", [ s UTF8String ]);
                }
        ];
}

prints:

 "见=見見" has characters:
{0, 1} = 0x89c1 = "见"
{1, 1} = 0x003d = "="
{2, 1} = 0xfa0a = "見"
{3, 1} = 0x898b = "見"

 "见≠見" has characters:
{0, 1} = 0x89c1 = "见"
{1, 1} = 0x2260 = "≠"
{2, 1} = 0x898b = "見"

which shows (to my limited understanding) that there are NO multi-unichar
composed character sequences: every range has length 1.
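
As a cross-check, here is a minimal sketch in plain C (it should also compile
as Objective-C) of what I would expect a purely literal, unit-by-unit search
to do with the code units printed above. The helper find_units and the array
names haystack and needle are made up for illustration; this is an assumption
about what "literal" means, not a description of how rangeOfString: works.

```c
#include <stddef.h>
#include <stdint.h>

/* UTF-16 code units copied from the output above. */
static const uint16_t haystack[] = { 0x89c1, 0x003d, 0xfa0a, 0x898b }; /* "见=見見" */
static const uint16_t needle[]   = { 0x89c1, 0x2260, 0x898b };         /* "见≠見" */

/* Hypothetical helper: index of the first literal occurrence of ndl
   in hay, comparing code unit by code unit, or -1 if there is none. */
static long find_units(const uint16_t *hay, size_t hayLen,
                       const uint16_t *ndl, size_t ndlLen)
{
    if (ndlLen == 0 || ndlLen > hayLen) return -1;
    for (size_t i = 0; i + ndlLen <= hayLen; i++) {
        size_t j = 0;
        while (j < ndlLen && hay[i + j] == ndl[j]) j++;
        if (j == ndlLen) return (long)i;   /* all units matched at offset i */
    }
    return -1;
}
```

Called as find_units(haystack, 4, needle, 3) this returns -1, because 0x2260
never equals 0x003d. So the {0, 4} match cannot come from a literal
comparison of the unichars.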

Kind regards,

Gerriet.


> On 09 Dec 2013, at 08:46, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
> 
>> In 10.9.0, Xcode 5.0.2 I added these lines to applicationDidFinishLaunching:
>> 
>> NSString *a = @"见=見見";       //      0x89c1  0x3d    0xfa0a  0x898b
>> NSString *b = @"见≠見";                //      0x89c1  0x2260  0x898b
>> NSRange aRange = [ a rangeOfString: b ];
>> NSLog(@"%s \"%@\" (%lu shorts) occurs in \"%@\" (%lu shorts) at %@", __FUNCTION__,
>>              b, [b length], a, [a length], NSStringFromRange(aRange));
>> 
>> And was told: 
>> "见≠見" (3 shorts) occurs in "见=見見" (4 shorts) at {0, 4}
>> 
>> This is somewhat unexpected.
>> 
>> What am I doing wrong?
>> 
>> Or are my expectations wrong?
>> I would (maybe naively) expect that the shorter string could at most occur 
>> at {0,3}.
>> I would also expect that ≠ is NOT quite the same as =.
>> But then, I was wrong before.
>> 
>> Gerriet.


_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
