On 21 Mar, 2013, at 5:07 PM, Luca Ciciriello <[email protected]> 
wrote:

> An example of a string is the Italian word "più". 
> Here I can visually count 3 chars: p, i and ù. But if I use "più".size() the 
> result is 4
> 
> std::string ppp = "più";
> size_t sss = ppp.size();
> 
> here sss is 4.
> 
> L.
> 

This isn't really a Cocoa question, it's a C++ one. If you were asking about 
the original NSString, there are plenty of Cocoa methods for this. 

The size of a std::string in C++ is in bytes, so the result depends on what 
encoding your string is in, which in turn depends on how you're converting from 
NSString to C++, which I don't think you said. 

My guess would be that it's UTF-8: getting 1 byte for most characters and 
2 bytes for ù is what UTF-8 produces. What's the actual hex of those 4 bytes 
for that string? I think the encoding is 0x70 0x69 0xC3 0xB9. 
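
To check the guess, you could dump the bytes yourself. This is only a rough 
sketch: hex_bytes is a name I made up, not a library function, and it assumes 
nothing about the encoding, it just prints whatever bytes the std::string holds.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not a standard function): format every byte of s
// as two uppercase hex digits, separated by single spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

If the string really is UTF-8 with the single-character ù, hex_bytes(ppp) 
should come back as "70 69 C3 B9". 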

If it is UTF-8 then it's not so hard: look at the encoding details for UTF-8 
and inspect the top bit or bits of each byte to determine whether it's a whole 
character, the start of a multibyte sequence, or a continuation of one. I would 
imagine there are C++ functions to do that. Also remember there is more than 
one way to produce a ù: one is the single precomposed Unicode character, 
another is a 'u' followed by a combining diacritic mark. The second form makes 
that string 4 Unicode characters and, in UTF-8, 5 bytes. 
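
Here is a minimal sketch of the top-bits check, assuming the string is valid 
UTF-8. utf8_code_points is a hypothetical name of mine, not something from the 
standard library. It counts Unicode code points, not what the eye sees as 
characters, so the combining-mark form of "più" still counts as 4.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in s.
// Assumes s is valid UTF-8; malformed input is not detected.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        // Continuation bytes have the bit pattern 10xxxxxx.
        // Any other byte is either ASCII or the start of a multibyte sequence.
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

With the precomposed ù ("pi\xC3\xB9", 4 bytes) this gives 3. With 'u' plus 
the combining grave accent U+0300 ("piu\xCC\x80", 5 bytes) it gives 4, so 
getting the visual count of 3 in that case would need normalization or 
grapheme handling on top of this. 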