On Mar 21, 2013, at 2:34 AM, Jean-Daniel Dupas <devli...@shadowlab.org> wrote:

> 
> On 21 March 2013, at 09:27, Luca Ciciriello <luca_cicirie...@hotmail.com> 
> wrote:
> 
>> Hi all.
>> I'm using some Objective-C++ modules in my iOS project. Here I have some 
>> conversions from NSString to C++11 std::string. After this conversion I 
>> found (correctly) some 2-byte characters in my std::string. 
>> My question is: How can I count the number of chars and not the number of 
>> bytes in my std::string?
>> 
> 
> Don't use std::string to store Unicode strings. It is not designed to support 
> such content.
> 
> You can use std::wstring instead.


Actually, std::string works *just fine* for UTF-8 strings.

It's just that, in Unicode, 1 character doesn't necessarily fit in 1 byte.  
Also, you can't easily do truncation of strings (you might be truncating the 
string in the middle of a multi-byte sequence -- which is true in pretty much 
every encoding except UCS-4).
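
To make the truncation point concrete, here is a minimal sketch of a 
byte-limited truncation that never cuts through a multi-byte sequence.  The 
function name utf8_truncate is made up for illustration, and the code assumes 
the std::string already holds well-formed UTF-8; it does no validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns at most max_bytes bytes of s, cut on a
// UTF-8 sequence boundary. Assumes s is well-formed UTF-8.
std::string utf8_truncate(const std::string& s, std::size_t max_bytes) {
    if (s.size() <= max_bytes) {
        return s;  // nothing to cut
    }
    std::size_t cut = max_bytes;
    // A byte of the form 10xxxxxx is a continuation byte. If the first
    // dropped byte is one, the cut would split a sequence, so back up
    // until the first dropped byte is a lead byte or an ASCII byte.
    while (cut > 0 &&
           (static_cast<unsigned char>(s[cut]) & 0xC0) == 0x80) {
        --cut;
    }
    return s.substr(0, cut);
}
```

Note that this only protects encoded code points.  A base letter followed by a 
combining accent is two code points, and this sketch can still separate them.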

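UTF-8 is relatively easy to work with, however.  Every byte tells you what it 
is: a byte with the high bit clear is a single-byte (ASCII) character, a byte 
whose top two bits are 10 is a continuation byte, and a byte whose top two bits 
are 11 starts a multi-byte sequence.  To find the start of the current 
character, go back while the byte you're on is a continuation byte; the first 
byte that isn't one is the start.  Counting characters (code points) is then 
just counting the bytes that are not continuation bytes.  Of course, that 
"go back" doesn't mean anything if you're already at the first byte in your 
string...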
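
As a rough sketch of the original question (counting characters rather than 
bytes), something like the following should work.  The names 
is_utf8_continuation, utf8_length and utf8_char_start are invented for this 
example, and the code assumes the string is well-formed UTF-8.  It counts code 
points, which is not always the same as what a user sees as one character 
(combining marks count separately).

```cpp
#include <cstddef>
#include <string>

// True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_utf8_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Number of code points in s: every code point contributes exactly one
// byte that is not a continuation byte (its ASCII byte or its lead byte).
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char b : s) {
        if (!is_utf8_continuation(b)) {
            ++count;
        }
    }
    return count;
}

// Index of the first byte of the code point containing byte `pos`.
// Requires pos < s.size(). Stops at index 0, since there is nothing
// further back to look at.
std::size_t utf8_char_start(const std::string& s, std::size_t pos) {
    while (pos > 0 && is_utf8_continuation(static_cast<unsigned char>(s[pos]))) {
        --pos;
    }
    return pos;
}
```

For example, the string "caf\xC3\xA9" ("café" in UTF-8) has size() == 5, but 
utf8_length returns 4.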

-- 
Glenn L. Austin, Computer Wizard and Race Car Driver         <><
<http://www.austin-soft.com>


_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
