On 06/07/2012 08:13 AM, Paolo Bonzini wrote:
> On 07/06/2012 14:50, Eric Blake wrote:
>>>> The fix could be to have two different locale_charset() functions,
>>>> one that returns "US-ASCII" and another one that returns "UTF-8".
>>>> The first one to be used when MB_CUR_MAX and mbrtowc() are used as
>>>> well, the second one to be used by gettext(). But the separation
>>>> line between the two cases is not yet clear to me. Any insights?
>
> The separation line is what you wrote: whether you'll use the text
> simply for presentation, or whether you'll process it before. But
> alternatively, we might try a variant of what Eric has suggested...
>
>> On OS X, can we wrap MB_CUR_MAX to pretend to be 1 when in the "C"
>> locale, to match what cygwin did in distinguishing between 'C' and
>> 'C.UTF-8'?
>
> ... which is to wrap MB_CUR_MAX and pretend that it is 3.

Actually, MB_CUR_MAX of UTF-8 is 6, thanks to surrogate pairs.

-- 
Eric Blake   [email protected]    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
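
For illustration only, the idea of wrapping MB_CUR_MAX so that it pretends to be 1 in the "C" locale could be sketched roughly as follows. This is a minimal sketch, not gnulib code: the names effective_mb_cur_max() and current_effective_mb_cur_max() are hypothetical, and comparing the locale name against "C" and "POSIX" is an assumption about how the "C" locale would be detected. The value that MB_CUR_MAX itself reports for a UTF-8 locale is platform-dependent, so the sketch does not hard-code one.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper (not a gnulib or libc function): report 1 when
   the given locale name is "C" or "POSIX", otherwise defer to the
   platform's MB_CUR_MAX for the currently active locale.  */
static size_t
effective_mb_cur_max (const char *locale_name)
{
  size_t n;

  if (locale_name != NULL
      && (strcmp (locale_name, "C") == 0
          || strcmp (locale_name, "POSIX") == 0))
    return 1;

  n = MB_CUR_MAX;
  assert (n >= 1);  /* ISO C guarantees a positive value.  */
  return n;
}

/* Hypothetical convenience: apply the wrapper to the current LC_CTYPE
   locale.  setlocale with a NULL locale argument only queries the
   current setting; it does not change it.  */
static size_t
current_effective_mb_cur_max (void)
{
  return effective_mb_cur_max (setlocale (LC_CTYPE, NULL));
}
```

A caller that sizes multibyte buffers would use current_effective_mb_cur_max() in place of MB_CUR_MAX. Whether this is the right place to draw the line between presentation and processing is exactly the open question in this thread.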
