Rich Felker dixit:

>> >Wouldn't a 16-bit wchar_t be non-standard-conform when using a UTF-8
>> >locale?
>>
>> Nope. UTF-8 is just an encoding for Unicode, and as long as I take
>> care to #define __STDC_ISO_10646__ 200009L (and no later date) this
>> is perfectly permissible.
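
For illustration, the BMP-only setup described in the quoted text could look roughly like the sketch below: a UTF-8 decoder that targets a 16-bit wide character and rejects anything that would not fit. This is a minimal sketch under stated assumptions, not the actual MirBSD code; the names bmp_wchar and bmp_decode are hypothetical and only stand in for a 16-bit wchar_t and its mbrtowc-style conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit wchar_t; not MirBSD's actual type. */
typedef uint16_t bmp_wchar;

/*
 * Decode one UTF-8 sequence from s (at most n bytes) into *out.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated, malformed, or encodes something outside the BMP.
 * (Illustrative helper; the name and interface are assumptions.)
 */
static size_t
bmp_decode(const unsigned char *s, size_t n, bmp_wchar *out)
{
	assert(s != NULL && out != NULL);

	if (n == 0)
		return 0;
	if (s[0] < 0x80) {			/* 1 byte: U+0000..U+007F */
		*out = s[0];
		return 1;
	}
	if (s[0] >= 0xC2 && s[0] <= 0xDF) {	/* 2 bytes: U+0080..U+07FF */
		if (n < 2 || (s[1] & 0xC0) != 0x80)
			return 0;
		*out = (bmp_wchar)(((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu));
		return 2;
	}
	if ((s[0] & 0xF0) == 0xE0) {		/* 3 bytes: U+0800..U+FFFF */
		unsigned int c;

		if (n < 3 || (s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
			return 0;
		c = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) |
		    (s[2] & 0x3Fu);
		if (c < 0x800 || (c >= 0xD800 && c <= 0xDFFF))
			return 0;		/* overlong form or surrogate */
		*out = (bmp_wchar)c;
		return 3;
	}
	/* 0x80..0xC1 are invalid lead bytes; 0xF0 and up leave the BMP. */
	return 0;
}
```

Rejecting four-byte sequences outright, instead of splitting them into surrogate pairs, is what keeps every stored value a plain ISO 10646 code point, which is the property the __STDC_ISO_10646__ macro is meant to promise.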

>This is only a possibility for implementations which only support the
>BMP (Basic Multilingual Plane, aka plane 0, of Unicode, covering
>Unicode Scalar Values in the range 0 to 65535).

Yes, exactly. That was my goal when choosing this. (But, as I said,
my suggestion wrt. handling of the encoding is not limited to 16 bit;
it’s perfectly possible to handle full 21-bit Unicode/ISO-10646 with
it, just my code was only written to support 16-bit BMP.)

>It's fundamentally impossible in the C language to support UTF-8 with
>the full Unicode range as a locale's multibyte encoding when wchar_t
>is 16-bit

ACK. No complaints there. This is outside of the scope of MirBSD.
I will not use UTF-16 there, either.

>"support" the full Unicode range using UTF-16 for wchar_t and CESU-8
>for the multibyte encoding

Yikes, no!

>> This just means that your C locale cannot be strictly UTF-8. All
>> others can, but the C locale is precisely for this. This is because
>> the C locale is special like that.
>
>It's not special like that in any current or past issue of the
>standard, but the proposal here is to change it so it is special like
>that. I object to this change.

It’s not explicit yet, but ⓐ implied already (otherwise they would
not announce it like that, and IMHO it’s a non-change) and ⓑ common
current and expected behaviour, and, as such, sensible to require.
Plus, I showed you how it can be done.

bye,
//mirabilos
-- 
13:37⎜«Natureshadow» Deep inside, I hate mirabilos. I mean, he's a good
guy. But he's always right! In every fsckin' situation, he's right. Even
with his deeply perverted taste in software and borked ambition towards
broken OSes - in the end, he's damn right about it :(! […] works in mksh