Hi Eric,

> I was asking:
>
> should wwchar_t (or xwchar_t, but not xchar_t) be 2-bytes on cygwin, but
> unlike the POSIX definition of wchar_t being always 1 character per
> unit, the new type is explicitly documented as being multi-unit on some
> platforms but with sane semantics
>
> or should it always be 4-bytes, where conversion from wchar_t to
> wwchar_t requires some efforts, and where the new type must be used
> everywhere (which means wrapping a lot of APIs), but where you can once
> again assume POSIX semantics of 1 character per unit, simplifying life
> of callers at the expense of converting to the new type

In the first case we wouldn't need a new type. The plan is the second
alternative. The goal is *not* to have to extend each of quotearg.c,
regcomp.c, mbchar.h, wc.c, etc. to handle UTF-16 explicitly with #ifdefs,
more variables, and more logic.

> if it works out, should we also add wwchar_t natively into cygwin?

More and more Unix platforms offer only UTF-8 locales. One can predict that
in 10 years, all Unix platforms will offer only UTF-8 locales. At that point
wchar_t will be UCS-4 on all these platforms (except AIX). The mbrtoc32
function from the C1X API that you pointed to will then be equivalent to
mbrtowwc. So, you can view 'wwchar_t' as a temporary measure that will bridge
the gap between the ANSI C Amd. 1 API and the C1X API.

Bruno
-- 
In memoriam Carl Friedrich Goerdeler <http://en.wikipedia.org/wiki/Carl_Friedrich_Goerdeler>
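
For illustration only, here is a minimal sketch of the conversion step that
the second alternative implies on a platform whose wchar_t holds UTF-16
units. The names wwchar_sketch_t and utf16_to_ucs4_sketch are hypothetical
and not part of any proposed API. uint16_t stands in for a 2-byte wchar_t,
and replacing unpaired surrogates with U+FFFD is an assumption of this
sketch, not a decision made in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed 4-byte type. */
typedef uint32_t wwchar_sketch_t;

/* Convert n UTF-16 units from src into one code point per element of dst.
   dst must have room for at least n elements.
   Returns the number of code points written. */
size_t
utf16_to_ucs4_sketch (const uint16_t *src, size_t n, wwchar_sketch_t *dst)
{
  size_t out = 0;

  assert (src != NULL && dst != NULL);
  for (size_t i = 0; i < n; i++)
    {
      uint16_t u = src[i];

      if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n
          && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF)
        {
          /* Valid surrogate pair: combine into one code point. */
          dst[out++] = 0x10000u
                       + ((uint32_t) (u - 0xD800) << 10)
                       + (uint32_t) (src[i + 1] - 0xDC00);
          i++; /* The low surrogate has been consumed as well. */
        }
      else if (u >= 0xD800 && u <= 0xDFFF)
        {
          /* Unpaired surrogate: assumption, map to U+FFFD. */
          dst[out++] = 0xFFFDu;
        }
      else
        {
          /* Basic Multilingual Plane: one unit is one character. */
          dst[out++] = u;
        }
    }
  return out;
}
```

After such a conversion, callers can again assume one character per unit,
which is what makes the per-file UTF-16 special cases unnecessary.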