Eric,

More details about this one:

> > checking whether mbrtowc has a correct return value... no
>
> This is the first one the Cygwin developers should take care of.
> The test case is in m4/mbrtowc.m4 lines 260..296.
>
> > ../../gltests/test-wcrtomb.c:51: assertion failed
> > ../../gltests/test-wcrtomb.sh: line 25: 9880 Aborted (core dumped) LC_ALL=$LOCALE_JA ./test-wcrtomb${EXEEXT} 3
> > FAIL: test-wcrtomb.sh
>
> Looks like a problem with Cygwin's EUC-JP decoder.
>
> > ../../gltests/test-mbrtowc.c:240: assertion failed
> > ../../gltests/test-mbrtowc3.sh: line 15: 19980 Aborted (core dumped) LC_ALL=$LOCALE_JA ./test-mbrtowc${EXEEXT} 3
> > FAIL: test-mbrtowc3.sh
>
> Likewise.

The setlocale function [1] installs function pointers to functions
__eucjp_wctomb and __eucjp_mbtowc, which are implemented in [2] and [3].

A second bug in Cygwin's EUC-JP decoder is that in [3] there is a comment
"Cygwin defines its own doublebyte charset conversion functions because the
underlying OS requires wchar_t == UTF-16." (and yes indeed, on Windows, the
system calls expect UTF-16 encoded content in wchar_t* strings), but the
__eucjp_wctomb and __eucjp_mbtowc functions return 16 bits combined from the
two 8-bit values of the multibyte character - without any conversion between
JISX0208/JISX0212 and Unicode.

These functions need to be completely rewritten.

Bruno

[1] http://cygwin.com/cgi-bin/cvsweb.cgi/src/newlib/libc/locale/locale.c?rev=1.22&content-type=text/x-cvsweb-markup&cvsroot=src
[2] http://cygwin.com/cgi-bin/cvsweb.cgi/src/newlib/libc/stdlib/wctomb_r.c?rev=1.14&content-type=text/x-cvsweb-markup&cvsroot=src
[3] http://cygwin.com/cgi-bin/cvsweb.cgi/src/newlib/libc/stdlib/mbtowc_r.c?rev=1.13&content-type=text/x-cvsweb-markup&cvsroot=src
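
To illustrate the missing JISX0208-to-Unicode step, here is a minimal sketch. It is not the newlib code: the function names, the three-entry table, and the exact way the two bytes are combined (high byte shifted left by 8, OR-ed with the low byte) are assumptions made for illustration. The byte pairs are the EUC-JP encodings of the three characters of "nihongo" and their Unicode code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed form of the behaviour described above: the two EUC-JP bytes
   are glued together into 16 bits, with no table lookup at all. */
static uint16_t eucjp_naive_combine(unsigned char b1, unsigned char b2)
{
    return (uint16_t)(((unsigned)b1 << 8) | b2);
}

/* Hypothetical three-entry excerpt of a JIS X 0208 -> Unicode table,
   keyed by the EUC-JP byte pair. A real decoder needs the full mapping. */
struct eucjp_map_entry {
    uint16_t eucjp;    /* the two EUC-JP bytes, high byte first */
    uint16_t unicode;  /* the corresponding UTF-16 code unit */
};

static const struct eucjp_map_entry eucjp_sample_map[] = {
    { 0xC6FC, 0x65E5 },
    { 0xCBDC, 0x672C },
    { 0xB8EC, 0x8A9E },
};

/* Returns the UTF-16 code unit for a two-byte EUC-JP character, or 0
   if the pair is not in the sample table. */
static uint16_t eucjp_lookup_utf16(unsigned char b1, unsigned char b2)
{
    uint16_t key = eucjp_naive_combine(b1, b2);
    size_t i;

    for (i = 0; i < sizeof eucjp_sample_map / sizeof eucjp_sample_map[0]; i++)
        if (eucjp_sample_map[i].eucjp == key)
            return eucjp_sample_map[i].unicode;
    return 0;
}
```

For the byte pair 0xC6 0xFC, the naive combination yields 0xC6FC, whereas a wchar_t that is supposed to hold UTF-16 would need 0x65E5. A value like 0xC6FC handed to a Windows system call would be interpreted as an unrelated character.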