On 2025-8-9 10:09, Kirill Makurin wrote:
To implement the second part for UCRT, we would need to call these functions when converting from SBCS or DBCS code pages. However, we cannot use these functions to convert from/to UTF-8. Some time ago, I mentioned in another thread that the return value of the CRT's `mbrlen` and `mbrtowc` is correct when converting from UTF-8. This means we can call the replacements when `MB_CUR_MAX <= 2` (SBCS and DBCS code pages), and call the CRT's versions otherwise (I think it is safe to assume UTF-8 when `MB_CUR_MAX > 2`).
While the default code page for Simplified Chinese is 936 (GBK, extended from GB 2312), it's possible to configure GB 18030 by calling `setlocale(LC_ALL, ".54936")`. The difference is that `MB_CUR_MAX` is 2 in GBK, but is 4 in GB 18030.
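To make the concern concrete, here is a minimal sketch contrasting the proposed `MB_CUR_MAX` test with a test on the code page itself. The function and constant names are hypothetical and only for illustration; where the code page number would come from at run time (for example the CRT's `___lc_codepage_func()`) is an assumption and is not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Windows code page identifiers mentioned in this thread. */
enum {
    CODE_PAGE_GBK     = 936,
    CODE_PAGE_GB18030 = 54936,
    CODE_PAGE_UTF8    = 65001
};

/* Heuristic as proposed: anything wider than DBCS is taken to be UTF-8,
 * so the CRT's own mbrlen/mbrtowc would be called. */
static bool use_crt_by_mb_cur_max(size_t mb_cur_max)
{
    return mb_cur_max > 2;
}

/* Hypothetical alternative: only defer to the CRT when the locale's
 * code page really is UTF-8. */
static bool use_crt_by_code_page(unsigned code_page)
{
    return code_page == CODE_PAGE_UTF8;
}
```

With GB 18030, `use_crt_by_mb_cur_max(4)` is true although the code page is not UTF-8, whereas `use_crt_by_code_page(CODE_PAGE_GB18030)` is false.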
When I tried this call with an ancient Microsoft compiler from the Windows Server 2003 SP1 Platform SDK -- which, by the way, can still be downloaded for free [1] -- it mostly seemed to work: there was no name for this locale, but the string returned by `setlocale()` indicated that it was using code page 54936 for everything, and `MB_CUR_MAX` yielded 4. I didn't check whether the conversion routines would work correctly.
However, using the newest Microsoft compiler with UCRT, `setlocale(LC_ALL, ".54936")` returns a null pointer, as if it had failed. Despite that, after the call, `MB_CUR_MAX` yields 4, and further attempts to query the current locale with `setlocale(LC_ALL, NULL)` also return a null pointer.
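A small probe along these lines can be used to repeat the observation. This is only a sketch: `probe_locale` and `struct locale_probe` are hypothetical names, and the results depend entirely on the CRT it is linked against.

```c
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* What the CRT reports around one setlocale() call. Only booleans are
 * stored, because the string returned by setlocale() may be invalidated
 * by a later call. */
struct locale_probe {
    bool   set_ok;      /* setlocale(LC_ALL, name) returned non-null  */
    bool   query_ok;    /* setlocale(LC_ALL, NULL) returned non-null  */
    size_t mb_cur_max;  /* MB_CUR_MAX observed after the first call   */
};

static struct locale_probe probe_locale(const char *name)
{
    struct locale_probe result;

    result.set_ok     = setlocale(LC_ALL, name) != NULL;
    result.mb_cur_max = (size_t) MB_CUR_MAX;
    result.query_ok   = setlocale(LC_ALL, NULL) != NULL;
    return result;
}
```

Calling `probe_locale(".54936")` and printing the three fields should show whether a given CRT behaves as described here.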
Given the complexity of a reimplementation, and the fact that both MSVCRT and UCRT have bugs in these functions, it's a little questionable whether we should replace them or maintain bug-for-bug compatibility with UCRT.
[1] https://www.microsoft.com/en-us/download/details.aspx?id=15656
I am not sure what would be a clean way to do it. We need to call `LoadLibrary` to obtain UCRT's handle and call `GetProcAddress` to get the addresses of the CRT's versions so we can call them conditionally. Would it be appropriate to define a function with `__attribute__((constructor))` to achieve this, or is there a better approach? Also, would it be safe to call `LoadLibrary` from such a function?
See 'msvcrt_or_emu_glue.h' for how the others are done, but it really only works for MSVCRT.DLL because the DLL name is hard-coded.
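As one possible alternative to a constructor, the address could be resolved lazily on first use. The following is a rough sketch, not taken from 'msvcrt_or_emu_glue.h'. It assumes the UCRT DLL is already loaded under the name `ucrtbase.dll` and exports `mbrlen`; `resolve_crt_mbrlen` and `call_crt_mbrlen` are hypothetical names. It is not thread-safe as written.

```c
#include <stddef.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

typedef size_t (*mbrlen_fn)(const char *, size_t, mbstate_t *);

/* Look up the CRT's own mbrlen. GetModuleHandleA does not load anything;
 * it only finds a module that is already mapped into the process. */
static mbrlen_fn resolve_crt_mbrlen(void)
{
#ifdef _WIN32
    HMODULE crt = GetModuleHandleA("ucrtbase.dll"); /* assumed DLL name */
    if (crt != NULL) {
        FARPROC proc = GetProcAddress(crt, "mbrlen");
        if (proc != NULL)
            return (mbrlen_fn)(void (*)(void)) proc;
    }
#endif
    /* Fallback for this sketch only: the mbrlen this file links against.
     * A real replacement named mbrlen could not fall back to itself. */
    return mbrlen;
}

/* Resolve once, on first use, then forward every call. */
static size_t call_crt_mbrlen(const char *s, size_t n, mbstate_t *ps)
{
    static mbrlen_fn cached;

    if (cached == NULL)
        cached = resolve_crt_mbrlen();
    return cached(s, n, ps);
}
```

Resolving on first call avoids doing any work during DLL initialization, at the cost of a null check per call.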
-- Best regards, LIU Hao
