On 2025-8-9 10:09, Kirill Makurin wrote:
To implement the second part for UCRT, we would need to call these functions when 
converting from SBCS or DBCS code pages. However, we cannot use these functions to 
perform conversion from/to UTF-8. Some time ago, I mentioned in another thread 
that the return value of CRT's `mbrlen` and `mbrtowc` is correct when converting from 
UTF-8. This means we can call the replacements when MB_CUR_MAX <= 2 (SBCS and DBCS 
code pages), and call CRT's versions otherwise (I think it is safe to assume UTF-8 in 
case MB_CUR_MAX > 2).

While the default code page for Simplified Chinese is 936 (GBK, extended from GB 2312), it's possible to configure GB 18030 by calling `setlocale(LC_ALL, ".54936")`. The difference is that `MB_CUR_MAX` is 2 in GBK, but is 4 in GB 18030.
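
For reference, the dispatch proposed in the quoted text could look roughly like the sketch below. `emu_mbrlen`, `crt_mbrlen_ptr` and `dispatch_mbrlen` are hypothetical names, and the "replacement" here just forwards to the CRT so that the sketch compiles on its own. Note that the `MB_CUR_MAX > 2` branch is also taken for GB 18030, so it does not by itself prove the locale is UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical stand-in for the SBCS/DBCS replacement discussed above.
   It only forwards to the CRT so that this sketch is self-contained. */
static size_t emu_mbrlen(const char *s, size_t n, mbstate_t *ps)
{
    return mbrlen(s, n, ps);
}

/* Hypothetical pointer to the CRT's own mbrlen. A real build would
   resolve it from the CRT DLL instead of taking the address directly. */
static size_t (*crt_mbrlen_ptr)(const char *, size_t, mbstate_t *) = mbrlen;

/* Use the replacement for SBCS and DBCS code pages (MB_CUR_MAX <= 2),
   and the CRT's version otherwise. */
size_t dispatch_mbrlen(const char *s, size_t n, mbstate_t *ps)
{
    assert(ps != NULL); /* this sketch does not handle the internal-state case */
    if (MB_CUR_MAX <= 2)
        return emu_mbrlen(s, n, ps);
    return crt_mbrlen_ptr(s, n, ps);
}
```

A more careful check would have to look at the actual code page rather than `MB_CUR_MAX` alone.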

When I tried this with an ancient Microsoft compiler from Windows Server 2003 SP1 Platform SDK -- which btw can still be downloaded for free [1] -- it mostly seemed to work: there was no name for this locale, but the string that was returned by `setlocale()` indicated it was using code page 54936 for everything, and `MB_CUR_MAX` yielded 4. I didn't check whether conversion routines would work correctly.

However, with the newest Microsoft compiler and UCRT, `setlocale(LC_ALL, ".54936")` returns a null pointer, as if it had failed. Despite that, after the call, `MB_CUR_MAX` yields 4, and further attempts to query the current locale with `setlocale(LC_ALL, NULL)` also return a null pointer.
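
A small probe along these lines can be used to repeat the experiment. This is only a sketch; `probe_locale` is a hypothetical helper name, and what it prints depends on the CRT in use.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Request locale `name`, then print what setlocale() returned, what a
   follow-up query returns, and the resulting MB_CUR_MAX.
   Returns 1 if the setlocale() call reported success, 0 otherwise. */
int probe_locale(const char *name)
{
    assert(name != NULL); /* NULL would query instead of set */

    const char *result = setlocale(LC_ALL, name);
    int ok = (result != NULL);
    printf("setlocale(LC_ALL, \"%s\") -> %s\n", name, ok ? result : "(null)");

    const char *queried = setlocale(LC_ALL, NULL);
    printf("setlocale(LC_ALL, NULL) -> %s\n", queried ? queried : "(null)");

    printf("MB_CUR_MAX = %d\n", (int)MB_CUR_MAX);
    return ok;
}
```

Calling `probe_locale(".54936")` on Windows should show the three values described above for whichever CRT the program is linked against.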

Given the complexity of a reimplementation, and the fact that both MSVCRT and UCRT have bugs in these functions, it's a little questionable whether we should replace them or maintain bug-to-bug compatibility with UCRT.


[1] https://www.microsoft.com/en-us/download/details.aspx?id=15656



I am not sure what would be a clean way to do it. We need to call `LoadLibrary` 
to obtain UCRT's handle and call `GetProcAddress` to get the addresses of the CRT's 
versions so we can call them conditionally. Would it be appropriate to define a 
function with `__attribute__((constructor))` to achieve this or is there a better 
approach? Also, would it be safe to call `LoadLibrary` from such a function?

See 'msvcrt_or_emu_glue.h' for how the others are done, but it really only works for MSVCRT.DLL because the DLL name is hard-coded.
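
As one possible alternative to a constructor, the address could be resolved lazily on first use. The sketch below is not how 'msvcrt_or_emu_glue.h' does it; it assumes the CRT module is named `ucrtbase.dll`, that it is already loaded into the process (so `GetModuleHandleW` is enough and `LoadLibrary` is not needed), and that it exports `mbrlen`. `resolve_crt_mbrlen` and `crt_mbrlen` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

typedef size_t (*mbrlen_fn)(const char *, size_t, mbstate_t *);

/* Look up the CRT's mbrlen the first time it is needed and cache it.
   The cache is not synchronized; concurrent first calls would simply
   repeat the same lookup. */
static mbrlen_fn resolve_crt_mbrlen(void)
{
    static mbrlen_fn cached;
    if (cached == NULL) {
#ifdef _WIN32
        /* Assumption: the UCRT is already mapped under this name. */
        HMODULE crt = GetModuleHandleW(L"ucrtbase.dll");
        if (crt != NULL)
            cached = (mbrlen_fn)(void (*)(void))GetProcAddress(crt, "mbrlen");
#endif
        /* Fallback for this sketch only: use whatever mbrlen we were
           linked with. A real replacement must not fall back to itself. */
        if (cached == NULL)
            cached = mbrlen;
    }
    return cached;
}

/* Call the CRT's mbrlen through the lazily resolved pointer. */
size_t crt_mbrlen(const char *s, size_t n, mbstate_t *ps)
{
    assert(ps != NULL); /* keep the sketch simple: caller supplies the state */
    return resolve_crt_mbrlen()(s, n, ps);
}
```

Resolving on first call avoids doing any work at load time, at the cost of a pointer check on every call.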



--
Best regards,
LIU Hao


_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public