On 2025-01-17 00:38, Lasse Collin wrote:
NAME_MAX is a POSIX constant; Windows doesn't define it. For this reason, assume that NAME_MAX is only about filenames in multibyte representation. In the UTF-8 code page, filenames can be up to 255 * 3 bytes excluding the terminating null character. (It's not 255 * 4 because four-byte UTF-8 characters consume two UTF-16 code units.) If I have understood correctly, there is no Windows locale that supports a code page with longer encodings. For example, a single UTF-16 code unit may produce four bytes in GB18030 but it cannot be used as a locale code page.
It seems so. Although it is possible to pass GB 18030 (code page 54936) to `WideCharToMultiByte()`, Windows in Simplified Chinese uses GBK by default, which is itself an extension to GB/T 2312-1980 and reuses its identifier (code page 936).
While GBK includes many traditional Chinese and Japanese characters, it does not seem to support four-byte characters in GB 18030:
UCRT64 ~/Desktop/t
$ cat find_files.c
#define WIN32_LEAN_AND_MEAN 1
#include <windows.h>
#include <stdio.h>
int
main(void)
{
  printf("active code page = %d\n", GetACP());
  WIN32_FIND_DATAA file;
  HANDLE h = FindFirstFileA("*", &file);
  if(h != INVALID_HANDLE_VALUE) {
    do {
      int n = lstrlenA(file.cFileName);
      printf("found '%s': %d byte(s):", file.cFileName, n);
      for(int i = 0; i < n; ++i)
        printf(" %.2hhx", file.cFileName[i]);
      printf("\n");
    }
    while(FindNextFileA(h, &file));
    FindClose(h);
  }
}
UCRT64 ~/Desktop/t
$ touch $'\uFFFF' # four bytes in GB 18030: 84 31 a4 39
UCRT64 ~/Desktop/t
$ touch '测试文件'
UCRT64 ~/Desktop/t
$ touch '測試檔案'
UCRT64 ~/Desktop/t
$ touch 'テストファイル'
UCRT64 ~/Desktop/t
$ ls -l
total 1
-rw-r--r-- 1 lh_mouse lh_mouse 0 Jan 18 15:25 ''$'\357\277\277'
-rw-r--r-- 1 lh_mouse lh_mouse 542 Jan 18 15:23 find_files.c
-rw-r--r-- 1 lh_mouse lh_mouse 0 Jan 18 15:25 テストファイル
-rw-r--r-- 1 lh_mouse lh_mouse 0 Jan 18 15:25 测试文件
-rw-r--r-- 1 lh_mouse lh_mouse 0 Jan 18 15:25 測試檔案
UCRT64 ~/Desktop/t
$ gcc find_files.c -o find_files.exe -Wall -Wextra
UCRT64 ~/Desktop/t
$ ./find_files.exe
active code page = 936
found '.': 1 byte(s): 2e
found '..': 2 byte(s): 2e 2e
found 'find_files.c': 12 byte(s): 66 69 6e 64 5f 66 69 6c 65 73 2e 63
found 'find_files.exe': 14 byte(s): 66 69 6e 64 5f 66 69 6c 65 73 2e 65 78 65
found 'テストファイル': 14 byte(s): a5 c6 a5 b9 a5 c8 a5 d5 a5 a1 a5 a4 a5 eb
found '测试文件': 8 byte(s): b2 e2 ca d4 ce c4 bc fe
found '測試檔案': 8 byte(s): 9c 79 d4 87 99 6e b0 b8
found '?': 1 byte(s): 3f
--
Best regards,
LIU Hao
_______________________________________________ Mingw-w64-public mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
