On Monday 27 January 2025 16:49:26 Lasse Collin wrote:
> Another behavior difference happens with invalid multibyte strings.
> I tested with UTF-8 in application manifest. A file named L"_\uFFFD_"
> exists.
>
> The UCRT functions fail if given invalid UTF-8:
>
>     fopen("_\x80_", "r");
>     _open("_\x80_", O_RDONLY);
>     // _findfirst fails too
>
> GetLastError() returns ERROR_NO_UNICODE_TRANSLATION.
>
> Win32 API functions convert the invalid bytes to U+FFFD and then access
> the resulting filename, so these succeed:
>
>     GetFileAttributesA("_\x80_");
>
>     WIN32_FIND_DATAA wfd;
>     FindFirstFileA("_\x80_", &wfd);
>     // wfd.cFileName contains "_\uFFFD_" in UTF-8.
>
> Listing files in a directory works too, that is,
> FindFirstFileA("_\x80_directory\\*", &wfd) lists files in
> "_\ufffd_directory".
>
> I suppose dirent should follow the UCRT behavior.

I agree with you. Silently converting an invalid byte such as 0x80 to
U+FFFD is a bad idea.

> This means using MB_ERR_INVALID_CHARS with MultiByteToWideChar().
>
> * * *
>
> It was pointed out that using FindFirstFileExW() can improve speed if
> one tells it to not list 8.3 names. I didn't see a difference on SSD
> (or well, actually cached data in RAM). But 8.3 names are needed if
> there was _readdir_8dot3() which would fall back to the 8.3 name if
> conversion of the long name fails. I suppose it's a more sensible
> fallback for some apps than imaginary names from best-fit mapping.
>
> --
> Lasse Collin

I think that by excluding 8.3 names you mean passing the FindExInfoBasic
level instead of FindExInfoStandard to FindFirstFileExW(). The
FindExInfoBasic level is supported only since Windows 7, and I think
that readdir() could still be useful on Windows XP as well.
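
To make the strict-conversion idea concrete, here is a minimal sketch, not
a proposed patch. The function name name_converts_strictly() is
hypothetical, and passing CP_UTF8 explicitly is an assumption of the sketch
(the test quoted above relied on the UTF-8 code page from the application
manifest). The non-Windows branch is only a portable stand-in so that the
"reject instead of substituting U+FFFD" policy can be tried anywhere; it may
not match MultiByteToWideChar() in every edge case.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: returns 1 if `name` converts to UTF-16 without
 * any substitution, 0 otherwise. CP_UTF8 is an assumption of this sketch. */
int name_converts_strictly(const char *name)
{
    assert(name != NULL);
    /* With MB_ERR_INVALID_CHARS an invalid sequence makes the call fail
     * (it returns 0) instead of yielding U+FFFD in the output. */
    return MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                               name, -1, NULL, 0) != 0;
}
#else
/* Portable stand-in: strict UTF-8 validation with the same policy. */
int name_converts_strictly(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;

    assert(name != NULL);
    while (*p != 0) {
        unsigned char c = *p++;
        unsigned long cp, min;
        int extra;

        if (c < 0x80) {
            continue;                       /* plain ASCII */
        } else if (c >= 0xC2 && c <= 0xDF) {
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if (c >= 0xF0 && c <= 0xF4) {
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;   /* stray continuation byte or invalid lead byte */
        }
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;                   /* truncated sequence */
            cp = (cp << 6) | (unsigned long)(*p++ & 0x3F);
        }
        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
    }
    return 1;
}
#endif
```

With this policy "_\x80_" is rejected, while the properly encoded
"_\xEF\xBF\xBD_" (U+FFFD in UTF-8) is accepted. A dirent implementation
following the UCRT behavior would presumably report an error in the first
case rather than look up a different file.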