On 2025-02-18 Pali Rohár wrote:
> On Tuesday 18 February 2025 23:32:54 Lasse Collin wrote:
> > On 2025-02-18 Pali Rohár wrote:
> > > Just one test case, can you check that your new readdir()
> > > function is working correctly on these two paths?
> > >
> > > \\?\GLOBALROOT\Device\Harddisk0\Partition1\
> > > \\?\GLOBALROOT\Device\HardiskVolume1\
> >
> > These paths don't work with the old dirent. opendir fails with
> > ENOENT.
>
> Perfect, this is then nice improvement, that in new version it is
> working.

I had made a mistake. I had tested the new code with and without \ at
the end, but I had tested the old code only without. The old code does
work when there is \ at the end.

I hope it is OK that the new code works without \ too, even though I
guess it's not strictly correct. Otherwise the GetFileAttributes call
from the old code needs to be restored to the new version.

A few other tiny things:

(1) I tested on a directory that has an unsupported reparse tag.
    FindFirstFileW fails with ERROR_CANT_ACCESS_FILE (1920), which
    currently becomes EIO. The old dirent code fails with EINVAL at
    readdir (not at opendir).

    I guess EIO isn't the best. Directory symlinks and junctions whose
    targets don't exist make opendir fail with ENOENT, so I guess it's
    appropriate here too. A non-directory with an unsupported reparse
    tag or AF_UNIX were already ENOTDIR.

(2) ERROR_CANT_RESOLVE_FILENAME (1921) is currently mapped to ELOOP.
    The error 1921 is possible in situations other than symlink loops
    too, for example, a junction with a weirdly broken substitute path.

    stat() uses ENOENT in these situations. open() uses EINVAL (if the
    reparse point isn't a directory). I suppose EINVAL is a generic
    fallback value in MS CRTs, because EINVAL seems to occur with so
    many types of errors.

    MSVCRT's strerror(ELOOP) returns "Unknown error". UCRT has a proper
    message for ELOOP.

    I'm unsure which is better, ELOOP or ENOENT. Probably it doesn't
    matter much in practice.

(3) I found old Microsoft docs on the web which, if they can be
    trusted, say that WC_NO_BEST_FIT_CHARS isn't available on Win95 and
    NT4. So in the current form, the new dirent code requires Windows
    2000 or later. From earlier discussions I got an impression that as
    long as it works on WinXP it's good enough, so I only updated the
    comments.

    https://www.tenouk.com/ModuleG.html

(4) WideCharToMultiByte docs say that with CP_UTF8 the only supported
    flag is WC_ERR_INVALID_CHARS and the last argument must be NULL.
    It's true on Win7, but on recent Win10 WC_NO_BEST_FIT_CHARS and a
    non-NULL last argument work with CP_UTF8 too. It's logical because
    that combination works with CP_ACP when ACP is UTF-8. This feature
    seems to be undocumented, so it's still best to not take advantage
    of it.

(5) In setlocale docs section "UTF-8 support"[1], the last paragraph
    says that UTF-8 locales are possible on Windows versions older than
    10 with app-local deployment or static linking of UCRT. I hope this
    is irrelevant in mingw-w64 context. In [2] section "Central
    deployment", the UCRT versions listed for pre-Win10 are too old to
    support UTF-8 locales. The last UCRT redistributable for WinXP has
    version 10.0.10586.15.

    WinXP doesn't support WC_ERR_INVALID_CHARS (Vista does). If someone
    managed to use a new enough UCRT on WinXP *and* use a UTF-8 locale,
    then the new dirent code doesn't work.

[1] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support
[2] https://learn.microsoft.com/en-us/cpp/windows/universal-crt-deployment?view=msvc-170#central-deployment

I attached a patch that adds ERROR_CANT_ACCESS_FILE (1920) and tweaks a
few comments. I didn't change ELOOP. If nothing above made you think
that something else should be changed, then this should finally be the
final version. :-)

Thanks!

-- 
Lasse Collin

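To make the error mapping discussed in items (1) and (2) concrete, here is a minimal sketch of how Win32 error codes could be translated to errno values. It is not the actual code from dirent.c: the function name sketch_opendir_errno is hypothetical, the error constants are redefined locally (with their winerror.h values) only so the sketch compiles without <windows.h>, and mapping ERROR_ACCESS_DENIED to EACCES is an assumption based on the EACCES entry in the dirent.h API docs.

```c
#include <errno.h>

/* Win32 error codes. Defined here only so that this sketch builds
 * without <windows.h>; the values match winerror.h. */
#ifndef ERROR_PATH_NOT_FOUND
#define ERROR_PATH_NOT_FOUND        3UL
#endif
#ifndef ERROR_ACCESS_DENIED
#define ERROR_ACCESS_DENIED         5UL
#endif
#ifndef ERROR_BAD_NETPATH
#define ERROR_BAD_NETPATH           53UL
#endif
#ifndef ERROR_BAD_NET_NAME
#define ERROR_BAD_NET_NAME          67UL
#endif
#ifndef ERROR_BAD_PATHNAME
#define ERROR_BAD_PATHNAME          161UL
#endif
#ifndef ERROR_CANT_ACCESS_FILE
#define ERROR_CANT_ACCESS_FILE      1920UL
#endif
#ifndef ERROR_CANT_RESOLVE_FILENAME
#define ERROR_CANT_RESOLVE_FILENAME 1921UL
#endif

/* Hypothetical helper: translate a GetLastError() value from a failed
 * FindFirstFileW call into an errno value for opendir. */
int
sketch_opendir_errno (unsigned long winerr)
{
  switch (winerr)
    {
    case ERROR_PATH_NOT_FOUND:
    case ERROR_BAD_PATHNAME:
    case ERROR_BAD_NETPATH:
    case ERROR_BAD_NET_NAME:
    case ERROR_CANT_ACCESS_FILE:
      /* Unsupported reparse tags are treated like symlinks and
       * junctions whose targets don't exist. */
      return ENOENT;

    case ERROR_CANT_RESOLVE_FILENAME:
      /* May also occur without a symlink loop; ELOOP is kept. */
      return ELOOP;

    case ERROR_ACCESS_DENIED:
      /* Assumption: not shown in the patch context. */
      return EACCES;

    default:
      /* Unknown error, possibly an I/O error. */
      return EIO;
    }
}
```

With this sketch, 1920 yields ENOENT instead of falling through to the EIO default, which is the behavior change the attached patch makes.
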
From dddeeb3d77884970a037856934554097290711cf Mon Sep 17 00:00:00 2001
From: Lasse Collin <lasse.col...@tukaani.org>
Date: Sat, 22 Feb 2025 15:00:55 +0200
Subject: [PATCH] ... Handle ERROR_CANT_ACCESS_FILE

---
 mingw-w64-crt/misc/dirent.c    | 15 +++++++++++----
 mingw-w64-headers/crt/dirent.h |  3 ++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/mingw-w64-crt/misc/dirent.c b/mingw-w64-crt/misc/dirent.c
index 3faca481b..c9fbcef3e 100644
--- a/mingw-w64-crt/misc/dirent.c
+++ b/mingw-w64-crt/misc/dirent.c
@@ -23,7 +23,7 @@
  * - added d_type to struct dirent and struct _wdirent
  * - improved error handling
  * - added API docs into dirent.h
- * - Windows 95/98/ME is no longer supported
+ * - Windows 95/98/ME and NT4 are no longer supported
  */
 
 #ifndef WIN32_LEAN_AND_MEAN
@@ -239,6 +239,7 @@ _wopendir (const wchar_t *path)
         case ERROR_BAD_PATHNAME:
         case ERROR_BAD_NETPATH:
         case ERROR_BAD_NET_NAME:
+        case ERROR_CANT_ACCESS_FILE:
           /* In addition to the obvious reason, ERROR_PATH_NOT_FOUND
            * may occur also if the search pattern is too long:
            * 32767 wide chars including the \0 for a long path aware app,
@@ -255,7 +256,11 @@ _wopendir (const wchar_t *path)
            * or if the server doesn't support file sharing.
            *
            * ERROR_BAD_NET_NAME occurs if the server can be contacted but
-           * the share doesn't exist. */
+           * the share doesn't exist.
+           *
+           * ERROR_CANT_ACCESS_FILE occurs with directories that have
+           * an unhandled reparse point tag. Treat them the same way as
+           * directory symlinks and junctions whose targets don't exist. */
           err = ENOENT;
           break;
 
@@ -464,6 +469,7 @@ prepare_next_entry (DIR *dirp)
         case ERROR_BAD_PATHNAME:
         case ERROR_BAD_NETPATH:
         case ERROR_BAD_NET_NAME:
+        case ERROR_CANT_ACCESS_FILE:
         case ERROR_DIRECTORY:
         case ERROR_INVALID_FUNCTION:
         case ERROR_NOT_FOUND:
@@ -591,13 +597,14 @@ readdir_impl (DIR *dirp, BOOL fallback8dot3)
    *
    * - CP_ACP and CP_OEMCP support WC_NO_BEST_FIT_CHARS even when those
    *   code pages are set to UTF-8. Lossy conversion is detected via the
-   *   last argument (BOOL*).
+   *   last argument (BOOL*). This works on Windows 2000 and later. On
+   *   Windows 10, this may work with CP_UTF8 too, but it's undocumented.
    *
    * - CP_UTF8 requires WC_ERR_INVALID_CHARS, and the last argument must be
    *   NULL. If the filename contains unpaired surrogates (invalid UTF-16),
    *   the return value will be 0. WC_ERR_INVALID_CHARS only works on
    *   Windows Vista and later, but CP_UTF8 is only used with UTF-8 locales
-   *   which are only supported on Windows 10 and later.
+   *   which are only supported with new enough UCRT.
    *
    * d_name is big enough that conversion cannot run out of buffer space
    * with double-byte character sets or UTF-8.
diff --git a/mingw-w64-headers/crt/dirent.h b/mingw-w64-headers/crt/dirent.h
index 7a27f725c..a6f9aeee3 100644
--- a/mingw-w64-headers/crt/dirent.h
+++ b/mingw-w64-headers/crt/dirent.h
@@ -85,7 +85,8 @@ typedef struct __dirent_DIR DIR;
  *           Windows reports as ERROR_CANT_RESOLVE_FILENAME.
  *   EACCES  Access denied.
  *   EIO     Unknown error, possibly an I/O error.
- *   ENOSYS  This dirent implementation doesn't work on Windows 95/98/ME.
+ *   ENOSYS  This dirent implementation works on Windows 2000 and later.
+ *           Windows 95/98/ME and NT4 are not supported.
  */
 DIR* __cdecl __MINGW_NOTHROW opendir (const char*);
 
-- 
2.48.1

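The flag rules described in the readdir_impl comment above can be summarized in a small sketch. It only covers choosing the arguments; it does not call WideCharToMultiByte and is not taken from dirent.c. The struct and function names (sketch_conv_args, sketch_pick_conv_args) are hypothetical, and the constants are redefined locally with their Windows SDK values only so that the sketch builds without <windows.h>.

```c
/* Code page IDs and conversion flags. Defined here only so that this
 * sketch builds without <windows.h>; the values match the Windows SDK. */
#ifndef CP_UTF8
#define CP_UTF8              65001U
#endif
#ifndef WC_ERR_INVALID_CHARS
#define WC_ERR_INVALID_CHARS 0x00000080UL
#endif
#ifndef WC_NO_BEST_FIT_CHARS
#define WC_NO_BEST_FIT_CHARS 0x00000400UL
#endif

/* Hypothetical description of how WideCharToMultiByte would be called. */
struct sketch_conv_args
{
  unsigned long flags;        /* value for the dwFlags argument */
  int pass_used_default_char; /* nonzero: pass a BOOL* as the last
                               * argument; zero: pass NULL */
};

/* Pick the documented flag combination for the given code page. */
struct sketch_conv_args
sketch_pick_conv_args (unsigned int code_page)
{
  struct sketch_conv_args args;

  if (code_page == CP_UTF8)
    {
      /* Documented combination for CP_UTF8: invalid UTF-16 makes the
       * call return 0, and the last argument must be NULL. */
      args.flags = WC_ERR_INVALID_CHARS;
      args.pass_used_default_char = 0;
    }
  else
    {
      /* CP_ACP, CP_OEMCP and other code pages: lossy conversion is
       * detected via the BOOL* last argument. */
      args.flags = WC_NO_BEST_FIT_CHARS;
      args.pass_used_default_char = 1;
    }

  return args;
}
```

The sketch deliberately sticks to the documented combination for CP_UTF8, even though recent Windows 10 reportedly accepts the other one too.
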
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public