While not about dirent, an old issue in stat() and _stat() is kind of related:
MSVCRT's _stat() fails with "directory/", but in POSIX a trailing slash should work for directories (and only for directories). mingw-w64-crt/stdio/_stat.c provides stat() that, with some exceptions, removes *one* trailing / or \ and then calls _stat(). This hack is often enough, but as a side effect it then makes "file/" succeed while it shouldn't. And "directory//" still fails while it shouldn't. On UCRT, stat() simply calls _stat(). UCRT's _stat() has eerily similar behavior with trailing slashes as mingw-w64's replacement stat() has in MSVCRT builds. It's as if UCRT was made bug-for-bug compatible with mingw-w64 instead of thinking what is the correct but more complex thing to do. MSVCRT's _stat() doesn't work on \\.\ or \\?\ paths. UCRT's does, but the buggy trailing slash removal means that only the last line below succeeds with UCRT's _stat(): \\?\GLOBALROOT\Device\Harddisk0\Partition3 \\?\GLOBALROOT\Device\Harddisk0\Partition3\ \\?\GLOBALROOT\Device\Harddisk0\Partition3\\ MSVCRT's _stat() doesn't follow reparse points. UCRT's _stat() does (on symlink or junction loops it fails with ENOENT, not ELOOP). On 2025-02-23 Lasse Collin wrote: > On 2025-02-23 Pali Rohár wrote: > > On 2025-02-23 Lasse Collin wrote: > > > The fact that something else does it wrong doesn't mean that it's > > > necessarily fine to do so. Mixing these path types would be very > > > bad in stat() or such function. To me it doesn't feel too bad when > > > one is specifically trying to open a directory to list its > > > contents (like opendir or Python's os.listdir). > > > > > > Having said that, I will try the GetFileAttributesW solution > > > still. Maybe it has some other minor benefits in error detection > > > (perhaps it allows shortening the checked error list of > > > FindFirstFileW). > > > > This could improve error handling. > > It should catch currently unhandled error codes as ENOENT. Then the > error cases of FindFirstFileW don't need to handle quite so many > errors like those from \\.\nul (ERROR_INVALID_FUNCTION) and \\.\con > (ERROR_NOT_FOUND). I suppose four error codes resulting in ENOENT can > be omitted too since those shouldn't occur if GetFileAttributesW > succeeds (unless the target changes due to a race condition). > > GetFileAttributesW docs has the following in "Remarks": > > If you call GetFileAttributes for a network share, the function > fails, and GetLastError returns ERROR_BAD_NETPATH. You must > specify a path to a subfolder on that share. > > But it works still. Maybe it doesn't in some old Windows version? > > Calling GetFileAttributesW early in _wopendir (before > GetFullPathNameW) made simple recursive directory scanning 20 % > slower. The program only goes through the tree counting the entries, > not even printing them. In real world apps the effect would be > smaller. It's still dramatically faster than with the old dirent when > taking advatange of d_type. While \\?\...\Partition3 without trailing \ isn't a directory, I more and more think that it doesn't matter too much if opendir works on it still. It's so tiny bug that *it alone* isn't worth the the extra code or the runtime overhead of GetFileAttributes. (With stat(), this kind of cheating wouldn't be acceptable.) Other considerations about using or not using GetFileAttributes: (1) If opendir is called with a weird device path, can something bad happen if FindFirstFileW is called on that path with \* at the end? If yes, can GetFileAttributes call without the \* at the end have similar bad effects? If yes, then worrying about \* seems pointless. I hope this is far in the over-thinking territory. (2) Error detection differences: - To map "nul" and "con" to ENOTDIR, current FindFirstFileW-only method handles ERROR_INVALID_FUNCTION and ERROR_NOT_FOUND. With GetFileAttributes these two wouldn't be needed. - "symlink-loop/foo" can only be detected with FindFirstFileW: FindFirstFileW => ERROR_CANT_RESOLVE_FILENAME GetFileAttributes => ERROR_FILE_NOT_FOUND - Since GetFileAttributes doesn't follow a reparse point when it's the last pathname component, "foo/symlink-loop" needs to be handled in FindFirstFileW error codes still. So does EACCES for the last pathname component. - The old dirent fails with ENOENT whenever GetFileAttributes fails. More precise errno values would require calling GetLastError and deciding what errno value to use for unknown errors. This is not different from the FindFirstFileW-only method. On the other hand, if ENOENT is OK as the default error, then the FindFirstFileW method could be changed to that. I think (2) doesn't favor the use of GetFileAttributes. I really hope (1) doesn't either. > > > > > An alternative would be to call GetLastError() == > > > > > ERROR_INVALID_FLAGS if WideCharToMultiByte fails and retry > > > > > with flags = 0. > > > > > > > > That is too complicated. > > > > > > It's not much more code. I have attached a patch to do that. On > > > Win95, GetLastError returns ERROR_INVALID_FLAGS (1004), so I > > > assume that this method works on NT4 too. > > > > Uff, that is questionable. Win9x was always different and not > > followed the true NT based WinAPI. > > ERROR_INVALID_FLAGS is mentioned in WideCharToMultiByte docs, > including the old ones. That error code is used on WinXP when trying > CP_UTF8 with WC_ERR_INVALID_CHARS. To be very sure, someone would > need to test it on NT4. WideCharToMultiByte indeed fails with ERROR_INVALID_FLAGS (1004) on NT4. The GetLastError method works in dirent: best-fit mapping is used on NT4. About errno with unknown errors: - EIO sounds good if FindNextFileW fails. Even if it isn't a hardware error, that function should rarely fail. - I have started to think that EIO might be a bit harsh in _wopendir. I wonder if EINVAL would be better. MS CRTs seem to use EINVAL as a fallback value (or at least EINVAL is very common if one tries to do something weird). Another alternative is defaulting to ENOENT like the old dirent does on GetFileAttributes failure. The attached patches fix another silly copypaste error and simplify the second FindFirstFileW error check to use ENOENT even for unknown errors (that code is used only after rewinddir or seekdir). _wopendir's errno values weren't changed. -- Lasse Collin
From 50dc61b90df88a857a8aa02cf05b6ad73474a41e Mon Sep 17 00:00:00 2001 From: Lasse Collin <lasse.col...@tukaani.org> Date: Mon, 24 Feb 2025 18:45:12 +0200 Subject: [PATCH 1/2] ... WIP another typo fix --- mingw-w64-crt/misc/dirent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mingw-w64-crt/misc/dirent.c b/mingw-w64-crt/misc/dirent.c index 99aaa9d92..7f63ad530 100644 --- a/mingw-w64-crt/misc/dirent.c +++ b/mingw-w64-crt/misc/dirent.c @@ -343,7 +343,7 @@ _wopendir (const wchar_t *path) case ERROR_CALL_NOT_IMPLEMENTED: /* We might get here if GetFullPathNameW wasn't called. */ err = ENOSYS; - return NULL; + break; default: /* Unknown error. */ -- 2.48.1
From 1415cd8498b4229058945ab7d8cbca6a171a97a0 Mon Sep 17 00:00:00 2001 From: Lasse Collin <lasse.col...@tukaani.org> Date: Tue, 25 Feb 2025 14:45:38 +0200 Subject: [PATCH 2/2] ... Simplify FindFirstFileW error checks after rewinddir and seekdir --- mingw-w64-crt/misc/dirent.c | 22 ++++------------------ 1 file changed, 4 insertions(+), 18 deletions(-) diff --git a/mingw-w64-crt/misc/dirent.c b/mingw-w64-crt/misc/dirent.c index 7f63ad530..77c18323e 100644 --- a/mingw-w64-crt/misc/dirent.c +++ b/mingw-w64-crt/misc/dirent.c @@ -501,18 +501,10 @@ prepare_next_entry (DIR *dirp) * seekdir doesn't reset it to -1. */ return dirp->dd_endval; - case ERROR_PATH_NOT_FOUND: - case ERROR_INVALID_NAME: - case ERROR_BAD_PATHNAME: - case ERROR_BAD_NETPATH: - case ERROR_BAD_NET_NAME: - case ERROR_CANT_ACCESS_FILE: - case ERROR_DIRECTORY: - case ERROR_INVALID_FUNCTION: - case ERROR_NOT_FOUND: - case ERROR_CANT_RESOLVE_FILENAME: - case ERROR_ACCESS_DENIED: - case ERROR_FILENAME_EXCED_RANGE: + case ERROR_NOT_ENOUGH_MEMORY: + return ENOMEM; + + default: /* On POSIX systems, rewinddir or seekdir shouldn't fail * since they can keep the directory open. On Windows, we * have to restart the directory reading, and if the @@ -520,12 +512,6 @@ prepare_next_entry (DIR *dirp) * changed, this can fail in various ways. Treat all these * errors as an invalid position of the directory stream. */ return ENOENT; - - case ERROR_NOT_ENOUGH_MEMORY: - return ENOMEM; - - default: - return EIO; } } -- 2.48.1
_______________________________________________ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public