On 2025-02-22 Pali Rohár wrote: > On Saturday 22 February 2025 18:12:36 Lasse Collin wrote: > > On 2025-02-22 Pali Rohár wrote: > > > On Saturday 22 February 2025 16:03:42 Lasse Collin wrote: > > > > On 2025-02-18 Pali Rohár wrote: > > > > > On Tuesday 18 February 2025 23:32:54 Lasse Collin wrote: > > > > > > On 2025-02-18 Pali Rohár wrote: > > > > > > > Just one test case, can you check that your new readdir() > > > > > > > function is working correctly on these two paths? > > > > > > > > > > > > > > \\?\GLOBALROOT\Device\Harddisk0\Partition1\ > > > > > > > \\?\GLOBALROOT\Device\HardiskVolume1\ > > > > > > > > > > > > These paths don't work with the old dirent. opendir fails > > > > > > with ENOENT. > > > > > > > > > > Perfect, this is then nice improvement, that in new version > > > > > it is working. > > > > > > > > I had made a mistake. I had tested the new code with and > > > > without \ at the end, but I had tested the old code only > > > > without. The old code does work when there is \ at the end. I > > > > hope it is OK that the new code works without \ too, even > > > > though I guess it's not strictly correct. > > > > > > Uff, I'm not sure. As without the trailing \ the above path is not > > > directory. > > > > Right, it's not a directory. > > > > The old dirent code calls GetFileAttributes on the path first and > > only then proceeds to append \* for _findfirst. The new code > > doesn't call GetFileAttributes at all, so FindFirstFileW is called > > with \* appended when there isn't a \ or / at the end already. > > > > If GetFileAttributes was called, maybe FindFirstFileW wouldn't need > > to check for as many Windows error codes, but one has to check for > > a few codes from GetAttributes still. Also, something in the file > > system can in theory change before FindFirstFileW is called (a race > > condition). > > > > Are there other situations where a path isn't a directory if there > > isn't a \ at the end? > > It can be anything within NT object system which represents some top > level filesystem object or NT object symlink to filesystem directory.
OK, thanks. It seems that a few other directory readers cheat with this detail: - Gnulib's opendir - In Cygwin, ls '\\?\C:' works even without \ at the end, but \\?\GLOBALROOT\... paths don't work in any form. - UCRT64 Python's os.listdir works with \\?\GLOBALROOT\... path with and without \ at the end. However, it rejects \\?\C: but \\?\C:\ works. At the same time, os.stat consistently treats these as block devices or directories depending on if there is a \ at the end. The fact that something else does it wrong doesn't mean that it's necessarily fine to do so. Mixing these path types would be very bad in stat() or such function. To me it doesn't feel too bad when one is specifically trying to open a directory to list its contents (like opendir or Python's os.listdir). Having said that, I will try the GetFileAttributesW solution still. Maybe it has some other minor benefits in error detection (perhaps it allows shortening the checked error list of FindFirstFileW). > > An alternative would be to call GetLastError() == > > ERROR_INVALID_FLAGS if WideCharToMultiByte fails and retry with > > flags = 0. > > That is too complicated. It's not much more code. I have attached a patch to do that. On Win95, GetLastError returns ERROR_INVALID_FLAGS (1004), so I assume that this method works on NT4 too. I have no real clue about UWP apps, but it seems that GetVersion isn't supported in those. Thus, if UWP compatibility matters, the GetLastError method should be compatible. AreFileApisANSI doesn't seem to be supported in UWP apps either. I suppose it could be added to winstorecompat, always returning 1. https://learn.microsoft.com/en-us/uwp/win32-and-com/win32-apis -- Lasse Collin
From e29a369b9b28074ba01e665ffcaabb35f48b4eef Mon Sep 17 00:00:00 2001 From: Lasse Collin <lasse.col...@tukaani.org> Date: Sun, 23 Feb 2025 15:46:33 +0200 Subject: [PATCH] ... Avoid GetVersion --- mingw-w64-crt/misc/dirent.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mingw-w64-crt/misc/dirent.c b/mingw-w64-crt/misc/dirent.c index 8c32758b6..1f7a726ce 100644 --- a/mingw-w64-crt/misc/dirent.c +++ b/mingw-w64-crt/misc/dirent.c @@ -609,10 +609,8 @@ readdir_impl (DIR *dirp, BOOL fallback8dot3) * d_name is big enough that conversion cannot run out of buffer space * with double-byte character sets or UTF-8. */ - DWORD version = GetVersion () & 0xFF; unsigned int cp = get_code_page (); - DWORD flags = (cp == CP_UTF8) ? (version >= 6 ? WC_ERR_INVALID_CHARS : 0) - : (version >= 5 ? WC_NO_BEST_FIT_CHARS : 0); + DWORD flags = (cp == CP_UTF8) ? WC_ERR_INVALID_CHARS : WC_NO_BEST_FIT_CHARS; BOOL was_lossy = FALSE; BOOL *was_lossy_ptr = (cp == CP_UTF8) ? NULL : &was_lossy; @@ -628,6 +626,19 @@ readdir_impl (DIR *dirp, BOOL fallback8dot3) dirp->dd_entry.a.d_name, sizeof (dirp->dd_entry.a.d_name), NULL, was_lossy_ptr); + if (conv_result == 0 && GetLastError () == ERROR_INVALID_FLAGS) + { + /* Be compatible with WinXP (UTF-8 locale with app-local deployment + * of new enough UCRT) and NT4. In these cases, lossy conversion of + * some characters are allowed. */ + flags = 0; + conv_result = WideCharToMultiByte ( + cp, flags, + dirp->dd_wfd.cFileName, -1, + dirp->dd_entry.a.d_name, sizeof (dirp->dd_entry.a.d_name), + NULL, was_lossy_ptr); + } + /* Check for <= 1 instead of <= 0 to ensure that the filename * isn't an empty string. */ if (conv_result <= 1 || was_lossy) -- 2.48.1
_______________________________________________ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public