On Sun, Nov 24, 2024 at 8:32 AM Cedric Blancher <cedric.blanc...@gmail.com> wrote:
>
> On Sat, 23 Nov 2024 at 17:47, Jeremy Drake <cyg...@jdrake.com> wrote:
> >
> > On Sat, 23 Nov 2024, Cedric Blancher via Cygwin wrote:
> >
> > > Good afternoon!
> > >
> > > Does Cygwin do a silly rename if a Cygwin file is open but gets
> > > /bin/rm at the same time?
> >
> > Yes! See function try_to_bin in winsup/cygwin/syscalls.cc:
> > /* Create unique filename. Start with a dot, followed by "cyg"
> > transposed into the Unicode low surrogate area (U+dc00) on file
> > systems supporting Unicode (except Samba), followed by the inode
> > number in hex, followed by a path hash in hex. The combination
> > allows to remove multiple hardlinks to the same file. */
>
> That code is wrong.
>
> bash -c 'printf ".\udc63\udc79\udc67#\n"' | iconv -f UTF-8
> .iconv: illegal input sequence at position 1
>
>     RtlAppendUnicodeToString (&recycler,
>                               (pc.fs_flags () & FILE_UNICODE_ON_DISK
>                                && !pc.fs_is_samba ())
>                               ? L".\xdc63\xdc79\xdc67" : L".cyg");
>
> SAMBA is right to reject L".\xdc63\xdc79\xdc67", because it is not a
> valid UTF-16 sequence. ReFS with validation, OpenZFS and so on will
> all REJECT such file names, and neither can NFSv4 because file names
> must be valid Unicode (even if nfsd would not validate then filesystem
> being shared via nfsd will reject that).
> So this can only work on ntfs, and only if it is not validating the
> input UTF.16 sequence.
>
> AFAIK FILE_UNICODE_ON_DISK means that the wchar_t sequences must be
> valid UTF-16, and not just be a random sequence of 16bit values.
>
> @Corinna Vinschen Could this sequence please be changed to a VALID
> UTF-8 sequence, such as \u[fffc]\u[fffc]\u[fffc]? That might work with
> SAMBA, ReFS, OpenZFS NFSv4, ...
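
To illustrate why a validating filesystem rejects the prefix: the code units 0xDC63, 0xDC79 and 0xDC67 are all low surrogates, and UTF-16 only allows a low surrogate directly after a high surrogate (0xD800-0xDBFF). A minimal sketch of such a check follows. The helper name utf16_is_well_formed() is hypothetical, and uint16_t stands in for the Windows WCHAR type so the sketch also builds on platforms where wchar_t is 32 bits wide. It is not taken from Cygwin, SAMBA or any NFS server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the |len| UTF-16 code units
 * in |s| form a well-formed UTF-16 sequence, i.e. every high
 * surrogate is immediately followed by a low surrogate and no low
 * surrogate appears on its own.
 */
static bool utf16_is_well_formed(const uint16_t *s, size_t len)
{
    assert(s != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        uint16_t c = s[i];

        if (c >= 0xD800 && c <= 0xDBFF) {
            /* High surrogate: the next unit must be a low surrogate */
            if (i + 1 >= len || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
                return false;
            i++; /* skip the low surrogate we just validated */
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate */
            return false;
        }
    }
    return true;
}
```

Under this check the sequence { '.', 0xDC63, 0xDC79, 0xDC67 } is reported as malformed at the first unit after the dot, while the plain ".cyg" prefix passes.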

That does not help with existing Cygwin installations and Cygwin
32bit, which is stuck at Cygwin 3.3.x ... ;-(

I agree that the L".\xdc63\xdc79\xdc67" prefix will backfire on
something like ReFS, OpenZFS etc. (for SAMBA, Cygwin already uses the
".cyg" prefix, i.e. the one used for filesystems which do NOT have
|FILE_UNICODE_ON_DISK| set), but for ms-nfs41-client I just stomp over
the issue with this patch (wording still needs to be improved):

---- snip ----
diff --git a/daemon/setattr.c b/daemon/setattr.c
index 9eaafb5..6e9729e 100644
--- a/daemon/setattr.c
+++ b/daemon/setattr.c
@@ -284,6 +284,46 @@ static int handle_nfs41_rename(void *daemon_context, setattr_upcall_args *args)

     EASSERT((rename->FileNameLength%sizeof(WCHAR)) == 0);

+#define CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE 1
+
+#ifdef CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE
+    /*
+     * Stomp Cygwin "silly rename" invalid Unicode sequence
+     *
+     * Cygwin has its own variation of "silly rename" (i.e. if
+     * someone deletes a file while someone else still has
+     * a valid fd to that file it first renames that file with a
+     * special prefix, see
+     * newlib-cygwin/winsup/cygwin/syscalls.cc, function
+     * |try_to_bin()|).
+     *
+     * Unfortunately on filesystems supporting Unicode
+     * (i.e. |FILE_UNICODE_ON_DISK|) Cygwin adds the prefix
+     * L".\xdc63\xdc79\xdc67", which is NOT a valid UTF-16 sequence,
+     * and will be rejected by a filesystem validating the
+     * UTF-16 sequence (e.g. SAMBA, ReFS, OpenZFS, ...).
+     * In our case the NFSv4.1 protocol requires valid UTF-8
+     * sequences, and the NFS server will reject filenames if either
+     * the server or the exported filesystem validates the UTF-8
+     * sequence.
+     *
+     * Since Cygwin only does a |rename()| and never a lookup by
+     * that filename we just stomp the prefix with the prefix used
+     * for non-|FILE_UNICODE_ON_DISK| filesystems.
+     * We ignore the side-effects here, e.g. that Win32 will still
+     * "remember" the original filename in the file name cache.
+     */
+    if ((rename->FileNameLength > (4*sizeof(wchar_t))) &&
+        (!memcmp(rename->FileName,
+            L".\xdc63\xdc79\xdc67", (4*sizeof(wchar_t))))) {
+        DPRINTF(1, ("handle_nfs41_rename(args->path='%s'): "
+            "Cygwin sillyrename prefix \".\\xdc63\\xdc79\\xdc67\" "
+            "detected, squishing prefix to \".cyg\"\n",
+            args->path));
+        (void)memcpy(rename->FileName, L".cyg", 4*sizeof(wchar_t));
+    }
+#endif /* CYGWIN_STOMP_SILLY_RENAME_INVALID_UTF16_SEQUENCE */
+
     dst_path.len = (unsigned short)WideCharToMultiByte(CP_UTF8,
         WC_ERR_INVALID_CHARS|WC_NO_BEST_FIT_CHARS,
         rename->FileName, rename->FileNameLength/sizeof(WCHAR),
---- snip ----

----

Bye,
Roland
-- 
  __ .  . __
 (o.\ \/ /.o) roland.ma...@nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)
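
P.S.: The prefix check in the patch can also be tried outside the daemon. The following is only a sketch under stated assumptions: the function name stomp_cygwin_sillyrename_prefix() is hypothetical, uint16_t stands in for WCHAR, and the length is passed in bytes to mirror |FileNameLength|. Like the patch, it only looks at the first four code units of the buffer and leaves everything after the prefix untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cygwin's prefix on |FILE_UNICODE_ON_DISK| filesystems (lone low surrogates) */
static const uint16_t cyg_surrogate_prefix[4] = { '.', 0xDC63, 0xDC79, 0xDC67 };
/* Cygwin's prefix on all other filesystems */
static const uint16_t cyg_ascii_prefix[4] = { '.', 'c', 'y', 'g' };

/*
 * Hypothetical standalone version of the check in the patch.
 * |name| holds UTF-16 code units, |len_bytes| is its length in bytes.
 * Returns true if the surrogate prefix was found and overwritten
 * in place with ".cyg", false if the name was left unchanged.
 */
static bool stomp_cygwin_sillyrename_prefix(uint16_t *name, size_t len_bytes)
{
    assert(name != NULL);

    /* Same condition as the patch: name must be longer than the prefix */
    if (len_bytes <= sizeof(cyg_surrogate_prefix))
        return false;
    if (memcmp(name, cyg_surrogate_prefix, sizeof(cyg_surrogate_prefix)) != 0)
        return false;

    /* Both prefixes have the same size, so the name length is unchanged */
    memcpy(name, cyg_ascii_prefix, sizeof(cyg_ascii_prefix));
    return true;
}
```

Because both prefixes are four code units long, the replacement never changes the length of the name, which is why the patch can overwrite the buffer in place without touching |FileNameLength|.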