Hi,

I've been hunting an issue for some days now, where a non-cygwin program using Microsoft's UCRT sometimes ends up with a sticky error on stdout when running under cygwin perl with a pipe capturing stdout and stderr.  When the problem triggers, the pipe buffer appears to be full, and it really looks like it's hitting the errno=ENOSPC/doserrno=0 situation at the tail end of _write_nolock() in ucrt/lowio/write.cpp.
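
To make that tail-end situation concrete, here is a simplified model in C. It is not the UCRT source: finish_write() and struct write_result are made-up names, the EIO mapping is a placeholder for the real OS-error translation, and it assumes the ENOSPC case fires when the low-level write reports success but zero bytes written.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for what the low-level write reported. */
struct write_result {
    unsigned long error_code;  /* OS error code, 0 if the write call succeeded */
    unsigned long char_count;  /* bytes actually taken from the caller's buffer */
};

/*
 * Models only the final decision after the write call returned.
 * Returns the byte count on success, or -1 with *err_no / *dos_err set.
 */
static long finish_write(struct write_result r, int *err_no, unsigned long *dos_err)
{
    assert(err_no != 0 && dos_err != 0);

    if (r.char_count != 0)
        return (long)r.char_count;      /* something was written: success */

    if (r.error_code != 0) {
        *err_no = EIO;                  /* placeholder for the real error mapping */
        *dos_err = r.error_code;
        return -1;
    }

    /* The call "succeeded" but wrote nothing: reported as a full device. */
    *err_no = ENOSPC;
    *dos_err = 0;
    return -1;
}
```

Under this model, a write that returns success with zero bytes is indistinguishable from a full disk, and the stdio layer above it then latches the stream error.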

I *think* the issue is that the write end of the pipe isn't configured to be synchronous.  In winsup/cygwin/fhandler/pipe.cc, the nt_create() function sets FILE_SYNCHRONOUS_IO_NONALERT when creating the _read_ end of the pipe using NtCreateNamedPipeFile, citing some C# program compatibility need.  But the call to NtOpenFile below it, which opens the _write_ end of the pipe, doesn't set it.  It does set the SYNCHRONIZE access right, but doesn't set the FILE_SYNCHRONOUS_IO_NONALERT flag (the last parameter is zero). This is akin to calling CreateFile with FILE_FLAG_OVERLAPPED, if I understand it correctly.


This lack of symmetry in the synchronization configuration between the two pipe ends looks like a bug to me, given the comment on the read end saying "Set FILE_SYNCHRONOUS_IO_NONALERT flag so that native C# programs work with cygwin pipe".


I have found no way to enable the FILE_SYNCHRONOUS_IO_NONALERT behaviour on my end of the pipe after it has been created, unlike the ReadMode and CompletionMode (parameters 9 & 10 of the NtCreateNamedPipeFile call).  The non-cygwin program in question has had issues with the FILE_PIPE_COMPLETE_OPERATION (PIPE_NOWAIT) and cygwin in the past, but that was solvable in a manner similar to calling fhandler_pipe::set_pipe_non_blocking(true). I can't come up with a good workaround for this problem, short of switching to a non-cygwin perl setup (painful).

The problem shows up sporadically (a few times almost every day lately) on an up-to-date Windows Server 2022 with a slightly dated cygwin setup (I doubt that matters much as the code in question seems to be unchanged in git head). It's a continuous build job that triggers this, and I'm not too keen on building my own cygwin dlls (haven't done that for at least 15 years) and placing them on this server.  I've tried to reproduce it locally on a similarly beefy Windows 11 workstation, but haven't had any luck at all, so it's either a Windows Server 2022 specific kernel regression or just your regular heisenbug.

The kernel code will do very different serialization for the pipe object when neither of the FILE_SYNCHRONOUS_IO_ALERT/NONALERT flags is set, from what I can tell, though I haven't dug too deep into things yet. My guess, though, is that if multiple threads/processes write to an almost full pipe buffer at the same time, (Nt)WriteFile may sometimes return before writing the whole buffer, or any of it, and thus upset the stupid UCRT code.  Just forcing the pipe buffer to run full doesn't trigger it by itself, from what I can tell.
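
For contrast, this is roughly what a writer that tolerates short and zero-byte writes would look like. It is only a sketch: write_all(), write_fn, struct fake_pipe and fake_write() are hypothetical names for illustration, the fake pipe merely simulates a writer that sometimes accepts nothing, and none of this is UCRT or cygwin code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical writer callback: returns the number of bytes accepted
 * (possibly fewer than asked for, possibly 0), or -1 on a hard error. */
typedef long (*write_fn)(void *ctx, const char *buf, size_t len);

/* Calls wr until all of buf has been accepted. Gives up after more than
 * max_zero consecutive zero-byte writes. Returns 0 on success, -1 on failure. */
static int write_all(write_fn wr, void *ctx, const char *buf, size_t len, int max_zero)
{
    int zero_runs = 0;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        long n = wr(ctx, buf, len);
        if (n < 0)
            return -1;                  /* hard error from the writer */
        if (n == 0) {
            if (++zero_runs > max_zero)
                return -1;              /* no progress for too long */
            continue;                   /* a real caller would wait or yield here */
        }
        zero_runs = 0;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Fake pipe for illustration: takes at most `chunk` bytes per call and
 * reports "success, nothing written" on every other call. */
struct fake_pipe {
    char data[64];
    size_t used;
    size_t chunk;
    int calls;
};

static long fake_write(void *ctx, const char *buf, size_t len)
{
    struct fake_pipe *p = (struct fake_pipe *)ctx;

    if (p->calls++ % 2 == 0)
        return 0;                       /* simulated zero-byte write */
    if (len > p->chunk)
        len = p->chunk;                 /* simulated short write */
    if (len > sizeof p->data - p->used)
        len = sizeof p->data - p->used; /* never overrun the fake buffer */
    memcpy(p->data + p->used, buf, len);
    p->used += len;
    return (long)len;
}
```

Calling write_all(fake_write, &pipe, "hello world", 11, 4) on a fake pipe with chunk = 3 gets all 11 bytes through despite the interleaved zero-byte writes, whereas a writer that treats the first zero-byte result as fatal (max_zero = 0) fails immediately.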


Also, in the error path of the NtOpenFile call, GetLastError() is used instead of __seterrno_from_nt_status() or RtlNtStatusToDosError().

Kind Regards,
 bird.

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
