On Mon, 20 Jan 2025 12:38:24 +0100 Corinna Vinschen wrote: > On Jan 20 18:08, Takashi Yano wrote: > > On Mon, 20 Jan 2025 00:33:26 +0900 > > Takashi Yano wrote: > > > On Sun, 19 Jan 2025 09:40:14 +0900 > > > Takashi Yano wrote: > > > > However, I wonder if cw_timer is re-set by NtSetTimer() in the > > > > cygwait(), it will be set to WSSC (60 sec) (or 10msec) in the > > > > sig_send(), so the hang should end with in at most 60 sec unlike > > > > the the hang Jeremy reported. > > > > > > > > I should still overlook something. > > > > > > Yes, I did. > > > cygwait() calls NtCancelTimer() on return. So, cw_timer will be > > > never signalled after recursive cygwait() call. Therefore, > > > L358: waitret = cygwait (select_sem, select_sem_timeout); > > > will return only when select_sem is signalled though it is expected > > > that cygwait() at L358 spends 1msec at most. This is most likely > > > the reason of the hang at L358 in fhandler_pipe::raw_read(). > > > > > > The conclusion is: > > > Do not use cygwait() in sig_send(). > > > 0003-Cygwin-signal-Do-not-handle-signal-when-__SIGFLUSHFA.patch > > > is the right thing while > > > ng-0003-Cygwin-signal-Do-not-handle-signal-when-__SIGFLUSHFA.patch > > > and previous v2 __SIGFLUSHFAST patch > > > are not. > > > > I tried adding NtSetTimer() immediately after call_signal_handler() > > in cygwait() to easily make it reentrant. The result is, > > the hang in repeated cygwin1.dll build no longer happen even with > > v2 __SIGFLUSHFAST patch. > > > > It seems that making cygwait() reentrant is an alternative idea. > > The TLS cw_timer was introduced in f0968c1e7eda ("* cygtls.h (struct > _local_storage): Add cw_timer member.") to avoid having to create and > destroy a timer on each invocation of cygwait, which would occur pretty > often in some scenarios. > > cw_timer was then recycled for select() by commit 7186b657e74b ("Improve > timer handling in select.") for the same reason. Select may even be the > worse problem. > > And the problem doesn't go away of course, so it would be nice if we > could keep using cw_timer in most cases in cygwait() and only fall back > to another timer in case we use it in signal handling reentrantly. > > It might have been a good idea to use different timers in cygwait() > and select(), which could use it's own "sel_timer" or something, > but that doesn't fix the signal handler reentrancy thingy. > > So what about redefining cygwait to take a fourth parameter and > use that if it's != NULL? The caller can then just create its own > timer. Kind of like this: > > diff --git a/winsup/cygwin/cygwait.cc b/winsup/cygwin/cygwait.cc > index 80c0e971c77d..b54d6ae6f8a4 100644 > --- a/winsup/cygwin/cygwait.cc > +++ b/winsup/cygwin/cygwait.cc > @@ -24,10 +24,11 @@ > LARGE_INTEGER cw_nowait_storage; > > DWORD > -cygwait (HANDLE object, PLARGE_INTEGER timeout, unsigned mask) > +cygwait (HANDLE object, PLARGE_INTEGER timeout, unsigned mask, HANDLE timer) > { > DWORD res; > DWORD num = 0; > + HANDLE &wait_timer = timer ? timer : _my_tls.locals.cw_timer; > HANDLE wait_objects[4]; > pthread_t thread = pthread::self (); > > [use wait_timer in place of _my_tls.locals.cw_timer throughout cygwait] > diff --git a/winsup/cygwin/local_includes/cygwait.h > b/winsup/cygwin/local_includes/cygwait.h > index 6212c334e1b9..4eda290782c0 100644 > --- a/winsup/cygwin/local_includes/cygwait.h > +++ b/winsup/cygwin/local_includes/cygwait.h > @@ -27,8 +27,8 @@ extern LARGE_INTEGER cw_nowait_storage; > > const unsigned cw_std_mask = cw_cancel | cw_cancel_self | cw_sig; > > -DWORD cygwait (HANDLE, PLARGE_INTEGER timeout, > - unsigned = cw_std_mask); > +DWORD cygwait (HANDLE, PLARGE_INTEGER, > + unsigned = cw_std_mask, HANDLE = NULL); > > extern inline DWORD __attribute__ ((always_inline)) > cygwait (HANDLE h, DWORD howlong, unsigned mask)
Thanks for the nice idea. I'll submit v4 seriese patch. Please review. -- Takashi Yano <takashi.y...@nifty.ne.jp>