Hi Corinna, On Thu, 5 Dec 2024 11:51:17 +0100 Corinna Vinschen wrote: > Hi Takashi, > > On Dec 5 10:14, Takashi Yano wrote: > > Hi Corinna, > > > > On Wed, 4 Dec 2024 13:54:47 +0100 > > Corinna Vinschen wrote: > > > From: Corinna Vinschen <cori...@vinschen.de> > > > > > > Commit 0b6fbd396ca2f ("* exceptions.cc (_cygtls::interrupt_now): Revert > > > to checking for "spinning" when choosing to defer signal.") introduced > > > a bug in the loop inside the stabilize_sig_stack subroutine: > > > > > > First, stabilize_sig_stack grabs the stacklock. The _cygtls::incyg > > > flag is then incremented before checking if a signal has to be handled > > > for the current thread. > > > > > > If no signal waits, the code simply jumps out, decrements _cygtls::incyg > > > and returns to the caller, which eventually releases the stacklock. > > > > > > However, if a signal is waiting, stabilize_sig_stack releases the > > > stacklock, calls _cygtls::call_signal_handler(), and returns to > > > the start of the subroutine, trying to grab the lock. > > > > > > After grabbing the lock, it increments _cygtls::incyg... wait... > > > again? > > > > > > The loop does not decrement _cygtls::incyg after > > > _cygtls::call_signal_handler(), which returns with _cygtls::incyg > > > set to 1. So it increments incyg to 2. If no other signal is > > > waiting, stabilize_sig_stack jumps out and decrements _cygtls::incyg > > > to 1. Eventually, setjmp or longjmp both will return to user > > > code with _cygtls::incyg set to 1. This *may* be fixed at some later > > > point when signals arrive, but there will be a time when the application > > > runs in user code with broken signal handling. > > > > > > Fixes: 0b6fbd396ca2f ("* exceptions.cc (_cygtls::interrupt_now): Revert > > > to checking for "spinning" when choosing to defer signal.") > > > Signed-off-by: Corinna Vinschen <cori...@vinschen.de> > > > --- > > > winsup/cygwin/scripts/gendef | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/winsup/cygwin/scripts/gendef b/winsup/cygwin/scripts/gendef > > > index 7e14f69cf71c..377ceb59b2c8 100755 > > > --- a/winsup/cygwin/scripts/gendef > > > +++ b/winsup/cygwin/scripts/gendef > > > @@ -344,6 +344,7 @@ stabilize_sig_stack: > > > movq \$_cygtls.start_offset,%rcx # point to beginning > > > addq %r12,%rcx # of tls block > > > call _ZN7_cygtls19call_signal_handlerEv > > > + decl _cygtls.incyg(%r12) > > > jmp 1b > > > 3: decl _cygtls.incyg(%r12) > > > addq \$0x20,%rsp > > > -- > > > 2.47.0 > > > > > > > I tested this patch with Christian's longjmp test case, but > > the problem does not seem to be fixed. > > That was not the intention. It was just something I found while looking > into the assembler code. This looks like a long-standing bug, which, > if my description above is right, might result in threads missing > signals, too. Or at least, signals being defered, because user-space > is running partially with the "iscyg" flag being set. > > It would be nice if you could check this, too, to be sure I'm not > totally wrong here.
It looks almost good to me and your commit message sounds reasonable. One question. There are three incyg <- 0 lines in gendef that are outside of stacklock. Is that safe? -- Takashi Yano <takashi.y...@nifty.ne.jp>