On Thu, 13 Mar 2025 20:42:52 +0900 Takashi Yano wrote:
> Hi Corinna,
>
> On Thu, 13 Mar 2025 10:40:48 +0100
> Christian Franke wrote:
> > Corinna Vinschen via Cygwin wrote:
> > > On Mar 12 17:06, Corinna Vinschen via Cygwin wrote:
> > >> On Mar 12 16:30, Corinna Vinschen via Cygwin wrote:
> > >>> On Mar 11 12:32, Christian Franke via Cygwin wrote:
> > >>>> The attached testcase should test the following use cases of
> > >>>> setcontext:
> > >>>> - call from regular user space
> > >>>> - call from a signal handler interrupting user space
> > >>>> - call from a signal handler interrupting a system call
> > >>>>
> > >>>> It works as expected ... until the signal count reaches 256. Then
> > >>>> signals are again only delivered from inside of a system call.
> > >>>> [...]
> > >>>> Interesting... Hmm... is there some 8-bit counter which overflows
> > >>>> and then gets stuck at 0xff or 0x00?
> > >>> It's a kind of stack overflow. Kind of, because it's not the normal
> > >>> thread stack, but a special signal stack in the _cygtls area.
> > >>>
> > >>> When interrupting a running thread to call a signal handler, the context
> > >>> of the thread is changed to restart execution in an assembler function
> > >>> called sigdelayed(). The original IP of the thread is pushed on the
> > >>> aforementioned signal stack. Sigdelayed() calls the signal handler. On
> > >>> return it pops the original IP from the signal stack and continues the
> > >>> thread.
> > >>>
> > >>> Now guess what happens if the signal handler bails out with longjmp or
> > >>> setcontext/swapcontext.
> > >>>
> > >>> The signal handler never returns to the sigdelayed() function, the
> > >>> original address is never popped from the signal stack, and the signal
> > >>> stack has a max. size of 256 address entries...
> > >>>
> > >>> Theoretically, a small update to sigdelayed() would fix the issue: rather
> > >>> than popping the original IP from the signal stack after calling the
> > >>> handler, it should pop the IP prior to calling the handler. That would
> > >>> avoid filling up the signal stack when long-jumping out of the signal
> > >>> handler. It should store the IP in one of the callee-saved registers.
> > >>> %r13 is unused in sigdelayed so far.
> > >>>
> > >>> However, even if we do this, there's still the problem that sigdelayed()
> > >>> itself takes space on the stack. If you longjmp/setcontext out of the
> > >>> handler, the thread's normal stack will fill up with dead storage of the
> > >>> sigdelayed() function, and there's no way out of this trap. We can't
> > >>> restore the stack before the handler returns.
> > >>>
> > >>> So either way, at one point you get a stack overflow one way or the
> > >>> other.
> > >>>
> > >>> The signal stack overflow is actually rather harmless in comparison
> > >>> to a real stack overflow.
> > >>>
> > >>> If you have any idea how to avoid the real stack overflow, I'd be
> > >>> all ears.
> > >> Looks like this isn't really a problem with setcontext. It always
> > >> corrects the stack pointer as well. Apparently I haven't thought
> > >> long enough about this.
> > >>
> > >> I have a patch for sigdelayed() in the loop, stay tuned.
> > > Just pushed. Try cygwin-3.6.0-0.430.ga942476236b5 in a bit.
> >
> > Problem no longer occurs. Also tested with 'kill -INT PID && sleep
> > 0.01' in a loop.
>
> After the commit:
>
> commit a942476236b5e39bf30c533d08df7392e326a4c6 (origin/master, origin/main, origin/HEAD)
> Author: Corinna Vinschen <cori...@vinschen.de>
> Date: Wed Mar 12 17:17:31 2025 +0100
>
>     Cygwin: sigdelayed: pop return address from signal stack earlier
>
> Christian's test case, timersig.c, no longer works even with my v3 patches.
> I suspect it is because pop() and retaddr() are not working as intended in
> call_signal_handler() with this commit.
>
> Could you please have a look?

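For reference, the overflow described in the quoted text can be reproduced with a toy model. This is only a sketch under stated assumptions: `ToySigStack`, `deliver` and `kCapacity` are hypothetical names invented for illustration and do not exist in Cygwin; the only detail taken from the thread is the limit of 256 address entries. A handler that bails out via longjmp/setcontext is modeled simply by skipping the pop that would normally run after the handler returns.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for the per-thread signal stack (hypothetical layout, not
// the real _cygtls structure). Only the 256-entry limit is from the thread.
struct ToySigStack {
    static constexpr std::size_t kCapacity = 256;
    std::uint64_t slots[kCapacity] = {};
    std::size_t depth = 0;

    // Returns false when the stack is already full.
    bool push(std::uint64_t ip) {
        if (depth == kCapacity)
            return false;
        slots[depth++] = ip;
        return true;
    }

    // Caller must ensure depth > 0.
    std::uint64_t pop() { return slots[--depth]; }
};

// Try to deliver `count` signals whose handlers all bail out instead of
// returning. Returns how many could be delivered before the stack filled up.
//   pop_before_handler == false: the pop is scheduled after the handler,
//                                so it never runs when the handler bails out.
//   pop_before_handler == true:  the saved IP is popped before the handler.
int deliver(ToySigStack &st, int count, bool pop_before_handler) {
    int delivered = 0;
    for (int i = 0; i < count; ++i) {
        if (!st.push(0x1000u + static_cast<std::uint64_t>(i)))
            break;                 // no room left to record the interrupted IP
        if (pop_before_handler)
            (void) st.pop();       // IP fetched up front, slot freed again
        // The handler "long-jumps" away here; a pop placed after the
        // handler call is skipped.
        ++delivered;
    }
    return delivered;
}
```

In this model, calling `deliver` with 1000 signals on a fresh stack stops at 256 when the pop comes after the handler and delivers all 1000 when the pop comes first. It says nothing about the dead storage left on the normal thread stack, which the quoted text treats as a separate concern.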
What about the following patch instead of your sigdelayed patch?

diff --git a/winsup/cygwin/exceptions.cc b/winsup/cygwin/exceptions.cc
index c9fe6a386..ceb47e52e 100644
--- a/winsup/cygwin/exceptions.cc
+++ b/winsup/cygwin/exceptions.cc
@@ -1758,6 +1758,13 @@ _cygtls::call_signal_handler ()
       reset_signal_arrived ();
       incyg = false;
       current_sig = 0;  /* Flag that we can accept another signal */
+
+      /* We have to fetch the original return address from the signal stack
+         prior to calling the signal handler.  This avoids filling up the
+         signal stack if the signal handler longjumps (longjmp/setcontext). */
+      DWORD64 retaddr1 = pop ();
+      DWORD64 retaddr2 = stackptr > stack ? retaddr () : 0;
+      __tlsstack_t *ptr = stackptr;
       unlock ();        /* unlock signal stack */
 
       /* Alternate signal stack requested for this signal and alternate signal
@@ -1834,6 +1841,26 @@ _cygtls::call_signal_handler ()
            signal handler. */
         thisfunc (thissig, &thissi, thiscontext);
 
+      lock ();
+      if (stackptr == ptr)
+        push (retaddr1);
+      else if (stackptr == ptr + 1)
+        {
+          DWORD64 retaddr3 = pop();
+          push (retaddr1);
+          push (retaddr3);
+        }
+      else if (stackptr == ptr - 1)
+        {
+          if (retaddr2)
+            push (retaddr2);
+          else
+            stackptr++;
+        }
+      else
+        api_fatal ("Signal stack corrupted?.");
+      unlock ();
+
       incyg = true;
 
       set_signal_mask (_my_tls.sigmask, (this_sa_flags & SA_SIGINFO)

-- 
Takashi Yano <takashi.y...@nifty.ne.jp>