Roland Mainz via Cygwin wrote:
On Fri, Mar 7, 2025 at 9:01 AM Takashi Yano via Cygwin
<cygwin@cygwin.com> wrote:
On Fri, 7 Mar 2025 16:29:51 +0900
Takashi Yano wrote:
On Wed, 5 Mar 2025 11:23:26 +0100
Christian Franke wrote:
...
Unfortunately signals may be lost, a new testcase is attached:

...

$ ./lostsig
1163: fork()=1164
SIGALRM x 10
SIGSTOP
SIGTERM
SIGCONT
waitpid()...
[ALRM]
[TERM]
...hangs...


A 'ps' in a second terminal then shows that the child process is still
in S)topped state. 'kill -CONT ...' works to continue.

If the testcase is assigned to a single core with 'taskset 0x1 ...', it
apparently always hangs.
Thanks for the report and the testcase.
The current implementation of the signal queue has the following problems:
1) Signals in the queue are processed in a disordered manner.
2) If the same signal is already in the queue, a new signal is discarded.

I am working on this issue and am almost finished.

Now I'm testing. Please wait a while.

Thanks for working on this complex topic!


BTW, the result of your testcase on Linux is as follows:

231873: fork()=231874
SIGALRM x 10
[ALRM]
[ALRM]
[ALRM]
SIGSTOP
SIGTERM
SIGCONT
waitpid()...
[TERM]
231874: 3 SIGALRM received, exit(42)
waidpid()=231874, status=0x2a00

Signals are also lost on Linux. However, the testcase does not hang there.

Sorry, I didn't clarify that the lost SIGALRMs are not the problem. These are issued solely to trigger the lost SIGCONT problem.

A test on Linux also shows that signal handlers may be called out of order:

1070: fork()=1071
SIGALRM x 10
SIGSTOP
SIGTERM
SIGCONT
waitpid()...
[TERM]
[ALRM]
[ALRM]
1071: 2 SIGALRM received, exit(42)
waidpid()=1071, status=0x2a00

Same if sigqueue() is used instead of kill().

BTW: If you do testing, PLEASE use |sigqueue()| for |SIGRT*| signals
(and check the return code!) and NOT |kill()|, because |kill()|
cannot communicate if there was no room left to queue another signal.

Thanks for the clarification. If the testcase is modified to use sigqueue() instead of kill() and SIGRTMIN+0/1 instead of SIGALRM/TERM, sigqueue() also always succeeds. On Linux, no signal is lost then, but the disorder may still occur.

--
Regards,
Christian


--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple