On Tue, Jan 24, 2023 at 10:25:11AM +0600, NRK wrote: > On Mon, Jan 23, 2023 at 08:05:19PM +0100, Hiltjo Posthuma wrote: > > Do you perhaps also have some simple way to reproduce where it causes > > issues in > > some real world use-case? > > > > Ideally some command or script to reproduce the issue. > > To trigger the issue, you need to have 3 conditions met: > > 1. Recieve a SIGCHLD. > 2. `waitpid` failing and spoiling the `errno`. > 3. Being unlucky that the resuming code gets affected by the `errno`. > > Here's a simple dummy program demonstrating the issue. Compile it, run > it and then send some SIGCHLDs to it: > > [/tmp]~> cat sigtest.c > #include <stdlib.h> > #include <signal.h> > #include <errno.h> > #include <stdio.h> > #include <sys/wait.h> > > void > sigchld(int unused) > { > if (signal(SIGCHLD, sigchld) == SIG_ERR) > _Exit(1); > while (0 < waitpid(-1, NULL, WNOHANG)); > } > > int main(void) > { > sigchld(0); > while (1) { > errno = 0; > char *p = malloc(8); > if (p != NULL && errno) { > perror("sigtest: "); > return 1; > } > free(p); > } > } > [/tmp]~> cc -o sigtest sigtest.c > [/tmp]~> ./sigtest & > [1] 9363 > [/tmp]~> while pidof sigtest >/dev/null; do kill -s SIGCHLD $(pidof > sigtest); done > sigtest: : No child processes > > If you're asking for `dwm` in specific - then trigger condition 2 is > difficult (impossible ??) since the window manager will typically have > some children. But if you insert the following, to simulate failure: > > while (0 < waitpid(-1, NULL, WNOHANG)); > + errno = ECHILD; > > then the issue is easily reproducable by sending some SIGCHLDs to dwm > (in quick succession in order to maximize chances of triggering > condition 3): > > var=$(pidof dwm); while :; do kill -s SIGCHLD $var; done > > And dwm will crash with a XIO error with in a couple seconds on my > system. > > - NRK >
Thanks, I can see how it can be an issue clobbering errno potentially etc. Although of course checking errno on a success condition is a bit wonky in this test case. What about a simplified version of a patch below, it should do a similar thing: >From d628a1081a2403e3b703b2f1acbff1f0ca148956 Mon Sep 17 00:00:00 2001 From: Hiltjo Posthuma <hil...@codemadness.org> Date: Tue, 24 Jan 2023 20:53:56 +0100 Subject: [PATCH] simplify ignoring SIGCHLD ... and handle in a way with less possible race-conditions. https://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html#tag_16_01_03_01 "[XSI] [Option Start] If the parent process of the calling process has set its SA_NOCLDWAIT flag or has set the action for the SIGCHLD signal to SIG_IGN: [...] The lifetime of the calling process shall end immediately. If SA_NOCLDWAIT is set, it is implementation-defined whether a SIGCHLD signal is sent to the parent process. [...]" --- dwm.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/dwm.c b/dwm.c index 03baf42..b3c1fc6 100644 --- a/dwm.c +++ b/dwm.c @@ -205,7 +205,6 @@ static void setmfact(const Arg *arg); static void setup(void); static void seturgent(Client *c, int urg); static void showhide(Client *c); -static void sigchld(int unused); static void spawn(const Arg *arg); static void tag(const Arg *arg); static void tagmon(const Arg *arg); @@ -1545,7 +1544,8 @@ setup(void) Atom utf8string; /* clean up any zombies immediately */ - sigchld(0); + if (signal(SIGCHLD, SIG_IGN) == SIG_ERR) + die("can't ignore SIGCHLD:"); /* init screen */ screen = DefaultScreen(dpy); @@ -1638,14 +1638,6 @@ showhide(Client *c) } } -void -sigchld(int unused) -{ - if (signal(SIGCHLD, sigchld) == SIG_ERR) - die("can't install SIGCHLD handler:"); - while (0 < waitpid(-1, NULL, WNOHANG)); -} - void spawn(const Arg *arg) { -- 2.39.1 -- Kind regards, Hiltjo