On 11/03/2024 16:22, Pádraig Brady wrote:
On 11/03/2024 15:17, Pádraig Brady wrote:
There was a narrow race condition reported at:
https://github.com/coreutils/coreutils/issues/82

Then I noticed another related race condition
in the case of failing fork().

Both should be addressed by the attached.

Actually the second one to ensure processes are killed
is only moving the race around, since the child may
receive the signal in the same narrow window before assigning
monitored_pid, and ignore it.

The attached avoids the issue by blocking signal handling
until monitored_pid is in a deterministic state wrt fork().

I'll think a bit more about this before applying.

The previous patch flip flops the signal mask a couple of times.
We can simplify by expanding the critical section a little
and thus only blocking the cleanup signals once,
and continue to use sigsuspend() to reenable signals in the parent.
I'll push the attached later.

cheers,
Pádraig
From f8da694d69067db0675f50d605bc7952be64c684 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Mon, 11 Mar 2024 13:46:24 +0000
Subject: [PATCH] timeout: fix narrow race in failing to kill processes

* src/timeout.c (main): Block cleanup signals earlier so that cleanup()
is not runnable until monitored_pid is in a deterministic state.
This ensures we always send a termination signal to the child
once it's forked.
* NEWS: Mention the bug fix.
Reported at https://github.com/coreutils/coreutils/issues/82
---
 NEWS          |  4 ++++
 src/timeout.c | 32 +++++++++++++++++++++-----------
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/NEWS b/NEWS
index 864d7fe8b..20cadf183 100644
--- a/NEWS
+++ b/NEWS
@@ -37,6 +37,10 @@ GNU coreutils NEWS                                    -*- outline -*-
   processes after a failed process fork.
   [bug introduced with timeout in coreutils-7.0]
 
+  timeout avoids a narrow race condition, where it might fail to
+  kill monitored processes immediately after forking them.
+  [bug introduced with timeout in coreutils-7.0]
+
   wc no longer fails to count unprintable characters as parts of words.
   [bug introduced in textutils-2.1]
 
diff --git a/src/timeout.c b/src/timeout.c
index 9aa46a4f5..8da8d712d 100644
--- a/src/timeout.c
+++ b/src/timeout.c
@@ -248,7 +248,7 @@ cleanup (int sig)
     { /* were in the parent, so let it continue to exit below.  */
     }
   else /* monitored_pid == 0  */
-    { /* we're the child or the child is not exec'd yet.  */
+    { /* parent hasn't forked yet, or child has not exec'd yet.  */
       _exit (128 + sig);
     }
 }
@@ -537,14 +537,29 @@ main (int argc, char **argv)
   signal (SIGTTOU, SIG_IGN);   /* Don't stop if background child needs tty.  */
   install_sigchld ();          /* Interrupt sigsuspend() when child exits.   */
 
+  /* We configure timers so that SIGALRM is sent on expiry.
+     Therefore ensure we don't inherit a mask blocking SIGALRM.  */
+  unblock_signal (SIGALRM);
+
+  /* Block signals now, so monitored_pid is deterministic in cleanup().  */
+  sigset_t orig_set;
+  block_cleanup_and_chld (term_signal, &orig_set);
   monitored_pid = fork ();
+
   if (monitored_pid == -1)
     {
       error (0, errno, _("fork system call failed"));
       return EXIT_CANCELED;
     }
-  else if (monitored_pid == 0)
-    {                           /* child */
+  else if (monitored_pid == 0)  /* child */
+    {
+      /* Restore signal mask for child.  */
+      if (sigprocmask (SIG_SETMASK, &orig_set, nullptr) != 0)
+        {
+          error (0, errno, _("child failed to reset signal mask"));
+          return EXIT_CANCELED;
+        }
+
       /* exec doesn't reset SIG_IGN -> SIG_DFL.  */
       signal (SIGTTIN, SIG_DFL);
       signal (SIGTTOU, SIG_DFL);
@@ -561,19 +576,14 @@ main (int argc, char **argv)
       pid_t wait_result;
       int status;
 
-      /* We configure timers so that SIGALRM is sent on expiry.
-         Therefore ensure we don't inherit a mask blocking SIGALRM.  */
-      unblock_signal (SIGALRM);
-
       settimeout (timeout, true);
 
-      /* Ensure we don't cleanup() after waitpid() reaps the child,
+      /* Note signals remain blocked in parent here, to ensure
+         we don't cleanup() after waitpid() reaps the child,
          to avoid sending signals to a possibly different process.  */
-      sigset_t cleanup_set;
-      block_cleanup_and_chld (term_signal, &cleanup_set);
 
       while ((wait_result = waitpid (monitored_pid, &status, WNOHANG)) == 0)
-        sigsuspend (&cleanup_set);  /* Wait with cleanup signals unblocked.  */
+        sigsuspend (&orig_set);  /* Wait with cleanup signals unblocked.  */
 
       if (wait_result < 0)
         {
-- 
2.43.0

Reply via email to