In commit 97550c0, I added a "MyProcPid == getpid()" check in the SIGTERM handler for the startup process. This ensures that processes forked by system(3) (i.e., for restore_command) that have yet to install their own signal handlers do not call proc_exit() upon receiving SIGTERM. Without this protection, both the startup process and the restore_command process might try to remove themselves from the PGPROC shared array (among other things), which can end badly.
Since then, I've been exploring a more general approach that would offer protection against similar issues in the future. We probably don't want signal handlers in these grandchild processes to touch shared memory at all. The attached 0001 is an attempt at adding such protection for all handlers installed via pqsignal(). In short, it stores the actual handler functions in a separate array, and sigaction() is given a wrapper function that performs the "MyProcPid == getpid()" check. If that check fails, the wrapper function installs the default signal handler and calls it. Besides allowing us to revert commit 97550c0 (see attached 0003), this wrapper handler could also restore errno, as shown in 0002. Right now, individual signal handlers must do this today as needed, but that seems easy to miss and prone to going unnoticed for a long time. I see two main downsides of this proposal: * Overhead: The wrapper handler calls a function pointer and getpid(), which AFAICT is a real system call on most platforms. That might not be a tremendous amount of overhead, but it's not zero, either. I'm particularly worried about signal-heavy code like synchronous replication. (Are there other areas that should be tested?) If this is a concern, perhaps we could allow certain processes to opt out of this wrapper handler, provided we believe it is unlikely to fork or that the handler code is safe to run in grandchild processes. * Race conditions: With these patches, pqsignal() becomes quite racy when used within signal handlers. Specifically, you might get a bogus return value. However, there are no in-tree callers of pqsignal() that look at the return value (and I don't see any reason they should), and it seems unlikely that pqsignal() will be used within signal handlers frequently, so this might not be a deal-breaker. I did consider trying to convert pqsignal() into a void function, but IIUC that would require an SONAME bump. For now, I've just documented the bogosity of the return values. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
>From 5f46dc0150aba3ef75053e91f7c9a4d6624e2c4f Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Fri, 17 Nov 2023 21:38:18 -0600 Subject: [PATCH v1 1/3] Check that MyProcPid == getpid() in all signal handlers. In commit 97550c0711, we added a similar check to the SIGTERM handler for the startup process. This commit adds this check to all signal handlers installed with pqsignal(). This is done by using a wrapper function that performs the check before calling the actual handler. The hope is that this will offer more general protection against grandchildren processes inadvertently modifying shared memory due to inherited signal handlers. Another potential follow-up improvement is to use this wrapper handler function to restore errno instead of relying on each individual handler function to do so. This commit makes the changes in commit 97550c0711 obsolete but leaves reverting it for a follow-up commit. Reviewed-by: ??? Discussion: ??? --- src/port/pqsignal.c | 72 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 70 insertions(+), 2 deletions(-) diff --git a/src/port/pqsignal.c b/src/port/pqsignal.c index 83d876db6c..978877dec6 100644 --- a/src/port/pqsignal.c +++ b/src/port/pqsignal.c @@ -28,23 +28,85 @@ #include "c.h" #include <signal.h> +#ifndef FRONTEND +#include <unistd.h> +#endif #ifndef FRONTEND #include "libpq/pqsignal.h" +#include "miscadmin.h" +#endif + +#ifdef NSIG +#define PG_NSIG (NSIG) +#else +#define PG_NSIG (64) /* XXX: wild guess */ +#endif + +static volatile pqsigfunc pqsignal_handlers[PG_NSIG]; + +/* + * Except when called with SIG_IGN or SIG_DFL, pqsignal() sets up this function + * as the handler for all signals. This wrapper handler function checks that + * it is called within a process that the server knows about, and not a + * grandchild process forked by system(3), etc. This check ensures that such + * grandchildren processes do not modify shared memory, which could be + * detrimental. If the check succeeds, the function originally provided to + * pqsignal() is called. Otherwise, the default signal handler is installed + * and then called. + */ +static void +wrapper_handler(SIGNAL_ARGS) +{ +#ifndef FRONTEND + Assert(MyProcPid); + + if (unlikely(MyProcPid != (int) getpid())) + { + pqsignal(postgres_signal_arg, SIG_DFL); + raise(postgres_signal_arg); + return; + } #endif + (*pqsignal_handlers[postgres_signal_arg]) (postgres_signal_arg); +} + /* * Set up a signal handler, with SA_RESTART, for signal "signo" * * Returns the previous handler. + * + * NB: If called within a signal handler, race conditions may lead to bogus + * return values. You should either avoid calling this within signal handlers + * or ignore the return value. + * + * XXX: Since no in-tree callers use the return value, and there is little + * reason to do so, it would be nice if we could convert this to a void + * function instead of providing potentially-bogus return values. + * Unfortunately, that requires modifying the pqsignal() in legacy-pqsignal.c, + * which in turn requires an SONAME bump, which is probably not worth it. */ pqsigfunc pqsignal(int signo, pqsigfunc func) { + pqsigfunc orig_func = pqsignal_handlers[signo]; /* assumed atomic */ #if !(defined(WIN32) && defined(FRONTEND)) struct sigaction act, oact; +#else + pqsigfunc ret; +#endif + + Assert(signo < PG_NSIG); + if (func != SIG_IGN && func != SIG_DFL) + { + pqsignal_handlers[signo] = func; /* assumed atomic */ + func = wrapper_handler; + } + +#if !(defined(WIN32) && defined(FRONTEND)) act.sa_handler = func; sigemptyset(&act.sa_mask); act.sa_flags = SA_RESTART; @@ -54,9 +116,15 @@ pqsignal(int signo, pqsigfunc func) #endif if (sigaction(signo, &act, &oact) < 0) return SIG_ERR; - return oact.sa_handler; + else if (oact.sa_handler == wrapper_handler) + return orig_func; + else + return oact.sa_handler; #else /* Forward to Windows native signal system. */ - return signal(signo, func); + if ((ret = signal(signo, func)) == wrapper_handler) + return orig_func; + else + return ret; #endif } -- 2.25.1
>From 19442fddd720df201590571b77f5c286218aaa77 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Fri, 17 Nov 2023 14:00:12 -0600 Subject: [PATCH v1 2/3] Centralize logic for restoring errno in signal handlers. Presently, we rely on each individual signal handler to save the initial value of errno and then restore it before returning if needed. This is easily forgotten and, if missed, often goes undetected for a long time. In commit XXXXXXXXXX, we introduced a wrapper signal handler function that checks whether MyProcPid matches getpid(). This commit moves the aforementioned errno restoration code from the individual signal handlers to the new wrapper handler so that we no longer need to worry about missing it. Reviewed-by: ??? Discussion: ??? --- src/backend/postmaster/autovacuum.c | 4 ---- src/backend/postmaster/checkpointer.c | 4 ---- src/backend/postmaster/interrupt.c | 8 -------- src/backend/postmaster/pgarch.c | 4 ---- src/backend/postmaster/postmaster.c | 16 ---------------- src/backend/postmaster/startup.c | 12 ------------ src/backend/postmaster/syslogger.c | 4 ---- src/backend/replication/walsender.c | 4 ---- src/backend/storage/ipc/latch.c | 4 ---- src/backend/storage/ipc/procsignal.c | 4 ---- src/backend/tcop/postgres.c | 8 -------- src/backend/utils/misc/timeout.c | 4 ---- src/fe_utils/cancel.c | 3 --- src/port/pqsignal.c | 6 ++++++ 14 files changed, 6 insertions(+), 79 deletions(-) diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 86a3b3d8be..ced7f72f63 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -1423,12 +1423,8 @@ AutoVacWorkerFailed(void) static void avl_sigusr2_handler(SIGNAL_ARGS) { - int save_errno = errno; - got_SIGUSR2 = true; SetLatch(MyLatch); - - errno = save_errno; } diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index 42c807d352..6d8c1b7ce4 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -834,15 +834,11 @@ IsCheckpointOnSchedule(double progress) static void ReqCheckpointHandler(SIGNAL_ARGS) { - int save_errno = errno; - /* * The signaling process should have set ckpt_flags nonzero, so all we * need do is ensure that our main loop gets kicked out of any wait. */ SetLatch(MyLatch); - - errno = save_errno; } diff --git a/src/backend/postmaster/interrupt.c b/src/backend/postmaster/interrupt.c index 6d4bd76bf8..fa7833f673 100644 --- a/src/backend/postmaster/interrupt.c +++ b/src/backend/postmaster/interrupt.c @@ -60,12 +60,8 @@ HandleMainLoopInterrupts(void) void SignalHandlerForConfigReload(SIGNAL_ARGS) { - int save_errno = errno; - ConfigReloadPending = true; SetLatch(MyLatch); - - errno = save_errno; } /* @@ -108,10 +104,6 @@ SignalHandlerForCrashExit(SIGNAL_ARGS) void SignalHandlerForShutdownRequest(SIGNAL_ARGS) { - int save_errno = errno; - ShutdownRequestPending = true; SetLatch(MyLatch); - - errno = save_errno; } diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c index 46af349564..ff0958fa52 100644 --- a/src/backend/postmaster/pgarch.c +++ b/src/backend/postmaster/pgarch.c @@ -283,13 +283,9 @@ PgArchWakeup(void) static void pgarch_waken_stop(SIGNAL_ARGS) { - int save_errno = errno; - /* set flag to do a final cycle and shut down afterwards */ ready_to_stop = true; SetLatch(MyLatch); - - errno = save_errno; } /* diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 7b6b613c4a..208605569d 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2608,12 +2608,8 @@ InitProcessGlobals(void) static void handle_pm_pmsignal_signal(SIGNAL_ARGS) { - int save_errno = errno; - pending_pm_pmsignal = true; SetLatch(MyLatch); - - errno = save_errno; } /* @@ -2622,12 +2618,8 @@ handle_pm_pmsignal_signal(SIGNAL_ARGS) static void handle_pm_reload_request_signal(SIGNAL_ARGS) { - int save_errno = errno; - pending_pm_reload_request = true; SetLatch(MyLatch); - - errno = save_errno; } /* @@ -2705,8 +2697,6 @@ process_pm_reload_request(void) static void handle_pm_shutdown_request_signal(SIGNAL_ARGS) { - int save_errno = errno; - switch (postgres_signal_arg) { case SIGTERM: @@ -2723,8 +2713,6 @@ handle_pm_shutdown_request_signal(SIGNAL_ARGS) break; } SetLatch(MyLatch); - - errno = save_errno; } /* @@ -2884,12 +2872,8 @@ process_pm_shutdown_request(void) static void handle_pm_child_exit_signal(SIGNAL_ARGS) { - int save_errno = errno; - pending_pm_child_exit = true; SetLatch(MyLatch); - - errno = save_errno; } /* diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index 0e7de26bc2..83dbed86b9 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -96,32 +96,22 @@ static void StartupProcExit(int code, Datum arg); static void StartupProcTriggerHandler(SIGNAL_ARGS) { - int save_errno = errno; - promote_signaled = true; WakeupRecovery(); - - errno = save_errno; } /* SIGHUP: set flag to re-read config file at next convenient time */ static void StartupProcSigHupHandler(SIGNAL_ARGS) { - int save_errno = errno; - got_SIGHUP = true; WakeupRecovery(); - - errno = save_errno; } /* SIGTERM: set flag to abort redo and exit */ static void StartupProcShutdownHandler(SIGNAL_ARGS) { - int save_errno = errno; - if (in_restore_command) { /* @@ -140,8 +130,6 @@ StartupProcShutdownHandler(SIGNAL_ARGS) else shutdown_requested = true; WakeupRecovery(); - - errno = save_errno; } /* diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c index 858a2f6b2b..2182e9073f 100644 --- a/src/backend/postmaster/syslogger.c +++ b/src/backend/postmaster/syslogger.c @@ -1642,10 +1642,6 @@ RemoveLogrotateSignalFiles(void) static void sigUsr1Handler(SIGNAL_ARGS) { - int save_errno = errno; - rotation_requested = true; SetLatch(MyLatch); - - errno = save_errno; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 3bc9c82389..a809c2202b 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -3245,12 +3245,8 @@ HandleWalSndInitStopping(void) static void WalSndLastCycleHandler(SIGNAL_ARGS) { - int save_errno = errno; - got_SIGUSR2 = true; SetLatch(MyLatch); - - errno = save_errno; } /* Set up signal handlers */ diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 2fd386a4ed..3f52c528c4 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -2199,12 +2199,8 @@ GetNumRegisteredWaitEvents(WaitEventSet *set) static void latch_sigurg_handler(SIGNAL_ARGS) { - int save_errno = errno; - if (waiting) sendSelfPipeByte(); - - errno = save_errno; } /* Send one byte to the self-pipe, to wake up WaitLatch */ diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c index b7427906de..8e65009a89 100644 --- a/src/backend/storage/ipc/procsignal.c +++ b/src/backend/storage/ipc/procsignal.c @@ -638,8 +638,6 @@ CheckProcSignal(ProcSignalReason reason) void procsignal_sigusr1_handler(SIGNAL_ARGS) { - int save_errno = errno; - if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT)) HandleCatchupInterrupt(); @@ -683,6 +681,4 @@ procsignal_sigusr1_handler(SIGNAL_ARGS) HandleRecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); SetLatch(MyLatch); - - errno = save_errno; } diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index e415cf1f34..4bfa3fc009 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -2970,8 +2970,6 @@ quickdie(SIGNAL_ARGS) void die(SIGNAL_ARGS) { - int save_errno = errno; - /* Don't joggle the elbow of proc_exit */ if (!proc_exit_inprogress) { @@ -2993,8 +2991,6 @@ die(SIGNAL_ARGS) */ if (DoingCommandRead && whereToSendOutput != DestRemote) ProcessInterrupts(); - - errno = save_errno; } /* @@ -3004,8 +3000,6 @@ die(SIGNAL_ARGS) void StatementCancelHandler(SIGNAL_ARGS) { - int save_errno = errno; - /* * Don't joggle the elbow of proc_exit */ @@ -3017,8 +3011,6 @@ StatementCancelHandler(SIGNAL_ARGS) /* If we're still here, waken anything waiting on the process latch */ SetLatch(MyLatch); - - errno = save_errno; } /* signal handler for floating point exception */ diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c index 8ab755d363..c9790f8cdb 100644 --- a/src/backend/utils/misc/timeout.c +++ b/src/backend/utils/misc/timeout.c @@ -363,8 +363,6 @@ schedule_alarm(TimestampTz now) static void handle_sig_alarm(SIGNAL_ARGS) { - int save_errno = errno; - /* * Bump the holdoff counter, to make sure nothing we call will process * interrupts directly. No timeout handler should do that, but these @@ -452,8 +450,6 @@ handle_sig_alarm(SIGNAL_ARGS) } RESUME_INTERRUPTS(); - - errno = save_errno; } diff --git a/src/fe_utils/cancel.c b/src/fe_utils/cancel.c index 10c0cd4554..bffef24d63 100644 --- a/src/fe_utils/cancel.c +++ b/src/fe_utils/cancel.c @@ -152,7 +152,6 @@ ResetCancelConn(void) static void handle_sigint(SIGNAL_ARGS) { - int save_errno = errno; char errbuf[256]; CancelRequested = true; @@ -173,8 +172,6 @@ handle_sigint(SIGNAL_ARGS) write_stderr(errbuf); } } - - errno = save_errno; /* just in case the write changed it */ } /* diff --git a/src/port/pqsignal.c b/src/port/pqsignal.c index 978877dec6..7fa50400c9 100644 --- a/src/port/pqsignal.c +++ b/src/port/pqsignal.c @@ -54,10 +54,14 @@ static volatile pqsigfunc pqsignal_handlers[PG_NSIG]; * detrimental. If the check succeeds, the function originally provided to * pqsignal() is called. Otherwise, the default signal handler is installed * and then called. + * + * This wrapper also handles restoring the value of errno. */ static void wrapper_handler(SIGNAL_ARGS) { + int save_errno = errno; + #ifndef FRONTEND Assert(MyProcPid); @@ -70,6 +74,8 @@ wrapper_handler(SIGNAL_ARGS) #endif (*pqsignal_handlers[postgres_signal_arg]) (postgres_signal_arg); + + errno = save_errno; } /* -- 2.25.1
>From 0858c2eed348bf4c2d1ae0c7131cafc438afb461 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Fri, 17 Nov 2023 22:09:24 -0600 Subject: [PATCH v1 3/3] Revert "Avoid calling proc_exit() in processes forked by system()." Thanks to commit XXXXXXXXXX, this check in the SIGTERM handler for the startup process is now obsolete and can be removed. Instead of leaving around the elog(PANIC, ...) calls that are now unlikely to be triggered and the dead function write_stderr_signal_safe(), I've opted to just remove them for now. Thanks to modern version control software, it will be trivial to dig those up if they are ever needed in the future. This reverts commit 97550c0711972a9856b5db751539bbaf2f88884c. Reviewed-by: ??? Discussion: ??? --- src/backend/postmaster/startup.c | 17 +---------------- src/backend/storage/ipc/ipc.c | 4 ---- src/backend/storage/lmgr/proc.c | 8 -------- src/backend/utils/error/elog.c | 28 ---------------------------- src/include/utils/elog.h | 6 ------ 5 files changed, 1 insertion(+), 62 deletions(-) diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index 83dbed86b9..f40acd20ff 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -19,8 +19,6 @@ */ #include "postgres.h" -#include <unistd.h> - #include "access/xlog.h" #include "access/xlogrecovery.h" #include "access/xlogutils.h" @@ -113,20 +111,7 @@ static void StartupProcShutdownHandler(SIGNAL_ARGS) { if (in_restore_command) - { - /* - * If we are in a child process (e.g., forked by system() in - * RestoreArchivedFile()), we don't want to call any exit callbacks. - * The parent will take care of that. - */ - if (MyProcPid == (int) getpid()) - proc_exit(1); - else - { - write_stderr_signal_safe("StartupProcShutdownHandler() called in child process\n"); - _exit(1); - } - } + proc_exit(1); else shutdown_requested = true; WakeupRecovery(); diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c index 6591b5d6a8..1904d21795 100644 --- a/src/backend/storage/ipc/ipc.c +++ b/src/backend/storage/ipc/ipc.c @@ -103,10 +103,6 @@ static int on_proc_exit_index, void proc_exit(int code) { - /* not safe if forked by system(), etc. */ - if (MyProcPid != (int) getpid()) - elog(PANIC, "proc_exit() called in child process"); - /* Clean up everything that must be cleaned up */ proc_exit_prepare(code); diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index e9e445bb21..5b663a2997 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -806,10 +806,6 @@ ProcKill(int code, Datum arg) Assert(MyProc != NULL); - /* not safe if forked by system(), etc. */ - if (MyProc->pid != (int) getpid()) - elog(PANIC, "ProcKill() called in child process"); - /* Make sure we're out of the sync rep lists */ SyncRepCleanupAtProcExit(); @@ -930,10 +926,6 @@ AuxiliaryProcKill(int code, Datum arg) Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS); - /* not safe if forked by system(), etc. */ - if (MyProc->pid != (int) getpid()) - elog(PANIC, "AuxiliaryProcKill() called in child process"); - auxproc = &AuxiliaryProcs[proctype]; Assert(MyProc == auxproc); diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 6aeb855e49..ba9179da85 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -3733,34 +3733,6 @@ write_stderr(const char *fmt,...) } -/* - * Write a message to STDERR using only async-signal-safe functions. This can - * be used to safely emit a message from a signal handler. - * - * TODO: It is likely possible to safely do a limited amount of string - * interpolation (e.g., %s and %d), but that is not presently supported. - */ -void -write_stderr_signal_safe(const char *str) -{ - int nwritten = 0; - int ntotal = strlen(str); - - while (nwritten < ntotal) - { - int rc; - - rc = write(STDERR_FILENO, str + nwritten, ntotal - nwritten); - - /* Just give up on error. There isn't much else we can do. */ - if (rc == -1) - return; - - nwritten += rc; - } -} - - /* * Adjust the level of a recovery-related message per trace_recovery_messages. * diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h index 0971d5ce33..b25ea65dff 100644 --- a/src/include/utils/elog.h +++ b/src/include/utils/elog.h @@ -536,10 +536,4 @@ extern void write_jsonlog(ErrorData *edata); */ extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2); -/* - * Write a message to STDERR using only async-signal-safe functions. This can - * be used to safely emit a message from a signal handler. - */ -extern void write_stderr_signal_safe(const char *fmt); - #endif /* ELOG_H */ -- 2.25.1