On Sat, Feb 25, 2023 at 11:52:53AM -0800, Andres Freund wrote: > I think I opined on this before, but we really ought to have a function to do > some minimal signal safe output. Implemented centrally, instead of being open > coded in a bunch of places.
While looking around for the right place to put this, I noticed that there's a write_stderr() function in elog.c that we might be able to use. I used that in v9. WDYT? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
>From 7dcf8fe947ab7b6e0b37ddd42a08bbbc560d4a37 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nathandboss...@gmail.com> Date: Thu, 23 Feb 2023 14:27:48 -0800 Subject: [PATCH v9 1/2] Move extra code out of the Pre/PostRestoreCommand() block. If SIGTERM is received within this block, the startup process will immediately proc_exit() in the signal handler, so it is inadvisable to include any more code than is required in this section. This change moves the code recently added to this block (see 1b06d7b and 7fed801) to outside of the block. This ensures that only system() is called while proc_exit() might be called in the SIGTERM handler, which is how this code worked from v8.4 to v14. --- src/backend/access/transam/xlogarchive.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c index fcc87ff44f..41684418b6 100644 --- a/src/backend/access/transam/xlogarchive.c +++ b/src/backend/access/transam/xlogarchive.c @@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname, (errmsg_internal("executing restore command \"%s\"", xlogRestoreCmd))); + fflush(NULL); + pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND); + /* - * Check signals before restore command and reset afterwards. + * PreRestoreCommand() informs the SIGTERM handler for the startup process + * that it should proc_exit() right away. This is done for the duration of + * the system() call because there isn't a good way to break out while it + * is executing. Since we might call proc_exit() in a signal handler, it + * is best to put any additional logic before or after the + * PreRestoreCommand()/PostRestoreCommand() section. */ PreRestoreCommand(); /* * Copy xlog from archival storage to XLOGDIR */ - fflush(NULL); - pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND); rc = system(xlogRestoreCmd); - pgstat_report_wait_end(); PostRestoreCommand(); + + pgstat_report_wait_end(); pfree(xlogRestoreCmd); if (rc == 0) -- 2.25.1
>From e41a13a77b1c6cf156387cee3ca49fb2f4b1253b Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nathandboss...@gmail.com> Date: Tue, 14 Feb 2023 09:44:53 -0800 Subject: [PATCH v9 2/2] Don't proc_exit() in startup's SIGTERM handler if forked by system(). Instead, emit a message to STDERR and _exit() in this case. This change also adds assertions to proc_exit(), ProcKill(), and AuxiliaryProcKill() to verify that these functions are not called by a process forked by system(), etc. --- src/backend/postmaster/startup.c | 17 ++++++++++++++++- src/backend/storage/ipc/ipc.c | 3 +++ src/backend/storage/lmgr/proc.c | 2 ++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c index efc2580536..261f992357 100644 --- a/src/backend/postmaster/startup.c +++ b/src/backend/postmaster/startup.c @@ -19,6 +19,8 @@ */ #include "postgres.h" +#include <unistd.h> + #include "access/xlog.h" #include "access/xlogrecovery.h" #include "access/xlogutils.h" @@ -121,7 +123,20 @@ StartupProcShutdownHandler(SIGNAL_ARGS) int save_errno = errno; if (in_restore_command) - proc_exit(1); + { + /* + * If we are in a child process (e.g., forked by system() in + * RestoreArchivedFile()), we don't want to call any exit callbacks. + * The parent will take care of that. + */ + if (MyProcPid == (int) getpid()) + proc_exit(1); + else + { + write_stderr("StartupProcShutdownHandler() called in child process\n"); + _exit(1); + } + } else shutdown_requested = true; WakeupRecovery(); diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c index 1904d21795..d5097dc008 100644 --- a/src/backend/storage/ipc/ipc.c +++ b/src/backend/storage/ipc/ipc.c @@ -103,6 +103,9 @@ static int on_proc_exit_index, void proc_exit(int code) { + /* proc_exit() is not safe in forked processes from system(), etc. */ + Assert(MyProcPid == (int) getpid()); + /* Clean up everything that must be cleaned up */ proc_exit_prepare(code); diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index 22b4278610..b9e2c3aafe 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -805,6 +805,7 @@ ProcKill(int code, Datum arg) dlist_head *procgloballist; Assert(MyProc != NULL); + Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */ /* Make sure we're out of the sync rep lists */ SyncRepCleanupAtProcExit(); @@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg) PGPROC *proc; Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS); + Assert(MyProc->pid == (int) getpid()); /* not safe if forked by system(), etc. */ auxproc = &AuxiliaryProcs[proctype]; -- 2.25.1