On 2017-06-02 17:20:23 -0700, Andres Freund wrote:
> Attached is a *preliminary* patch series implementing this. I've first
> reverted the previous patch, as otherwise backpatchable versions of the
> necessary patches would get too complicated, due to the signals used and
> such.
I went through this again, and the only real thing I found was a leftover prototype in walsender.h. In the interim I've worked on backpatch versions of that series: annoying conflicts, but nothing really problematic. The only real difference is adding SetLatch() calls to HandleWalSndInitStopping() in branches before 9.6, and guarding SetLatch() with an if in branches before 9.5.

As an additional patch (based on one by Petr), even though it belongs more to http://archives.postgresql.org/message-id/20170421014030.fdzvvvbrz4nckrow%40alap3.anarazel.de, attached is a patch unifying SIGHUP handling between normal and walsender backends. This needs to be backpatched all the way. I've also attached a second patch, again based on Petr's, that unifies SIGHUP handling across all the remaining backends, but that's probably more appropriate for v11, although I'm still tempted to commit it earlier.

Michael, Peter, Fujii, are any of you planning to review this? I'm planning to commit this tomorrow morning PST, unless somebody protests before then.

- Andres
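To make the SIGHUP unification mentioned above concrete, here is a minimal sketch of the idea, not the patch itself: one handler and one pending flag that any main loop can service. It assumes a POSIX system. All `demo_*` names are made up for illustration; the actual patch uses `ConfigRereadPending` and `PostgresSigHupHandler`, and its handler also calls `SetLatch(MyLatch)`.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared "config reload pending" flag. */
static volatile sig_atomic_t demo_config_reread_pending = 0;

/* Counts how often the (pretend) configuration was re-read. */
static int	demo_reload_count = 0;

/* One handler for every process type: only set a flag, preserve errno. */
void
demo_sighup_handler(int signo)
{
	int			save_errno = errno;

	(void) signo;
	demo_config_reread_pending = 1;
	errno = save_errno;
}

/* Install the shared handler; every kind of backend would call this. */
void
demo_install_sighup_handler(void)
{
	void		(*prev) (int) = signal(SIGHUP, demo_sighup_handler);

	assert(prev != SIG_ERR);
	(void) prev;
}

/*
 * Called from any main loop at a convenient point.  Returns true if a
 * reload was pending and has now been serviced.
 */
bool
demo_service_reload(void)
{
	if (!demo_config_reread_pending)
		return false;

	demo_config_reread_pending = 0;
	demo_reload_count++;		/* real code would re-read the config here */
	return true;
}
```

Because there is only one flag, it no longer matters which loop (regular backend or walsender) happens to notice the pending reload first.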
>From 39f95c9e85811d6759a29b293adc97567d895d69 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Fri, 2 Jun 2017 14:14:34 -0700 Subject: [PATCH 1/6] Revert "Prevent panic during shutdown checkpoint" This reverts commit 086221cf6b1727c2baed4703c582f657b7c5350e, which was made to master only. The approach implemented in the above commit has some issues. While those could easily be fixed incrementally, doing so would make backpatching considerably harder, so instead first revert this patch. Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxp...@alap3.anarazel.de --- doc/src/sgml/monitoring.sgml | 5 - src/backend/access/transam/xlog.c | 6 -- src/backend/postmaster/postmaster.c | 7 +- src/backend/replication/walsender.c | 143 ++++------------------------ src/include/replication/walsender.h | 1 - src/include/replication/walsender_private.h | 3 +- 6 files changed, 24 insertions(+), 141 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 79ca45a156..5640c0d84a 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -1690,11 +1690,6 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i <literal>backup</>: This WAL sender is sending a backup. </para> </listitem> - <listitem> - <para> - <literal>stopping</>: This WAL sender is stopping. - </para> - </listitem> </itemizedlist> </entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 399822d3fe..35ee7d1cb6 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8324,12 +8324,6 @@ ShutdownXLOG(int code, Datum arg) ereport(IsPostmasterEnvironment ? LOG : NOTICE, (errmsg("shutting down"))); - /* - * Wait for WAL senders to be in stopping state. This prevents commands - * from writing new WAL. 
- */ - WalSndWaitStopping(); - if (RecoveryInProgress()) CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE); else diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 35b4ec88d3..5c79b1e40d 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -2918,7 +2918,7 @@ reaper(SIGNAL_ARGS) * Waken walsenders for the last time. No regular backends * should be around anymore. */ - SignalChildren(SIGINT); + SignalChildren(SIGUSR2); pmState = PM_SHUTDOWN_2; @@ -3656,9 +3656,7 @@ PostmasterStateMachine(void) /* * If we get here, we are proceeding with normal shutdown. All * the regular children are gone, and it's time to tell the - * checkpointer to do a shutdown checkpoint. All WAL senders - * are told to switch to a stopping state so that the shutdown - * checkpoint can go ahead. + * checkpointer to do a shutdown checkpoint. */ Assert(Shutdown > NoShutdown); /* Start the checkpointer if not running */ @@ -3667,7 +3665,6 @@ PostmasterStateMachine(void) /* And tell it to shut down */ if (CheckpointerPID != 0) { - SignalSomeChildren(SIGUSR2, BACKEND_TYPE_WALSND); signal_child(CheckpointerPID, SIGUSR2); pmState = PM_SHUTDOWN; } diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 49cce38880..aa705e5b35 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -24,14 +24,11 @@ * are treated as not a crash but approximately normal termination; * the walsender will exit quickly without sending any more XLOG records. * - * If the server is shut down, postmaster sends us SIGUSR2 after all regular - * backends have exited. This causes the walsender to switch to the "stopping" - * state. In this state, the walsender will reject any replication command - * that may generate WAL activity. The checkpointer begins the shutdown - * checkpoint once all walsenders are confirmed as stopping. 
When the shutdown - * checkpoint finishes, the postmaster sends us SIGINT. This instructs - * walsender to send any outstanding WAL, including the shutdown checkpoint - * record, wait for it to be replicated to the standby, and then exit. + * If the server is shut down, postmaster sends us SIGUSR2 after all + * regular backends have exited and the shutdown checkpoint has been written. + * This instructs walsender to send any outstanding WAL, including the + * shutdown checkpoint record, wait for it to be replicated to the standby, + * and then exit. * * * Portions Copyright (c) 2010-2017, PostgreSQL Global Development Group @@ -180,14 +177,13 @@ static bool WalSndCaughtUp = false; /* Flags set by signal handlers for later service in main loop */ static volatile sig_atomic_t got_SIGHUP = false; -static volatile sig_atomic_t got_SIGINT = false; -static volatile sig_atomic_t got_SIGUSR2 = false; +static volatile sig_atomic_t walsender_ready_to_stop = false; /* - * This is set while we are streaming. When not set, SIGINT signal will be + * This is set while we are streaming. When not set, SIGUSR2 signal will be * handled like SIGTERM. When set, the main loop is responsible for checking - * got_SIGINT and terminating when it's set (after streaming any remaining - * WAL). + * walsender_ready_to_stop and terminating when it's set (after streaming any + * remaining WAL). 
*/ static volatile sig_atomic_t replication_active = false; @@ -217,7 +213,6 @@ static struct /* Signal handlers */ static void WalSndSigHupHandler(SIGNAL_ARGS); static void WalSndXLogSendHandler(SIGNAL_ARGS); -static void WalSndSwitchStopping(SIGNAL_ARGS); static void WalSndLastCycleHandler(SIGNAL_ARGS); /* Prototypes for private functions */ @@ -306,14 +301,11 @@ WalSndErrorCleanup(void) ReplicationSlotCleanup(); replication_active = false; - if (got_SIGINT) + if (walsender_ready_to_stop) proc_exit(0); /* Revert back to startup state */ WalSndSetState(WALSNDSTATE_STARTUP); - - if (got_SIGUSR2) - WalSndSetState(WALSNDSTATE_STOPPING); } /* @@ -686,7 +678,7 @@ StartReplication(StartReplicationCmd *cmd) WalSndLoop(XLogSendPhysical); replication_active = false; - if (got_SIGINT) + if (walsender_ready_to_stop) proc_exit(0); WalSndSetState(WALSNDSTATE_STARTUP); @@ -1064,7 +1056,7 @@ StartLogicalReplication(StartReplicationCmd *cmd) { ereport(LOG, (errmsg("terminating walsender process after promotion"))); - got_SIGINT = true; + walsender_ready_to_stop = true; } WalSndSetState(WALSNDSTATE_CATCHUP); @@ -1115,7 +1107,7 @@ StartLogicalReplication(StartReplicationCmd *cmd) ReplicationSlotRelease(); replication_active = false; - if (got_SIGINT) + if (walsender_ready_to_stop) proc_exit(0); WalSndSetState(WALSNDSTATE_STARTUP); @@ -1327,14 +1319,6 @@ WalSndWaitForWal(XLogRecPtr loc) RecentFlushPtr = GetXLogReplayRecPtr(NULL); /* - * If postmaster asked us to switch to the stopping state, do so. - * Shutdown is in progress and this will allow the checkpointer to - * move on with the shutdown checkpoint. - */ - if (got_SIGUSR2) - WalSndSetState(WALSNDSTATE_STOPPING); - - /* * If postmaster asked us to stop, don't wait here anymore. This will * cause the xlogreader to return without reading a full record, which * is the fastest way to reach the mainloop which then can quit. 
@@ -1343,7 +1327,7 @@ WalSndWaitForWal(XLogRecPtr loc) * RecentFlushPtr, so we can send all remaining data before shutting * down. */ - if (got_SIGINT) + if (walsender_ready_to_stop) break; /* @@ -1418,22 +1402,6 @@ exec_replication_command(const char *cmd_string) MemoryContext old_context; /* - * If WAL sender has been told that shutdown is getting close, switch its - * status accordingly to handle the next replication commands correctly. - */ - if (got_SIGUSR2) - WalSndSetState(WALSNDSTATE_STOPPING); - - /* - * Throw error if in stopping mode. We need prevent commands that could - * generate WAL while the shutdown checkpoint is being written. To be - * safe, we just prohibit all new commands. - */ - if (MyWalSnd->state == WALSNDSTATE_STOPPING) - ereport(ERROR, - (errmsg("cannot execute new commands while WAL sender is in stopping mode"))); - - /* * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next * command arrives. Clean up the old stuff if there's anything. */ @@ -2155,20 +2123,13 @@ WalSndLoop(WalSndSendDataCallback send_data) } /* - * At the reception of SIGUSR2, switch the WAL sender to the - * stopping state. - */ - if (got_SIGUSR2) - WalSndSetState(WALSNDSTATE_STOPPING); - - /* - * When SIGINT arrives, we send any outstanding logs up to the + * When SIGUSR2 arrives, we send any outstanding logs up to the * shutdown checkpoint record (i.e., the latest record), wait for * them to be replicated to the standby, and exit. This may be a * normal termination at shutdown, or a promotion, the walsender * is not sure which. */ - if (got_SIGINT) + if (walsender_ready_to_stop) WalSndDone(send_data); } @@ -2907,23 +2868,7 @@ WalSndXLogSendHandler(SIGNAL_ARGS) errno = save_errno; } -/* SIGUSR2: set flag to switch to stopping state */ -static void -WalSndSwitchStopping(SIGNAL_ARGS) -{ - int save_errno = errno; - - got_SIGUSR2 = true; - SetLatch(MyLatch); - - errno = save_errno; -} - -/* - * SIGINT: set flag to do a last cycle and shut down afterwards. 
The WAL - * sender should already have been switched to WALSNDSTATE_STOPPING at - * this point. - */ +/* SIGUSR2: set flag to do a last cycle and shut down afterwards */ static void WalSndLastCycleHandler(SIGNAL_ARGS) { @@ -2938,7 +2883,7 @@ WalSndLastCycleHandler(SIGNAL_ARGS) if (!replication_active) kill(MyProcPid, SIGTERM); - got_SIGINT = true; + walsender_ready_to_stop = true; SetLatch(MyLatch); errno = save_errno; @@ -2951,14 +2896,14 @@ WalSndSignals(void) /* Set up signal handlers */ pqsignal(SIGHUP, WalSndSigHupHandler); /* set flag to read config * file */ - pqsignal(SIGINT, WalSndLastCycleHandler); /* request a last cycle and - * shutdown */ + pqsignal(SIGINT, SIG_IGN); /* not used */ pqsignal(SIGTERM, die); /* request shutdown */ pqsignal(SIGQUIT, quickdie); /* hard crash time */ InitializeTimeouts(); /* establishes SIGALRM handler */ pqsignal(SIGPIPE, SIG_IGN); pqsignal(SIGUSR1, WalSndXLogSendHandler); /* request WAL sending */ - pqsignal(SIGUSR2, WalSndSwitchStopping); /* switch to stopping state */ + pqsignal(SIGUSR2, WalSndLastCycleHandler); /* request a last cycle and + * shutdown */ /* Reset some signals that are accepted by postmaster but not here */ pqsignal(SIGCHLD, SIG_DFL); @@ -3036,50 +2981,6 @@ WalSndWakeup(void) } } -/* - * Wait that all the WAL senders have reached the stopping state. This is - * used by the checkpointer to control when shutdown checkpoints can - * safely begin. 
- */ -void -WalSndWaitStopping(void) -{ - for (;;) - { - int i; - bool all_stopped = true; - - for (i = 0; i < max_wal_senders; i++) - { - WalSndState state; - WalSnd *walsnd = &WalSndCtl->walsnds[i]; - - SpinLockAcquire(&walsnd->mutex); - - if (walsnd->pid == 0) - { - SpinLockRelease(&walsnd->mutex); - continue; - } - - state = walsnd->state; - SpinLockRelease(&walsnd->mutex); - - if (state != WALSNDSTATE_STOPPING) - { - all_stopped = false; - break; - } - } - - /* safe to leave if confirmation is done for all WAL senders */ - if (all_stopped) - return; - - pg_usleep(10000L); /* wait for 10 msec */ - } -} - /* Set state for current walsender (only called in walsender) */ void WalSndSetState(WalSndState state) @@ -3113,8 +3014,6 @@ WalSndGetStateString(WalSndState state) return "catchup"; case WALSNDSTATE_STREAMING: return "streaming"; - case WALSNDSTATE_STOPPING: - return "stopping"; } return "UNKNOWN"; } diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h index 99f12377e0..2ca903872e 100644 --- a/src/include/replication/walsender.h +++ b/src/include/replication/walsender.h @@ -44,7 +44,6 @@ extern void WalSndSignals(void); extern Size WalSndShmemSize(void); extern void WalSndShmemInit(void); extern void WalSndWakeup(void); -extern void WalSndWaitStopping(void); extern void WalSndRqstFileReload(void); /* diff --git a/src/include/replication/walsender_private.h b/src/include/replication/walsender_private.h index 36311e124c..2c59056cef 100644 --- a/src/include/replication/walsender_private.h +++ b/src/include/replication/walsender_private.h @@ -24,8 +24,7 @@ typedef enum WalSndState WALSNDSTATE_STARTUP = 0, WALSNDSTATE_BACKUP, WALSNDSTATE_CATCHUP, - WALSNDSTATE_STREAMING, - WALSNDSTATE_STOPPING + WALSNDSTATE_STREAMING } WalSndState; /* -- 2.12.0.264.gd6db3f2165.dirty
From 7e252e3213eaadafcdcd451eaa08c6da7f8ef804 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Fri, 2 Jun 2017 16:54:51 -0700
Subject: [PATCH 2/6] Have walsenders participate in procsignal infrastructure.

The non-participation in procsignal was a problem for both changes in
master, e.g. parallelism not working for normal statements run in
walsender backends, and older branches, e.g. recovery conflicts and
catchup interrupts not working for logical decoding walsenders.

This commit thus replaces the previous WalSndXLogSendHandler with
procsignal_sigusr1_handler. In branches since db0f6cad48 that can
lead to additional SetLatch calls, but that only rarely seems to make
a difference.

Author: Andres Freund
Discussion: https://postgr.es/m/20170421014030.fdzvvvbrz4nck...@alap3.anarazel.de
Backpatch: 9.4, earlier commits don't seem to benefit sufficiently
---
 src/backend/replication/walsender.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index aa705e5b35..27aa3e6bc7 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -212,7 +212,6 @@ static struct
 
 /* Signal handlers */
 static void WalSndSigHupHandler(SIGNAL_ARGS);
-static void WalSndXLogSendHandler(SIGNAL_ARGS);
 static void WalSndLastCycleHandler(SIGNAL_ARGS);
 
 /* Prototypes for private functions */
@@ -2857,17 +2856,6 @@ WalSndSigHupHandler(SIGNAL_ARGS)
 	errno = save_errno;
 }
 
-/* SIGUSR1: set flag to send WAL records */
-static void
-WalSndXLogSendHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	latch_sigusr1_handler();
-
-	errno = save_errno;
-}
-
 /* SIGUSR2: set flag to do a last cycle and shut down afterwards */
 static void
 WalSndLastCycleHandler(SIGNAL_ARGS)
@@ -2901,7 +2889,7 @@ WalSndSignals(void)
 	pqsignal(SIGQUIT, quickdie);	/* hard crash time */
 	InitializeTimeouts();		/* establishes SIGALRM handler */
 	pqsignal(SIGPIPE, SIG_IGN);
-	pqsignal(SIGUSR1, WalSndXLogSendHandler);	/* request WAL sending */
+	pqsignal(SIGUSR1, procsignal_sigusr1_handler);
 	pqsignal(SIGUSR2, WalSndLastCycleHandler);	/* request a last cycle and
 												 * shutdown */
 
-- 
2.12.0.264.gd6db3f2165.dirty
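As a rough illustration of what participating in procsignal buys, the sketch below multiplexes several "reasons" over a single SIGUSR1 handler. It is a single-process toy under stated assumptions: a POSIX system, pending flags kept in a plain static array (the real ones live in shared memory and the signal goes to another process), and invented `demo_*` / `DEMO_*` names throughout.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical reasons multiplexed over one signal. */
typedef enum
{
	DEMO_REASON_CATCHUP,
	DEMO_REASON_INIT_STOPPING,
	DEMO_NUM_REASONS
} DemoReason;

/* One pending flag per reason (illustrative; not shared memory). */
static volatile sig_atomic_t demo_pending[DEMO_NUM_REASONS];

/* Flags set by the handler for later service in a main loop. */
static volatile sig_atomic_t demo_got_catchup = 0;
static volatile sig_atomic_t demo_got_stopping = 0;

/* Test-and-clear one reason's pending flag. */
static bool
demo_check_reason(DemoReason reason)
{
	if (demo_pending[reason])
	{
		demo_pending[reason] = 0;
		return true;
	}
	return false;
}

/* Single SIGUSR1 handler dispatching on whichever reasons are pending. */
void
demo_sigusr1_handler(int signo)
{
	int			save_errno = errno;

	(void) signo;
	if (demo_check_reason(DEMO_REASON_CATCHUP))
		demo_got_catchup = 1;
	if (demo_check_reason(DEMO_REASON_INIT_STOPPING))
		demo_got_stopping = 1;
	errno = save_errno;
}

/* Mark a reason pending, then deliver SIGUSR1 (here: to ourselves). */
int
demo_send_reason(DemoReason reason)
{
	assert(reason < DEMO_NUM_REASONS);
	demo_pending[reason] = 1;
	return raise(SIGUSR1);
}
```

A process that installs its own private SIGUSR1 handler never looks at the per-reason flags, which is why such a process silently misses every reason sent this way.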
From 01647701515b688ef6bf488430f7785f6ab50414 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Fri, 2 Jun 2017 14:15:34 -0700
Subject: [PATCH 3/6] Prevent possibility of panics during shutdown checkpoint.

When the checkpointer writes the shutdown checkpoint, it checks
afterwards whether any WAL has been written since it started and
throws a PANIC if so. At that point, only walsenders are still
active, so one might think this could not happen, but walsenders can
also generate WAL, for instance in BASE_BACKUP and logical decoding
related commands (e.g. via hint bits). So they can trigger this panic
if such a command is run while the shutdown checkpoint is being
written.

To fix this, divide the walsender shutdown into two phases. First,
checkpointer, itself triggered by postmaster, sends a
PROCSIG_WALSND_INIT_STOPPING signal to all walsenders. If the backend
is idle or runs an SQL query this causes the backend to shutdown, if
logical replication is in progress all existing WAL records are
processed followed by a shutdown. Otherwise this causes the walsender
to switch to the "stopping" state. In this state, the walsender will
reject any further replication commands. The checkpointer begins the
shutdown checkpoint once all walsenders are confirmed as stopping.
When the shutdown checkpoint finishes, the postmaster sends us
SIGUSR2. This instructs walsender to send any outstanding WAL,
including the shutdown checkpoint record, wait for it to be replicated
to the standby, and then exit.
Author: Andres Freund, based on an earlier patch by Michael Paquier Reported-By: Fujii Masao, Andres Freund Discussion: https://postgr.es/m/20170602002912.tqlwn4gymzlxp...@alap3.anarazel.de Backpatch: 9.4, where logical decoding was introduced --- doc/src/sgml/monitoring.sgml | 5 + src/backend/access/transam/xlog.c | 11 ++ src/backend/replication/walsender.c | 188 ++++++++++++++++++++++++---- src/backend/storage/ipc/procsignal.c | 4 + src/include/replication/walsender.h | 3 + src/include/replication/walsender_private.h | 3 +- src/include/storage/procsignal.h | 1 + 7 files changed, 187 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 5640c0d84a..79ca45a156 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -1690,6 +1690,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i <literal>backup</>: This WAL sender is sending a backup. </para> </listitem> + <listitem> + <para> + <literal>stopping</>: This WAL sender is stopping. + </para> + </listitem> </itemizedlist> </entry> </row> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 35ee7d1cb6..70d2570dc2 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8324,6 +8324,17 @@ ShutdownXLOG(int code, Datum arg) ereport(IsPostmasterEnvironment ? LOG : NOTICE, (errmsg("shutting down"))); + /* + * Signal walsenders to move to stopping state. + */ + WalSndInitStopping(); + + /* + * Wait for WAL senders to be in stopping state. This prevents commands + * from writing new WAL. 
+ */ + WalSndWaitStopping(); + if (RecoveryInProgress()) CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE); else diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 27aa3e6bc7..ff5aff496d 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -24,11 +24,17 @@ * are treated as not a crash but approximately normal termination; * the walsender will exit quickly without sending any more XLOG records. * - * If the server is shut down, postmaster sends us SIGUSR2 after all - * regular backends have exited and the shutdown checkpoint has been written. - * This instructs walsender to send any outstanding WAL, including the - * shutdown checkpoint record, wait for it to be replicated to the standby, - * and then exit. + * If the server is shut down, checkpointer sends us + * PROCSIG_WALSND_INIT_STOPPING after all regular backends have exited. If + * the backend is idle or runs an SQL query this causes the backend to + * shutdown, if logical replication is in progress all existing WAL records + * are processed followed by a shutdown. Otherwise this causes the walsender + * to switch to the "stopping" state. In this state, the walsender will reject + * any further replication commands. The checkpointer begins the shutdown + * checkpoint once all walsenders are confirmed as stopping. When the shutdown + * checkpoint finishes, the postmaster sends us SIGUSR2. This instructs + * walsender to send any outstanding WAL, including the shutdown checkpoint + * record, wait for it to be replicated to the standby, and then exit. 
* * * Portions Copyright (c) 2010-2017, PostgreSQL Global Development Group @@ -177,13 +183,14 @@ static bool WalSndCaughtUp = false; /* Flags set by signal handlers for later service in main loop */ static volatile sig_atomic_t got_SIGHUP = false; -static volatile sig_atomic_t walsender_ready_to_stop = false; +static volatile sig_atomic_t got_SIGUSR2 = false; +static volatile sig_atomic_t got_STOPPING = false; /* - * This is set while we are streaming. When not set, SIGUSR2 signal will be - * handled like SIGTERM. When set, the main loop is responsible for checking - * walsender_ready_to_stop and terminating when it's set (after streaming any - * remaining WAL). + * This is set while we are streaming. When not set + * PROCSIG_WALSND_INIT_STOPPING signal will be handled like SIGTERM. When set, + * the main loop is responsible for checking got_STOPPING and terminating when + * it's set (after streaming any remaining WAL). */ static volatile sig_atomic_t replication_active = false; @@ -300,7 +307,8 @@ WalSndErrorCleanup(void) ReplicationSlotCleanup(); replication_active = false; - if (walsender_ready_to_stop) + + if (got_STOPPING || got_SIGUSR2) proc_exit(0); /* Revert back to startup state */ @@ -677,7 +685,7 @@ StartReplication(StartReplicationCmd *cmd) WalSndLoop(XLogSendPhysical); replication_active = false; - if (walsender_ready_to_stop) + if (got_STOPPING) proc_exit(0); WalSndSetState(WALSNDSTATE_STARTUP); @@ -1055,7 +1063,7 @@ StartLogicalReplication(StartReplicationCmd *cmd) { ereport(LOG, (errmsg("terminating walsender process after promotion"))); - walsender_ready_to_stop = true; + got_STOPPING = true; } WalSndSetState(WALSNDSTATE_CATCHUP); @@ -1106,7 +1114,7 @@ StartLogicalReplication(StartReplicationCmd *cmd) ReplicationSlotRelease(); replication_active = false; - if (walsender_ready_to_stop) + if (got_STOPPING) proc_exit(0); WalSndSetState(WALSNDSTATE_STARTUP); @@ -1311,6 +1319,14 @@ WalSndWaitForWal(XLogRecPtr loc) /* Check for input from the client */ 
ProcessRepliesIfAny(); + /* + * If we're shutting down, trigger pending WAL to be written out, + * otherwise we'd possibly end up waiting for WAL that never gets + * written, because walwriter has shut down already. + */ + if (got_STOPPING) + XLogBackgroundFlush(); + /* Update our idea of the currently flushed position. */ if (!RecoveryInProgress()) RecentFlushPtr = GetFlushRecPtr(); @@ -1326,7 +1342,7 @@ WalSndWaitForWal(XLogRecPtr loc) * RecentFlushPtr, so we can send all remaining data before shutting * down. */ - if (walsender_ready_to_stop) + if (got_STOPPING) break; /* @@ -1401,6 +1417,22 @@ exec_replication_command(const char *cmd_string) MemoryContext old_context; /* + * If WAL sender has been told that shutdown is getting close, switch its + * status accordingly to handle the next replication commands correctly. + */ + if (got_STOPPING) + WalSndSetState(WALSNDSTATE_STOPPING); + + /* + * Throw error if in stopping mode. We need prevent commands that could + * generate WAL while the shutdown checkpoint is being written. To be + * safe, we just prohibit all new commands. + */ + if (MyWalSnd->state == WALSNDSTATE_STOPPING) + ereport(ERROR, + (errmsg("cannot execute new commands while WAL sender is in stopping mode"))); + + /* * CREATE_REPLICATION_SLOT ... LOGICAL exports a snapshot until the next * command arrives. Clean up the old stuff if there's anything. */ @@ -2128,7 +2160,7 @@ WalSndLoop(WalSndSendDataCallback send_data) * normal termination at shutdown, or a promotion, the walsender * is not sure which. */ - if (walsender_ready_to_stop) + if (got_SIGUSR2) WalSndDone(send_data); } @@ -2443,6 +2475,10 @@ XLogSendPhysical(void) XLogRecPtr endptr; Size nbytes; + /* If requested switch the WAL sender to the stopping state. */ + if (got_STOPPING) + WalSndSetState(WALSNDSTATE_STOPPING); + if (streamingDoneSending) { WalSndCaughtUp = true; @@ -2733,7 +2769,16 @@ XLogSendLogical(void) * point, then we're caught up. 
*/ if (logical_decoding_ctx->reader->EndRecPtr >= GetFlushRecPtr()) + { WalSndCaughtUp = true; + + /* + * Have WalSndLoop() terminate the connection in an orderly + * manner, after writing out all the pending data. + */ + if (got_STOPPING) + got_SIGUSR2 = true; + } } /* Update shared memory status */ @@ -2843,6 +2888,26 @@ WalSndRqstFileReload(void) } } +/* + * Handle PROCSIG_WALSND_INIT_STOPPING signal. + */ +void +HandleWalSndInitStopping(void) +{ + Assert(am_walsender); + + /* + * If replication has not yet started, die like with SIGTERM. If + * replication is active, only set a flag and wake up the main loop. It + * will send any outstanding WAL, wait for it to be replicated to the + * standby, and then exit gracefully. + */ + if (!replication_active) + kill(MyProcPid, SIGTERM); + else + got_STOPPING = true; +} + /* SIGHUP: set flag to re-read config file at next convenient time */ static void WalSndSigHupHandler(SIGNAL_ARGS) @@ -2856,22 +2921,17 @@ WalSndSigHupHandler(SIGNAL_ARGS) errno = save_errno; } -/* SIGUSR2: set flag to do a last cycle and shut down afterwards */ +/* + * SIGUSR2: set flag to do a last cycle and shut down afterwards. The WAL + * sender should already have been switched to WALSNDSTATE_STOPPING at + * this point. + */ static void WalSndLastCycleHandler(SIGNAL_ARGS) { int save_errno = errno; - /* - * If replication has not yet started, die like with SIGTERM. If - * replication is active, only set a flag and wake up the main loop. It - * will send any outstanding WAL, wait for it to be replicated to the - * standby, and then exit gracefully. - */ - if (!replication_active) - kill(MyProcPid, SIGTERM); - - walsender_ready_to_stop = true; + got_SIGUSR2 = true; SetLatch(MyLatch); errno = save_errno; @@ -2969,6 +3029,78 @@ WalSndWakeup(void) } } +/* + * Signal all walsenders to move to stopping state. + * + * This will trigger walsenders to send the remaining WAL, prevent them from + * accepting further commands. 
After that they'll wait till the last WAL is + * written. + */ +void +WalSndInitStopping(void) +{ + int i; + + for (i = 0; i < max_wal_senders; i++) + { + WalSnd *walsnd = &WalSndCtl->walsnds[i]; + pid_t pid; + + SpinLockAcquire(&walsnd->mutex); + pid = walsnd->pid; + SpinLockRelease(&walsnd->mutex); + + if (pid == 0) + continue; + + SendProcSignal(pid, PROCSIG_WALSND_INIT_STOPPING, InvalidBackendId); + } +} + +/* + * Wait that all the WAL senders have reached the stopping state. This is + * used by the checkpointer to control when shutdown checkpoints can + * safely begin. + */ +void +WalSndWaitStopping(void) +{ + for (;;) + { + int i; + bool all_stopped = true; + + for (i = 0; i < max_wal_senders; i++) + { + WalSndState state; + WalSnd *walsnd = &WalSndCtl->walsnds[i]; + + SpinLockAcquire(&walsnd->mutex); + + if (walsnd->pid == 0) + { + SpinLockRelease(&walsnd->mutex); + continue; + } + + state = walsnd->state; + SpinLockRelease(&walsnd->mutex); + + if (state != WALSNDSTATE_STOPPING) + { + all_stopped = false; + break; + } + } + + /* safe to leave if confirmation is done for all WAL senders */ + if (all_stopped) + return; + + pg_usleep(10000L); /* wait for 10 msec */ + } +} + /* Set state for current walsender (only called in walsender) */ void WalSndSetState(WalSndState state) @@ -3002,6 +3134,8 @@ WalSndGetStateString(WalSndState state) return "catchup"; case WALSNDSTATE_STREAMING: return "streaming"; + case WALSNDSTATE_STOPPING: + return "stopping"; } return "UNKNOWN"; } diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c index 4a21d5512d..b9302ac630 100644 --- a/src/backend/storage/ipc/procsignal.c +++ b/src/backend/storage/ipc/procsignal.c @@ -20,6 +20,7 @@ #include "access/parallel.h" #include "commands/async.h" #include "miscadmin.h" +#include "replication/walsender.h" #include "storage/latch.h" #include "storage/ipc.h" #include "storage/proc.h" @@ -270,6 +271,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS) if 
(CheckProcSignal(PROCSIG_PARALLEL_MESSAGE)) HandleParallelMessageInterrupt(); + if (CheckProcSignal(PROCSIG_WALSND_INIT_STOPPING)) + HandleWalSndInitStopping(); + if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_DATABASE)) RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_DATABASE); diff --git a/src/include/replication/walsender.h b/src/include/replication/walsender.h index 2ca903872e..c50e450ec2 100644 --- a/src/include/replication/walsender.h +++ b/src/include/replication/walsender.h @@ -44,6 +44,9 @@ extern void WalSndSignals(void); extern Size WalSndShmemSize(void); extern void WalSndShmemInit(void); extern void WalSndWakeup(void); +extern void WalSndInitStopping(void); +extern void WalSndWaitStopping(void); +extern void HandleWalSndInitStopping(void); extern void WalSndRqstFileReload(void); /* diff --git a/src/include/replication/walsender_private.h b/src/include/replication/walsender_private.h index 2c59056cef..36311e124c 100644 --- a/src/include/replication/walsender_private.h +++ b/src/include/replication/walsender_private.h @@ -24,7 +24,8 @@ typedef enum WalSndState WALSNDSTATE_STARTUP = 0, WALSNDSTATE_BACKUP, WALSNDSTATE_CATCHUP, - WALSNDSTATE_STREAMING + WALSNDSTATE_STREAMING, + WALSNDSTATE_STOPPING } WalSndState; /* diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h index d068dde5d7..553f0f43f7 100644 --- a/src/include/storage/procsignal.h +++ b/src/include/storage/procsignal.h @@ -32,6 +32,7 @@ typedef enum PROCSIG_CATCHUP_INTERRUPT, /* sinval catchup interrupt */ PROCSIG_NOTIFY_INTERRUPT, /* listen/notify interrupt */ PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */ + PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */ /* Recovery conflict reasons */ PROCSIG_RECOVERY_CONFLICT_DATABASE, -- 2.12.0.264.gd6db3f2165.dirty
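The condition the checkpointer waits on in the first phase can be sketched in isolation as below. This is a simplified, lock-free model for illustration only: `DemoSenderSlot`, `DemoSenderState`, and `demo_all_senders_stopping` are invented names, and the spinlock that the patch takes around each slot read is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of walsender states. */
typedef enum
{
	DEMO_STATE_STARTUP,
	DEMO_STATE_STREAMING,
	DEMO_STATE_STOPPING
} DemoSenderState;

/* Hypothetical, lock-free stand-in for one walsender slot. */
typedef struct
{
	int			pid;			/* 0 means the slot is unused */
	DemoSenderState state;
} DemoSenderSlot;

/*
 * Phase-one completion check: true once every in-use slot reports the
 * stopping state.  Unused slots are skipped.
 */
bool
demo_all_senders_stopping(const DemoSenderSlot *slots, size_t nslots)
{
	size_t		i;

	assert(slots != NULL || nslots == 0);

	for (i = 0; i < nslots; i++)
	{
		if (slots[i].pid == 0)
			continue;
		if (slots[i].state != DEMO_STATE_STOPPING)
			return false;
	}
	return true;
}
```

A caller would poll this check with a short sleep between attempts, as WalSndWaitStopping() does with pg_usleep(10000L), and only start the shutdown checkpoint once it returns true.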
>From b109a13bc9ded7dede6e391040dd4966b65d6b9f Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Sun, 4 Jun 2017 16:14:52 -0700
Subject: [PATCH 4/6] Unify SIGHUP handling between normal and walsender backends.

Because walsender and normal backends share the same main loop it's
problematic to have two different flag variables, set in signal
handlers, indicating a pending configuration reload. Only certain
walsender commands reach code paths checking for the
variable (START_[LOGICAL_]REPLICATION, CREATE_REPLICATION_SLOT
... LOGICAL, notably not base backups).

This is a bug present since the introduction of walsender, but has
gotten worse in releases since then which allow walsender to do more.

A later patch, not slated for v10, will similarly unify SIGHUP
handling in other types of processes as well.

Author: Petr Jelinek, Andres Freund
Discussion: https://postgr.es/m/20170423235941.qosiuoyqprq4n...@alap3.anarazel.de
Backpatch: 9.2-, bug is present since 9.0
---
 src/backend/replication/walsender.c | 29 +++++++----------------------
 src/backend/tcop/postgres.c         | 30 ++++++++++++++----------------
 src/backend/utils/init/globals.c    |  1 +
 src/include/miscadmin.h             |  5 +++++
 4 files changed, 27 insertions(+), 38 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index ff5aff496d..a4f754a518 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -182,7 +182,6 @@ static bool streamingDoneReceiving;
 static bool WalSndCaughtUp = false;
 
 /* Flags set by signal handlers for later service in main loop */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t got_SIGUSR2 = false;
 static volatile sig_atomic_t got_STOPPING = false;
 
@@ -218,7 +217,6 @@ static struct
 }			LagTracker;
 
 /* Signal handlers */
-static void WalSndSigHupHandler(SIGNAL_ARGS);
 static void WalSndLastCycleHandler(SIGNAL_ARGS);
 
 /* Prototypes for private functions */
@@ -1201,9 +1199,9 @@ WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid,
 		CHECK_FOR_INTERRUPTS();
 
 		/* Process any requests or signals received recently */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 			SyncRepInitConfig();
 		}
@@ -1309,9 +1307,9 @@ WalSndWaitForWal(XLogRecPtr loc)
 		CHECK_FOR_INTERRUPTS();
 
 		/* Process any requests or signals received recently */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 			SyncRepInitConfig();
 		}
@@ -2101,9 +2099,9 @@ WalSndLoop(WalSndSendDataCallback send_data)
 		CHECK_FOR_INTERRUPTS();
 
 		/* Process any requests or signals received recently */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 			SyncRepInitConfig();
 		}
@@ -2908,19 +2906,6 @@ HandleWalSndInitStopping(void)
 		got_STOPPING = true;
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-WalSndSigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /*
  * SIGUSR2: set flag to do a last cycle and shut down afterwards. The WAL
  * sender should already have been switched to WALSNDSTATE_STOPPING at
@@ -2942,7 +2927,7 @@ void
 WalSndSignals(void)
 {
 	/* Set up signal handlers */
-	pqsignal(SIGHUP, WalSndSigHupHandler);	/* set flag to read config
+	pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config
 											 * file */
 	pqsignal(SIGINT, SIG_IGN);	/* not used */
 	pqsignal(SIGTERM, die);		/* request shutdown */
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 1357769150..70c9f8db59 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -123,13 +123,6 @@ char *register_stack_base_ptr = NULL;
 #endif
 
 /*
- * Flag to mark SIGHUP. Whenever the main loop comes around it
- * will reread the configuration file. (Better than doing the
- * reading in the signal handler, ey?)
- */
-static volatile sig_atomic_t got_SIGHUP = false;
-
-/*
  * Flag to keep track of whether we have started a transaction.
  * For extended query protocol this has to be remembered across messages.
  */
@@ -187,7 +180,6 @@ static bool IsTransactionExitStmt(Node *parsetree);
 static bool IsTransactionExitStmtList(List *pstmts);
 static bool IsTransactionStmtList(List *pstmts);
 static void drop_unnamed_stmt(void);
-static void SigHupHandler(SIGNAL_ARGS);
 static void log_disconnections(int code, Datum arg);
 
 
@@ -2684,13 +2676,19 @@ FloatExceptionHandler(SIGNAL_ARGS)
 					 "invalid operation, such as division by zero.")));
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-SigHupHandler(SIGNAL_ARGS)
+/*
+ * SIGHUP: set flag to re-read config file at next convenient time.
+ *
+ * Sets the ConfigRereadPending flag, which should be checked at convenient
+ * places inside main loops. (Better than doing the reading in the signal
+ * handler, ey?)
+ */
+void
+PostgresSigHupHandler(SIGNAL_ARGS)
 {
 	int			save_errno = errno;
 
-	got_SIGHUP = true;
+	ConfigRereadPending = true;
 	SetLatch(MyLatch);
 
 	errno = save_errno;
@@ -3632,8 +3630,8 @@ PostgresMain(int argc, char *argv[],
 		WalSndSignals();
 	else
 	{
-		pqsignal(SIGHUP, SigHupHandler);	/* set flag to read config
-											 * file */
+		pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config
+													 * file */
 		pqsignal(SIGINT, StatementCancelHandler);	/* cancel current query */
 		pqsignal(SIGTERM, die); /* cancel current query and exit */
 
@@ -4046,9 +4044,9 @@ PostgresMain(int argc, char *argv[],
 		 * (6) check for any other interesting events that happened while we
 		 * slept.
 		 */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 08b6030a64..f758a94b2f 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -31,6 +31,7 @@ volatile bool QueryCancelPending = false;
 volatile bool ProcDiePending = false;
 volatile bool ClientConnectionLost = false;
 volatile bool IdleInTransactionSessionTimeoutPending = false;
+volatile sig_atomic_t ConfigRereadPending = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 QueryCancelHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 4c607b299c..1cd24fd761 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -23,6 +23,8 @@
 #ifndef MISCADMIN_H
 #define MISCADMIN_H
 
+#include <signal.h>
+
 #include "pgtime.h"				/* for pg_time_t */
 
 
@@ -81,6 +83,7 @@ extern PGDLLIMPORT volatile bool InterruptPending;
 extern PGDLLIMPORT volatile bool QueryCancelPending;
 extern PGDLLIMPORT volatile bool ProcDiePending;
 extern PGDLLIMPORT volatile bool IdleInTransactionSessionTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t ConfigRereadPending;
 
 extern volatile bool ClientConnectionLost;
 
@@ -273,6 +276,8 @@ extern void restore_stack_base(pg_stack_base_t base);
 extern void check_stack_depth(void);
 extern bool stack_is_too_deep(void);
 
+extern void PostgresSigHupHandler(SIGNAL_ARGS);
+
 /* in tcop/utility.c */
 extern void PreventCommandIfReadOnly(const char *cmdname);
 extern void PreventCommandIfParallelMode(const char *cmdname);
-- 
2.12.0.264.gd6db3f2165.dirty

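To illustrate the idea behind the patch above outside of PostgreSQL: a minimal,
self-contained sketch of one shared handler setting one shared flag, which any
main loop can then service. All names here (reload_pending,
shared_sighup_handler, service_reload_request, demo_shared_flag) are
hypothetical stand-ins, not PostgreSQL symbols, and the counter merely stands
in for ProcessConfigFile(PGC_SIGHUP).

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/* Hypothetical stand-in for ConfigRereadPending: one flag for all loops. */
volatile sig_atomic_t reload_pending = 0;

/* Counts simulated reloads; stands in for ProcessConfigFile(PGC_SIGHUP). */
int reload_count = 0;

/* Shared handler: only record the request, and preserve errno. */
void
shared_sighup_handler(int signo)
{
	int			save_errno = errno;

	(void) signo;
	reload_pending = 1;
	errno = save_errno;
}

/* What each main loop does: returns 1 if a reload was serviced, else 0. */
int
service_reload_request(void)
{
	if (!reload_pending)
		return 0;
	reload_pending = 0;
	reload_count++;
	return 1;
}

/* Small driver: returns the number of reloads performed, or -1 on error. */
int
demo_shared_flag(void)
{
	int			before;
	int			after;
	int			again;

	if (signal(SIGHUP, shared_sighup_handler) == SIG_ERR)
		return -1;

	before = service_reload_request();	/* nothing pending yet */
	raise(SIGHUP);						/* handler runs before raise() returns */
	after = service_reload_request();	/* the request is serviced once */
	again = service_reload_request();	/* and not a second time */

	assert(before == 0);
	assert(after == 1);
	assert(again == 0);
	return reload_count;
}
```

Calling demo_shared_flag() from a main() exercises the sketch. With a single
flag it no longer matters which loop happens to be running when the signal
arrives, which is the property the commit message is after.
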
>From 92d5958f9fda344606a6c453123e70a17e3e671d Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Fri, 2 Jun 2017 16:07:08 -0700
Subject: [PATCH 5/6] Wire up query cancel interrupt for walsender backends.

This allows commands run over replication connections to be
cancelled. While it might have some use before v10, it has become
important now that normal SQL commands are allowed in
database-connected walsender connections.

Author: Petr Jelinek
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/7966f454-7cd7-2b0c-8b70-cdca9d5a8...@2ndquadrant.com
---
 src/backend/replication/walsender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a4f754a518..e132374d13 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2929,7 +2929,7 @@ WalSndSignals(void)
 	/* Set up signal handlers */
 	pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config
 												 * file */
-	pqsignal(SIGINT, SIG_IGN);	/* not used */
+	pqsignal(SIGINT, StatementCancelHandler);	/* query cancel */
 	pqsignal(SIGTERM, die);		/* request shutdown */
 	pqsignal(SIGQUIT, quickdie);	/* hard crash time */
 	InitializeTimeouts();		/* establishes SIGALRM handler */
-- 
2.12.0.264.gd6db3f2165.dirty

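As a rough illustration of what wiring SIGINT to a cancel handler buys,
compared with SIG_IGN: a minimal sketch in which a long-running command polls
a flag between units of work and stops early, while the process itself keeps
running. The names (cancel_pending, cancel_handler, run_command) and the
cancel_at parameter, which simulates the moment SIGINT arrives, are
hypothetical; PostgreSQL's real StatementCancelHandler and
CHECK_FOR_INTERRUPTS() do considerably more.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical stand-in for a "query cancel requested" flag. */
volatile sig_atomic_t cancel_pending = 0;

/* Handler: only record that a cancel was requested. */
void
cancel_handler(int signo)
{
	(void) signo;
	cancel_pending = 1;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on failure. */
int
install_cancel_handler(void)
{
	return signal(SIGINT, cancel_handler) == SIG_ERR ? -1 : 0;
}

/*
 * Run up to nsteps units of work and return how many completed.
 * cancel_at simulates SIGINT arriving just before that step; pass -1
 * for "never".
 */
int
run_command(int nsteps, int cancel_at)
{
	int			done = 0;

	assert(nsteps >= 0);

	while (done < nsteps)
	{
		if (done == cancel_at)
			raise(SIGINT);		/* simulated cancel request */

		/* Poll point, loosely analogous to CHECK_FOR_INTERRUPTS() */
		if (cancel_pending)
		{
			cancel_pending = 0; /* the command is cancelled, not the process */
			break;
		}
		done++;					/* one unit of work */
	}
	return done;
}
```

With the handler installed, run_command(10, 3) stops after three steps and a
later run_command(10, -1) runs to completion. Had SIGINT been left at SIG_IGN,
the first command would simply have run all ten steps.
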
>From dd18489d82b656ba3fca0e1be6cab7fd9f2ed429 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Sun, 4 Jun 2017 16:36:39 -0700
Subject: [PATCH 6/6] Use PostgresSigHupHandler everywhere SIGHUP is handled.

---
 src/backend/postmaster/autovacuum.c        | 30 ++++++++----------------------
 src/backend/postmaster/bgwriter.c          | 20 +++-----------------
 src/backend/postmaster/checkpointer.c      | 25 +++++--------------------
 src/backend/postmaster/pgarch.c            | 25 +++++--------------------
 src/backend/postmaster/pgstat.c            | 28 +++++++---------------------
 src/backend/postmaster/startup.c           |  7 +++----
 src/backend/postmaster/syslogger.c         | 20 +++-----------------
 src/backend/postmaster/walwriter.c         | 20 +++-----------------
 src/backend/replication/logical/launcher.c | 21 +++------------------
 src/backend/replication/logical/worker.c   | 23 +++--------------------
 src/backend/replication/walreceiver.c      |  7 +++----
 src/backend/utils/misc/guc.c               |  9 +++++----
 12 files changed, 51 insertions(+), 184 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 89dd3b321b..e11d353576 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -137,7 +137,6 @@ static bool am_autovacuum_launcher = false;
 static bool am_autovacuum_worker = false;
 
 /* Flags set by signal handlers */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t got_SIGUSR2 = false;
 static volatile sig_atomic_t got_SIGTERM = false;
 
@@ -351,7 +350,6 @@ static void perform_work_item(AutoVacuumWorkItem *workitem);
 static void autovac_report_activity(autovac_table *tab);
 static void autovac_report_workitem(AutoVacuumWorkItem *workitem,
 						const char *nspname, const char *relname);
-static void av_sighup_handler(SIGNAL_ARGS);
 static void avl_sigusr2_handler(SIGNAL_ARGS);
 static void avl_sigterm_handler(SIGNAL_ARGS);
 static void autovac_refresh_stats(void);
@@ -461,7 +459,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	 * backend, so we use the same signal handling. See equivalent code in
 	 * tcop/postgres.c.
 	 */
-	pqsignal(SIGHUP, av_sighup_handler);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 	pqsignal(SIGINT, StatementCancelHandler);
 	pqsignal(SIGTERM, avl_sigterm_handler);
 
@@ -675,9 +673,9 @@ AutoVacLauncherMain(int argc, char *argv[])
 		if (got_SIGTERM)
 			break;
 
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
 			/* shutdown requested in config file? */
@@ -1406,18 +1404,6 @@ AutoVacWorkerFailed(void)
 	AutoVacuumShmem->av_signal[AutoVacForkFailed] = true;
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-av_sighup_handler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGUSR2: a worker is up and running, or just finished, or failed to fork */
 static void
 avl_sigusr2_handler(SIGNAL_ARGS)
@@ -1540,7 +1526,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 	 * backend, so we use the same signal handling. See equivalent code in
 	 * tcop/postgres.c.
 	 */
-	pqsignal(SIGHUP, av_sighup_handler);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 
 	/*
 	 * SIGINT is used to signal canceling the current table's vacuum; SIGTERM
@@ -2333,9 +2319,9 @@ do_autovacuum(void)
 		/*
 		 * Check for config changes before processing each collected table.
 		 */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
 			/*
@@ -2573,9 +2559,9 @@ deleted:
 		 * jobs.
 		 */
 		CHECK_FOR_INTERRUPTS();
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 2674bb49ba..09a97b912b 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -89,13 +89,11 @@ static XLogRecPtr last_snapshot_lsn = InvalidXLogRecPtr;
 /*
  * Flags set by interrupt handlers for later service in the main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t shutdown_requested = false;
 
 /* Signal handlers */
 
 static void bg_quickdie(SIGNAL_ARGS);
-static void BgSigHupHandler(SIGNAL_ARGS);
 static void ReqShutdownHandler(SIGNAL_ARGS);
 static void bgwriter_sigusr1_handler(SIGNAL_ARGS);
 
@@ -120,7 +118,7 @@ BackgroundWriterMain(void)
 	 * bgwriter doesn't participate in ProcSignal signalling, but a SIGUSR1
 	 * handler is still needed for latch wakeups.
 	 */
-	pqsignal(SIGHUP, BgSigHupHandler);	/* set flag to read config file */
+	pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config file */
 	pqsignal(SIGINT, SIG_IGN);
 	pqsignal(SIGTERM, ReqShutdownHandler);	/* shutdown */
 	pqsignal(SIGQUIT, bg_quickdie); /* hard crash time */
@@ -259,9 +257,9 @@ BackgroundWriterMain(void)
 		/* Clear any already-pending wakeups */
 		ResetLatch(MyLatch);
 
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 		if (shutdown_requested)
@@ -432,18 +430,6 @@ bg_quickdie(SIGNAL_ARGS)
 	exit(2);
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-BgSigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGTERM: set flag to shutdown and exit */
 static void
 ReqShutdownHandler(SIGNAL_ARGS)
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index a55071900d..726c1c2a1d 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -149,7 +149,6 @@ double		CheckPointCompletionTarget = 0.5;
 /*
  * Flags set by interrupt handlers for later service in the main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t checkpoint_requested = false;
 static volatile sig_atomic_t shutdown_requested = false;
 
@@ -177,7 +176,6 @@ static void UpdateSharedMemoryConfig(void);
 /* Signal handlers */
 
 static void chkpt_quickdie(SIGNAL_ARGS);
-static void ChkptSigHupHandler(SIGNAL_ARGS);
 static void ReqCheckpointHandler(SIGNAL_ARGS);
 static void chkpt_sigusr1_handler(SIGNAL_ARGS);
 static void ReqShutdownHandler(SIGNAL_ARGS);
@@ -205,8 +203,7 @@ CheckpointerMain(void)
 	 * want to wait for the backends to exit, whereupon the postmaster will
 	 * tell us it's okay to shut down (via SIGUSR2).
 	 */
-	pqsignal(SIGHUP, ChkptSigHupHandler);	/* set flag to read config
-											 * file */
+	pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config file */
 	pqsignal(SIGINT, ReqCheckpointHandler); /* request checkpoint */
 	pqsignal(SIGTERM, SIG_IGN); /* ignore SIGTERM */
 	pqsignal(SIGQUIT, chkpt_quickdie);	/* hard crash time */
@@ -365,9 +362,9 @@ CheckpointerMain(void)
 		 */
 		AbsorbFsyncRequests();
 
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
 			/*
@@ -691,9 +688,9 @@ CheckpointWriteDelay(int flags, double progress)
 		!ImmediateCheckpointRequested() &&
 		IsCheckpointOnSchedule(progress))
 	{
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 			/* update shmem copies of config variables */
 			UpdateSharedMemoryConfig();
@@ -846,18 +843,6 @@ chkpt_quickdie(SIGNAL_ARGS)
 	exit(2);
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-ChkptSigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGINT: set flag to run a normal checkpoint right away */
 static void
 ReqCheckpointHandler(SIGNAL_ARGS)
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 2dce39fdef..9b407abe41 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -73,7 +73,6 @@ static time_t last_sigterm_time = 0;
 /*
  * Flags set by interrupt handlers for later service in the main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t got_SIGTERM = false;
 static volatile sig_atomic_t wakened = false;
 static volatile sig_atomic_t ready_to_stop = false;
@@ -88,7 +87,6 @@ static pid_t pgarch_forkexec(void);
 
 NON_EXEC_STATIC void PgArchiverMain(int argc, char *argv[]) pg_attribute_noreturn();
 static void pgarch_exit(SIGNAL_ARGS);
-static void ArchSigHupHandler(SIGNAL_ARGS);
 static void ArchSigTermHandler(SIGNAL_ARGS);
 static void pgarch_waken(SIGNAL_ARGS);
 static void pgarch_waken_stop(SIGNAL_ARGS);
@@ -219,7 +217,7 @@ PgArchiverMain(int argc, char *argv[])
 	 * Ignore all signals usually bound to some action in the postmaster,
 	 * except for SIGHUP, SIGTERM, SIGUSR1, SIGUSR2, and SIGQUIT.
 	 */
-	pqsignal(SIGHUP, ArchSigHupHandler);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 	pqsignal(SIGINT, SIG_IGN);
 	pqsignal(SIGTERM, ArchSigTermHandler);
 	pqsignal(SIGQUIT, pgarch_exit);
@@ -252,19 +250,6 @@ pgarch_exit(SIGNAL_ARGS)
 	exit(1);
 }
 
-/* SIGHUP signal handler for archiver process */
-static void
-ArchSigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	/* set flag to re-read config file at next convenient time */
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGTERM signal handler for archiver process */
 static void
 ArchSigTermHandler(SIGNAL_ARGS)
@@ -341,9 +326,9 @@ pgarch_MainLoop(void)
 		time_to_stop = ready_to_stop;
 
 		/* Check for config update */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 
@@ -444,9 +429,9 @@ pgarch_ArchiverCopyLoop(void)
 			 * setting for archive_command as soon as possible, even if there
 			 * is a backlog of files to be archived.
 			 */
-			if (got_SIGHUP)
+			if (ConfigRereadPending)
 			{
-				got_SIGHUP = false;
+				ConfigRereadPending = false;
 				ProcessConfigFile(PGC_SIGHUP);
 			}
 
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index f453dade6c..1176dca62e 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -266,7 +266,6 @@ static List *pending_write_requests = NIL;
 
 /* Signal handler flags */
 static volatile bool need_exit = false;
-static volatile bool got_SIGHUP = false;
 
 /*
  * Total time charged to functions so far in the current backend.
@@ -287,7 +286,6 @@ static pid_t pgstat_forkexec(void);
 NON_EXEC_STATIC void PgstatCollectorMain(int argc, char *argv[]) pg_attribute_noreturn();
 static void pgstat_exit(SIGNAL_ARGS);
 static void pgstat_beshutdown_hook(int code, Datum arg);
-static void pgstat_sighup_handler(SIGNAL_ARGS);
 
 static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
 static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
@@ -4186,7 +4184,7 @@ PgstatCollectorMain(int argc, char *argv[])
 	 * except SIGHUP and SIGQUIT. Note we don't need a SIGUSR1 handler to
 	 * support latch operations, because we only use a local latch.
 	 */
-	pqsignal(SIGHUP, pgstat_sighup_handler);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 	pqsignal(SIGINT, SIG_IGN);
 	pqsignal(SIGTERM, SIG_IGN);
 	pqsignal(SIGQUIT, pgstat_exit);
@@ -4221,10 +4219,10 @@ PgstatCollectorMain(int argc, char *argv[])
 	 * message. (This effectively means that if backends are sending us stuff
 	 * like mad, we won't notice postmaster death until things slack off a
 	 * bit; which seems fine.) To do that, we have an inner loop that
-	 * iterates as long as recv() succeeds. We do recognize got_SIGHUP inside
-	 * the inner loop, which means that such interrupts will get serviced but
-	 * the latch won't get cleared until next time there is a break in the
-	 * action.
+	 * iterates as long as recv() succeeds. We do recognize
+	 * ConfigRereadPending inside the inner loop, which means that such
+	 * interrupts will get serviced but the latch won't get cleared until next
+	 * time there is a break in the action.
 	 */
 	for (;;)
 	{
@@ -4246,9 +4244,9 @@ PgstatCollectorMain(int argc, char *argv[])
 			/*
 			 * Reload configuration if we got SIGHUP from the postmaster.
 			 */
-			if (got_SIGHUP)
+			if (ConfigRereadPending)
 			{
-				got_SIGHUP = false;
+				ConfigRereadPending = false;
 				ProcessConfigFile(PGC_SIGHUP);
 			}
 
@@ -4439,18 +4437,6 @@ pgstat_exit(SIGNAL_ARGS)
 	errno = save_errno;
 }
 
-/* SIGHUP handler for collector process */
-static void
-pgstat_sighup_handler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /*
  * Subroutine to clear stats in a database entry
  *
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index b623252475..f57f7970fe 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -38,7 +38,6 @@
 /*
  * Flags set by interrupt handlers for later service in the redo loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t shutdown_requested = false;
 static volatile sig_atomic_t promote_triggered = false;
 
@@ -122,7 +121,7 @@ StartupProcSigHupHandler(SIGNAL_ARGS)
 {
 	int			save_errno = errno;
 
-	got_SIGHUP = true;
+	ConfigRereadPending = true;
 	WakeupRecovery();
 
 	errno = save_errno;
@@ -150,9 +149,9 @@ HandleStartupProcInterrupts(void)
 	/*
 	 * Check if we were requested to re-read config file.
 	 */
-	if (got_SIGHUP)
+	if (ConfigRereadPending)
 	{
-		got_SIGHUP = false;
+		ConfigRereadPending = false;
 		ProcessConfigFile(PGC_SIGHUP);
 	}
 
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 9f5ca5cac0..8ee318faee 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -122,7 +122,6 @@ static CRITICAL_SECTION sysloggerSection;
 /*
  * Flags set by interrupt handlers for later service in the main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t rotation_requested = false;
 
 
@@ -144,7 +143,6 @@ static unsigned int __stdcall pipeThread(void *arg);
 static void logfile_rotate(bool time_based_rotation, int size_rotation_for);
 static char *logfile_getname(pg_time_t timestamp, const char *suffix);
 static void set_next_rotation_time(void);
-static void sigHupHandler(SIGNAL_ARGS);
 static void sigUsr1Handler(SIGNAL_ARGS);
 static void update_metainfo_datafile(void);
 
@@ -240,7 +238,7 @@ SysLoggerMain(int argc, char *argv[])
 	 * broken backends...
 	 */
 
-	pqsignal(SIGHUP, sigHupHandler);	/* set flag to read config file */
+	pqsignal(SIGHUP, PostgresSigHupHandler);	/* set flag to read config file */
 	pqsignal(SIGINT, SIG_IGN);
 	pqsignal(SIGTERM, SIG_IGN);
 	pqsignal(SIGQUIT, SIG_IGN);
@@ -303,9 +301,9 @@ SysLoggerMain(int argc, char *argv[])
 		/*
 		 * Process any requests or signals received recently.
 		 */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 
 			/*
@@ -1421,18 +1419,6 @@ update_metainfo_datafile(void)
  * --------------------------------
  */
 
-/* SIGHUP: set flag to reload config file */
-static void
-sigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGUSR1: set flag to rotate logfile */
 static void
 sigUsr1Handler(SIGNAL_ARGS)
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index a575d8f953..29e00f2890 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -79,12 +79,10 @@ int			WalWriterFlushAfter = 128;
 /*
  * Flags set by interrupt handlers for later service in the main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t shutdown_requested = false;
 
 /* Signal handlers */
 static void wal_quickdie(SIGNAL_ARGS);
-static void WalSigHupHandler(SIGNAL_ARGS);
 static void WalShutdownHandler(SIGNAL_ARGS);
 static void walwriter_sigusr1_handler(SIGNAL_ARGS);
 
@@ -108,7 +106,7 @@ WalWriterMain(void)
 	 * We have no particular use for SIGINT at the moment, but seems
 	 * reasonable to treat like SIGTERM.
 	 */
-	pqsignal(SIGHUP, WalSigHupHandler); /* set flag to read config file */
+	pqsignal(SIGHUP, PostgresSigHupHandler); /* set flag to read config file */
 	pqsignal(SIGINT, WalShutdownHandler);	/* request shutdown */
 	pqsignal(SIGTERM, WalShutdownHandler);	/* request shutdown */
 	pqsignal(SIGQUIT, wal_quickdie);	/* hard crash time */
@@ -260,9 +258,9 @@ WalWriterMain(void)
 		/*
 		 * Process any requests or signals received recently.
 		 */
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 		if (shutdown_requested)
@@ -342,18 +340,6 @@ wal_quickdie(SIGNAL_ARGS)
 	exit(2);
 }
 
-/* SIGHUP: set flag to re-read config file at next convenient time */
-static void
-WalSigHupHandler(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* SIGTERM: set flag to exit normally */
 static void
 WalShutdownHandler(SIGNAL_ARGS)
diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 345a415212..d92ee3d3a6 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -80,7 +80,6 @@ static void logicalrep_worker_detach(void);
 static void logicalrep_worker_cleanup(LogicalRepWorker *worker);
 
 /* Flags set by signal handlers */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t got_SIGTERM = false;
 
 static bool on_commit_launcher_wakeup = false;
@@ -637,20 +636,6 @@ logicalrep_launcher_sigterm(SIGNAL_ARGS)
 	errno = save_errno;
 }
 
-/* SIGHUP: set flag to reload configuration at next convenient time */
-static void
-logicalrep_launcher_sighup(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-
-	/* Waken anything waiting on the process latch */
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /*
  * Count the number of registered (not necessarily running) sync workers
  * for a subscription.
@@ -799,7 +784,7 @@ ApplyLauncherMain(Datum main_arg)
 	before_shmem_exit(logicalrep_launcher_onexit, (Datum) 0);
 
 	/* Establish signal handlers. */
-	pqsignal(SIGHUP, logicalrep_launcher_sighup);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 	pqsignal(SIGTERM, logicalrep_launcher_sigterm);
 	BackgroundWorkerUnblockSignals();
 
@@ -889,9 +874,9 @@ ApplyLauncherMain(Datum main_arg)
 		if (rc & WL_POSTMASTER_DEATH)
 			proc_exit(1);
 
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
+			ConfigRereadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
 		}
 
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a570900a42..16d3a6f5df 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -120,9 +120,6 @@ static void store_flush_position(XLogRecPtr remote_lsn);
 
 static void maybe_reread_subscription(void);
 
-/* Flags set by signal handlers */
-static volatile sig_atomic_t got_SIGHUP = false;
-
 /*
  * Should this worker apply changes for given relation.
  *
@@ -1156,10 +1153,10 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
 		if (rc & WL_POSTMASTER_DEATH)
 			proc_exit(1);
 
-		if (got_SIGHUP)
+		if (ConfigRereadPending)
 		{
-			got_SIGHUP = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			ConfigRereadPending = false;
 		}
 
 		if (rc & WL_TIMEOUT)
@@ -1451,20 +1448,6 @@ subscription_change_cb(Datum arg, int cacheid, uint32 hashvalue)
 	MySubscriptionValid = false;
 }
 
-/* SIGHUP: set flag to reload configuration at next convenient time */
-static void
-logicalrep_worker_sighup(SIGNAL_ARGS)
-{
-	int			save_errno = errno;
-
-	got_SIGHUP = true;
-
-	/* Waken anything waiting on the process latch */
-	SetLatch(MyLatch);
-
-	errno = save_errno;
-}
-
 /* Logical Replication Apply worker entry point */
 void
 ApplyWorkerMain(Datum main_arg)
@@ -1480,7 +1463,7 @@ ApplyWorkerMain(Datum main_arg)
 	logicalrep_worker_attach(worker_slot);
 
 	/* Setup signal handling */
-	pqsignal(SIGHUP, logicalrep_worker_sighup);
+	pqsignal(SIGHUP, PostgresSigHupHandler);
 	pqsignal(SIGTERM, die);
 	BackgroundWorkerUnblockSignals();
 
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 2723612718..3c7bb49d5c 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -95,7 +95,6 @@ static uint32 recvOff = 0;
  * Flags set by interrupt handlers of walreceiver for later service in the
  * main loop.
  */
-static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t got_SIGTERM = false;
 
 /*
@@ -424,9 +423,9 @@ WalReceiverMain(void)
 				/* Process any requests or signals received recently */
 				ProcessWalRcvInterrupts();
 
-				if (got_SIGHUP)
+				if (ConfigRereadPending)
 				{
-					got_SIGHUP = false;
+					ConfigRereadPending = false;
 					ProcessConfigFile(PGC_SIGHUP);
 					XLogWalRcvSendHSFeedback(true);
 				}
@@ -799,7 +798,7 @@ WalRcvDie(int code, Datum arg)
 static void
 WalRcvSigHupHandler(SIGNAL_ARGS)
 {
-	got_SIGHUP = true;
+	ConfigRereadPending = true;
 }
 
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 92e1d63b2f..dd20ceab30 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -8965,10 +8965,11 @@ read_nondefault_variables(void)
  * value before processing serialized values.
  *
  * A PGC_S_DEFAULT setting on the serialize side will typically match new
- * postmaster children, but that can be false when got_SIGHUP == true and the
- * pending configuration change modifies this setting. Nonetheless, we omit
- * PGC_S_DEFAULT settings from serialization and make up for that by restoring
- * defaults before applying serialized values.
+ * postmaster children, but that can be false when
+ * ConfigRereadPending == true and the pending configuration change
+ * modifies this setting. Nonetheless, we omit PGC_S_DEFAULT settings from
+ * serialization and make up for that by restoring defaults before applying
+ * serialized values.
  *
  * PGC_POSTMASTER variables always have the same value in every child of a
  * particular postmaster. Most PGC_INTERNAL variables are compile-time
-- 
2.12.0.264.gd6db3f2165.dirty