The first patch on my "Refactoring backend fork+exec code" thread [0]
changes the allocations of BackgroundWorkerList from plain malloc() to
MemoryContextAlloc(PostmasterContext). However, that actually caused a
segfault in worker_spi tests in EXEC_BACKEND mode.
BackgroundWorkerList is a postmaster-private data structure and should
not be accessed in backends. That assumption failed in
RegisterBackgroundWorker(). When you put worker_spi in
shared_preload_libraries, its _PG_init() function calls
RegisterBackgroundWorker(), as expected. But in EXEC_BACKEND mode, the
library is loaded *again* in each backend process, and each of those
loads also call RegisterBackgroundWorker(). It's too late to correctly
register any static background workers at that stage, but
RegisterBackgroundWorker() still goes through the motions and adds the
element to BackgroundWorkerList. If you change the malloc() to
MemoryContextAlloc(PostmasterContext), it segfaults because
PostmasterContext == NULL in a backend process.
In summary, RegisterBackgroundWorker() is doing some questionable and
useless work, when a shared preload library is loaded to a backend
process in EXEC_BACKEND mode.
Attached patches:
1. Tighten/clarify those checks. See also commit message for details.
2. The patch from the other thread to switch to
MemoryContextAlloc(PostmasterContext)
3. A fix for a highly misleading comment in the same file.
Any comments?
[0]
https://www.postgresql.org/message-id/flat/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef%40iki.fi
P.S. In addition to those, I also considered these changes but didn't
implement them yet:
- Change RegisterBackgroundWorker() to return true/false to indicate
whether the registration succeeded. Currently, the caller has no way of
knowing. In many cases, I think even an ERROR and refusing to start up
the server would be appropriate. But at least we should let the caller
know and decide.
- Add "Assert(am_postmaster)" assertions to the functions in bgworker.c
that are only supposed to be called in postmaster process. The functions
have good explicit comments on that, but wouldn't hurt to also assert.
(There is no 'am_postmaster' flag, the equivalent is actually
(IsPostmasterEnvironment && !IsUnderPostmaster), but perhaps we should
define a macro or flag for that)
--
Heikki Linnakangas
Neon (https://neon.tech)
From b6475511099646f1fab0ba3cfcf5651e52de634b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Thu, 24 Aug 2023 18:08:39 +0300
Subject: [PATCH 1/3] Clarify the checks in RegisterBackgroundWorker.
In EXEC_BACKEND or single-user mode, we process
shared_preload_libraries at the postmaster startup as usual, but also
at backend startup. When a library calls RegisterBackgroundWorker()
when being loaded into a backend process, we go through the motions to
add the worker to BackgroundWorkerList, even though that is a
postmaster-private data structure. Make it return early when called in
a backend process, without changing BackgroundWorkerList.
You could argue that it was intentional: In non-EXEC_BACKEND mode, the
backend processes inherit BackgroundWorkerList at fork(), so it does
make some sense to initialize it to the same state in EXEC_BACKEND
mode, too. It's clearly a postmaster-private structure, though, and
all the functions that use it are clearly marked as "should only be
called in postmaster".
You could also argue that libraries should not call
RegisterBackgroundWorker() during backend startup. It's too late to
correctly register any static background workers at that stage. But
it's a common pattern in extensions, and it doesn't seem worth the
churn to require all extensions to change it.
Another sloppiness was the exception for "internal" background
workers. We checked that RegisterBackgroundWorker() was called during
shared_preload_libraries processing, or the background worker was an
internal one. That exception was made in commit 665d1fad99 to allow
postmaster to register the logical apply launcher in
ApplyLauncherRegister(). The way the check was written, it would not
complain if you registered an internal background worker in a regular
backend process. But it would complain if postmaster registered a
background worker defined in a shared library, outside
shared_preload_libraries processing. I think the correct rule is that
you can only register static background workers in the postmaster
process, and only before the bgworker shared memory array has been
initialized. Check for that more directly.
---
src/backend/postmaster/bgworker.c | 42 +++++++++++++++++++++++--------
1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 505e38376c3..41220a05189 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -880,21 +880,43 @@ RegisterBackgroundWorker(BackgroundWorker *worker)
RegisteredBgWorker *rw;
static int numworkers = 0;
- if (!IsUnderPostmaster)
- ereport(DEBUG1,
- (errmsg_internal("registering background worker \"%s\"", worker->bgw_name)));
-
- if (!process_shared_preload_libraries_in_progress &&
- strcmp(worker->bgw_library_name, "postgres") != 0)
+ /*
+ * Static background workers can only be registered in the postmaster process.
+ */
+ if (IsUnderPostmaster || !IsPostmasterEnvironment)
{
- if (!IsUnderPostmaster)
- ereport(LOG,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("background worker \"%s\": must be registered in shared_preload_libraries",
+ /*
+ * In EXEC_BACKEND or single-user mode, we process
+ * shared_preload_libraries in backend processes too. We cannot
+ * register static background workers at that stage, but many
+ * libraries' _PG_init() functions don't distinguish whether they're
+ * being loaded in the postmaster or in a backend, they just check
+ * process_shared_preload_libraries_in_progress. It's a bit sloppy,
+ * but for historical reasons we tolerate it. In EXEC_BACKEND mode,
+ * the background workers should already have been registered when the
+ * library was loaded in postmaster.
+ */
+ if (process_shared_preload_libraries_in_progress)
+ return;
+ ereport(LOG,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("background worker \"%s\": must be registered in shared_preload_libraries",
worker->bgw_name)));
return;
}
+ /*
+ * Cannot register static background workers after must be registered
+ * before calling BackgroundWorkerShmemInit(). This is a programmer
+ * mistake, so hard ERROR.
+ */
+ if (BackgroundWorkerData != NULL)
+ elog(ERROR, "cannot register background worker \"%s\" after shmem init",
+ worker->bgw_name);
+
+ ereport(DEBUG1,
+ (errmsg_internal("registering background worker \"%s\"", worker->bgw_name)));
+
if (!SanityCheckBackgroundWorker(worker, LOG))
return;
--
2.39.2
From 2aca6bb94a488c90660a3097f820453fbeaf73b7 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Thu, 24 Aug 2023 18:08:44 +0300
Subject: [PATCH 2/3] Allocate Backend structs in PostmasterContext.
The child processes don't need them. By allocating them in
PostmasterContext, the memory gets free'd and is made available for
other stuff in the child processes.
---
src/backend/postmaster/bgworker.c | 11 ++++++++---
src/backend/postmaster/postmaster.c | 24 +++++++++++-------------
2 files changed, 19 insertions(+), 16 deletions(-)
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 41220a05189..f8519d71c5c 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -33,6 +33,7 @@
#include "storage/shmem.h"
#include "tcop/tcopprot.h"
#include "utils/ascii.h"
+#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
@@ -347,7 +348,9 @@ BackgroundWorkerStateChange(bool allow_new_workers)
/*
* Copy the registration data into the registered workers list.
*/
- rw = malloc(sizeof(RegisteredBgWorker));
+ rw = MemoryContextAllocExtended(PostmasterContext,
+ sizeof(RegisteredBgWorker),
+ MCXT_ALLOC_NO_OOM);
if (rw == NULL)
{
ereport(LOG,
@@ -455,7 +458,7 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
rw->rw_worker.bgw_name)));
slist_delete_current(cur);
- free(rw);
+ pfree(rw);
}
/*
@@ -951,7 +954,9 @@ RegisterBackgroundWorker(BackgroundWorker *worker)
/*
* Copy the registration data into the registered workers list.
*/
- rw = malloc(sizeof(RegisteredBgWorker));
+ rw = MemoryContextAllocExtended(PostmasterContext,
+ sizeof(RegisteredBgWorker),
+ MCXT_ALLOC_NO_OOM);
if (rw == NULL)
{
ereport(LOG,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 41bccb46a87..d456f64c5c2 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -3331,7 +3331,7 @@ CleanupBackgroundWorker(int pid,
*/
if (rw->rw_backend->bgworker_notify)
BackgroundWorkerStopNotifications(rw->rw_pid);
- free(rw->rw_backend);
+ pfree(rw->rw_backend);
rw->rw_backend = NULL;
rw->rw_pid = 0;
rw->rw_child_slot = 0;
@@ -3424,7 +3424,7 @@ CleanupBackend(int pid,
BackgroundWorkerStopNotifications(bp->pid);
}
dlist_delete(iter.cur);
- free(bp);
+ pfree(bp);
break;
}
}
@@ -3480,7 +3480,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
#ifdef EXEC_BACKEND
ShmemBackendArrayRemove(rw->rw_backend);
#endif
- free(rw->rw_backend);
+ pfree(rw->rw_backend);
rw->rw_backend = NULL;
rw->rw_pid = 0;
rw->rw_child_slot = 0;
@@ -3517,7 +3517,7 @@ HandleChildCrash(int pid, int exitstatus, const char *procname)
#endif
}
dlist_delete(iter.cur);
- free(bp);
+ pfree(bp);
/* Keep looping so we can signal remaining backends */
}
else
@@ -4093,7 +4093,7 @@ BackendStartup(Port *port)
* Create backend data structure. Better before the fork() so we can
* handle failure cleanly.
*/
- bn = (Backend *) malloc(sizeof(Backend));
+ bn = (Backend *) palloc_extended(sizeof(Backend), MCXT_ALLOC_NO_OOM);
if (!bn)
{
ereport(LOG,
@@ -4109,7 +4109,7 @@ BackendStartup(Port *port)
*/
if (!RandomCancelKey(&MyCancelKey))
{
- free(bn);
+ pfree(bn);
ereport(LOG,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("could not generate random cancel key")));
@@ -4139,8 +4139,6 @@ BackendStartup(Port *port)
pid = fork_process();
if (pid == 0) /* child */
{
- free(bn);
-
/* Detangle from postmaster */
InitPostmasterChild();
@@ -4171,7 +4169,7 @@ BackendStartup(Port *port)
if (!bn->dead_end)
(void) ReleasePostmasterChildSlot(bn->child_slot);
- free(bn);
+ pfree(bn);
errno = save_errno;
ereport(LOG,
(errmsg("could not fork new process for connection: %m")));
@@ -5434,7 +5432,7 @@ StartAutovacuumWorker(void)
return;
}
- bn = (Backend *) malloc(sizeof(Backend));
+ bn = (Backend *) palloc_extended(sizeof(Backend), MCXT_ALLOC_NO_OOM);
if (bn)
{
bn->cancel_key = MyCancelKey;
@@ -5461,7 +5459,7 @@ StartAutovacuumWorker(void)
* logged by StartAutoVacWorker
*/
(void) ReleasePostmasterChildSlot(bn->child_slot);
- free(bn);
+ pfree(bn);
}
else
ereport(LOG,
@@ -5706,7 +5704,7 @@ do_start_bgworker(RegisteredBgWorker *rw)
/* undo what assign_backendlist_entry did */
ReleasePostmasterChildSlot(rw->rw_child_slot);
rw->rw_child_slot = 0;
- free(rw->rw_backend);
+ pfree(rw->rw_backend);
rw->rw_backend = NULL;
/* mark entry as crashed, so we'll try again later */
rw->rw_crashed_at = GetCurrentTimestamp();
@@ -5832,7 +5830,7 @@ assign_backendlist_entry(RegisteredBgWorker *rw)
return false;
}
- bn = malloc(sizeof(Backend));
+ bn = palloc_extended(sizeof(Backend), MCXT_ALLOC_NO_OOM);
if (bn == NULL)
{
ereport(LOG,
--
2.39.2
From c7cb84880f5e7fb0af968fa65808ad34efb320b4 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Thu, 24 Aug 2023 18:11:32 +0300
Subject: [PATCH 3/3] Fix misleading comment on StartBackgroundWorker().
StartBackgroundWorker() is in fact called in the child process, pretty
early in the child process initialization. I guess you could interpret
"called from postmaster" to mean that, but it seems wrong to me.
---
src/backend/postmaster/bgworker.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index f8519d71c5c..c725bdc9a52 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -743,8 +743,7 @@ bgworker_die(SIGNAL_ARGS)
/*
* Start a new background worker
*
- * This is the main entry point for background worker, to be called from
- * postmaster.
+ * This is the main entry point for a background worker process
*/
void
StartBackgroundWorker(void)
--
2.39.2