Hi, On 2023-06-18 14:22:33 +0300, Heikki Linnakangas wrote: > I started to look at the code in postmaster.c related to launching child > processes. I tried to reduce the difference between EXEC_BACKEND and > !EXEC_BACKEND code paths, and put the code that needs to differ behind a > better abstraction. I started doing this to help with implementing > multi-threading, but it doesn't introduce anything thread-related yet and I > think this improves readability anyway.
Yes please! This code is absolutely awful. > From 0cb6f8d665980d30a5d2a29013000744f16bf813 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Sun, 18 Jun 2023 11:00:21 +0300 > Subject: [PATCH 3/9] Refactor CreateSharedMemoryAndSemaphores. > > Moves InitProcess calls a little later in EXEC_BACKEND case. What's the reason for this part? ISTM that we'd really want to get away from plastering duplicated InitProcess() etc everywhere. I think this might be easier to understand if you just changed did the CreateSharedMemoryAndSemaphores() -> AttachSharedMemoryAndSemaphores() piece in this commit, and the rest later. > +void > +AttachSharedMemoryAndSemaphores(void) > +{ > + /* InitProcess must've been called already */ Perhaps worth an assertion to make it easier to see that the order is wrong? > From 1d89eec53c7fefa7a4a8c011c9f19e3df64dc436 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Mon, 12 Jun 2023 16:33:20 +0300 > Subject: [PATCH 4/9] Use FD_CLOEXEC on ListenSockets > > We went through some effort to close them in the child process. Better to > not hand them down to the child process in the first place. I think Thomas has a larger version of this patch: https://postgr.es/m/CA%2BhUKGKPNFcfBQduqof4-7C%3DavjcSfdkKBGvQoRuAvfocnvY0A%40mail.gmail.com > From 65384b9a6cfb3b9b589041526216e0f64d64bea5 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Sun, 18 Jun 2023 13:56:44 +0300 > Subject: [PATCH 8/9] Introduce ClientSocket, rename some funcs > > - Move more of the work on a client socket to the child process. > > - Reduce the amount of data that needs to be passed from postmaster to > child. (Used to pass a full Port struct, although most of the fields were > empty. Now we pass the much slimmer ClientSocket.) I think there might be extensions accessing Port. Not sure if it's worth worrying about, but ... > --- a/src/backend/postmaster/autovacuum.c > +++ b/src/backend/postmaster/autovacuum.c > @@ -476,8 +476,8 @@ AutoVacLauncherMain(int argc, char *argv[]) > pqsignal(SIGCHLD, SIG_DFL); > > /* > - * Create a per-backend PGPROC struct in shared memory. We must do > - * this before we can use LWLocks. > + * Create a per-backend PGPROC struct in shared memory. We must do this > + * before we can use LWLocks. > */ > InitProcess(); > Don't think this was intentional? > From b33cfeb28a5419045acb659a01410b2b463bea3e Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Sun, 18 Jun 2023 13:59:48 +0300 > Subject: [PATCH 9/9] Refactor postmaster child process launching > > - Move code related to launching backend processes to new source file, > process_start.c I think you might have renamed this to launch_backend.c? > - Introduce new postmaster_child_launch() function that deals with the > differences between EXEC_BACKEND and fork mode. > > - Refactor the mechanism of passing informaton from the parent to > child process. Instead of using different command-line arguments > when launching the child process in EXEC_BACKEND mode, pass a > variable-length blob of data along with all the global > variables. The contents of that blob depends on the kind of child > process being launched. In !EXEC_BACKEND mode, we use the same blob, > but it's simply inherited from the parent to child process. > +const PMChildEntry entry_kinds[] = { > + {"backend", BackendMain, true}, > + > + {"autovacuum launcher", AutoVacLauncherMain, true}, > + {"autovacuum worker", AutoVacWorkerMain, true}, > + {"bgworker", BackgroundWorkerMain, true}, > + {"syslogger", SysLoggerMain, false}, > + > + {"startup", StartupProcessMain, true}, > + {"bgwriter", BackgroundWriterMain, true}, > + {"archiver", PgArchiverMain, true}, > + {"checkpointer", CheckpointerMain, true}, > + {"wal_writer", WalWriterMain, true}, > + {"wal_receiver", WalReceiverMain, true}, > +}; I'd assign them with the PostmasterChildType as index, so there's no danger of getting out of order. const PMChildEntry entry_kinds = { [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true}, ... } or such should work. I'd also use designated initializers for the fields, it's otherwise hard to know what true means etc. I think it might be good to put more into array. If we e.g. knew whether a particular child type is a backend-like, and aux process or syslogger, we could avoid the duplicated InitAuxiliaryProcess(), MemoryContextDelete(PostmasterContext) etc calls everywhere. > +/* > + * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent > + * to what it would be if we'd simply forked on Unix, and > then > + * dispatch to the appropriate place. > + * > + * The first two command line arguments are expected to be "--forkFOO" > + * (where FOO indicates which postmaster child we are to become), and > + * the name of a variables file that we can read to load data that would > + * have been inherited by fork() on Unix. Remaining arguments go to the > + * subprocess FooMain() routine. XXX > + */ > +void > +SubPostmasterMain(int argc, char *argv[]) > +{ > + PostmasterChildType child_type; > + char *startup_data; > + size_t startup_data_len; > + > + /* In EXEC_BACKEND case we will not have inherited these settings */ > + IsPostmasterEnvironment = true; > + whereToSendOutput = DestNone; > + > + /* Setup essential subsystems (to ensure elog() behaves sanely) */ > + InitializeGUCOptions(); > + > + /* Check we got appropriate args */ > + if (argc < 3) > + elog(FATAL, "invalid subpostmaster invocation"); > + > + if (strncmp(argv[1], "--forkchild=", 12) == 0) > + { > + char *entry_name = argv[1] + 12; > + bool found = false; > + > + for (int idx = 0; idx < lengthof(entry_kinds); idx++) > + { > + if (strcmp(entry_kinds[idx].name, entry_name) == 0) > + { > + child_type = idx; > + found = true; > + break; > + } > + } > + if (!found) > + elog(ERROR, "unknown child kind %s", entry_name); > + } Hm, shouldn't we error out when called without --forkchild? > +/* Save critical backend variables into the BackendParameters struct */ > +#ifndef WIN32 > +static bool > +save_backend_variables(BackendParameters *param, ClientSocket *client_sock) > +#else There's so much of this kind of thing. Could we hide it in a struct or such instead of needing ifdefs everywhere? > --- a/src/backend/storage/ipc/shmem.c > +++ b/src/backend/storage/ipc/shmem.c > @@ -144,6 +144,8 @@ InitShmemAllocation(void) > /* > * Initialize ShmemVariableCache for transaction manager. (This doesn't > * really belong here, but not worth moving.) > + * > + * XXX: we really should move this > */ > ShmemVariableCache = (VariableCache) > ShmemAlloc(sizeof(*ShmemVariableCache)); Heh. Indeed. And probably just rename it to something less insane. Greetings, Andres Freund