On Tue, Nov 28, 2023 at 04:03:49PM -0800, Andres Freund wrote: > On 2023-11-28 16:05:16 -0600, Nathan Bossart wrote: >> From a glance, it looks to me like the problem is that pgstat_shutdown_hook >> is registered as a before_shmem_exit callback, while ProcKill is registered >> as an on_shmem_exit callback. > > That's required, as pgstat_shutdown_hook() needs to acquire lwlocks, which you > can't after ProcKill(). It's also not unique to pgstat, several other > before_shmem_exit() callbacks acquire lwlocks (e.g. AtProcExit_Twophase(), > ParallelWorkerShutdown(), do_pg_abort_backup(), the first three uses of > before_shmem_exit when git grepping) - which makes sense, they are presumably > before_shmem_exit() because they need to manage shared state, which often > needs locks. > > In normal backends this is fine-ish, because ShutdownPostgres() is registered > very late (and thus is called early in the shutdown sequence), and the > AbortOutOfAnyTransaction() contained therein indirectly calls > LWLockReleaseAll() and very little happens outside of the transaction context.
Right. Perhaps we could add a LWLockReleaseAll() to pgstat_shutdown_hook() instead of the autovacuum code, but I'm afraid that is still just a hack. >> I would expect your patch to fix this particular issue, but I'm wondering >> whether there's a bigger problem here. > > Yes, there is - our subsystem initialization, shutdown, error recovery > infrastructure is a mess. We've interwoven transaction handling far too > tightly with error handling, the order of subystem initialization is basically > random and differs between operating systems (due to EXEC_BACKEND) and "mode" > of execution (the order differs when using single user mode) and we've > distributed error recovery into ~10 places (all the sigsetjmp()s in backend > code, xact.c and and a few other places like WalSndErrorCleanup()). :( I do remember looking into uniting all the various sigsetjmp() calls before. That could be worth another try. The rest will probably require additional thought... -- Nathan Bossart Amazon Web Services: https://aws.amazon.com