On Wed, Apr 5, 2017 at 6:49 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Wed, Apr 5, 2017 at 12:35 PM, Kuntal Ghosh
> <kuntalghosh.2...@gmail.com> wrote:
>> On Tue, Apr 4, 2017 at 11:22 PM, Tomas Vondra
>>> I'm probably missing something, but I don't quite understand how these
>>> values actually survive the crash. I mean, what I observed is OOM followed
>>> by a restart, so shouldn't BackgroundWorkerShmemInit() simply reset the
>>> values back to 0? Or do we call ForgetBackgroundWorker() after the crash for
>>> some reason?
>> AFAICU, during crash recovery, we wait for all non-syslogger children
>> to exit, then reset shmem(call BackgroundWorkerShmemInit) and perform
>> StartupDataBase. While starting the startup process we check if any
>> bgworker is scheduled for a restart.
>>
>
> In general, your theory appears right, but can you check how it
> behaves in standby server because there is a difference in how the
> startup process behaves during master and standby startup? In master,
> it stops after recovery whereas in standby it will keep on running to
> receive WAL.
>

While performing StartupDataBase, the master and standby servers behave
the same way until the postmaster spawns the startup process. On the
master, the startup process completes its job and exits. As a result,
reaper is called, which in turn calls maybe_start_bgworker(). On a
standby, after getting a valid snapshot, the startup process signals the
postmaster to enable connections, and the signal handler in the
postmaster calls maybe_start_bgworker(). In maybe_start_bgworker(), if
we find a crashed bgworker (crashed_at != 0) with a NEVER RESTART flag,
we call ForgetBackgroundWorker() to forget the bgworker process.

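To make that sequence concrete, here is a toy model of it, not the
actual postmaster code. All names (ToyWorker, ToyCounts, toy_shmem_init,
toy_forget_worker, toy_maybe_start_workers, TOY_NEVER_RESTART) are
hypothetical stand-ins. It assumes that the postmaster's private worker
list keeps its crashed_at value across the shmem reset, and that
forgetting a parallel worker bumps a terminate counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the "never restart" setting. */
#define TOY_NEVER_RESTART (-1)

/* Assumed model of one entry in the postmaster's private worker list. */
typedef struct ToyWorker
{
    time_t crashed_at;    /* 0 means the worker has not crashed */
    int    restart_time;  /* TOY_NEVER_RESTART, or seconds before a restart */
    bool   is_parallel;   /* true for a parallel worker */
    bool   registered;    /* becomes false once the worker is forgotten */
} ToyWorker;

/* Assumed model of the shared-memory counters under discussion. */
typedef struct ToyCounts
{
    unsigned register_count;
    unsigned terminate_count;
} ToyCounts;

/* Models the shmem reset after a crash: both counters go back to 0. */
void
toy_shmem_init(ToyCounts *counts)
{
    counts->register_count = 0;
    counts->terminate_count = 0;
}

/* Models forgetting a worker; a parallel worker bumps terminate_count. */
void
toy_forget_worker(ToyWorker *worker, ToyCounts *counts)
{
    assert(worker->registered);
    if (worker->is_parallel)
        counts->terminate_count++;
    worker->registered = false;
}

/*
 * Models the pass made once the startup process has exited (master) or
 * has asked for connections to be enabled (standby). Crashed workers
 * marked never-restart are forgotten. Returns how many were forgotten.
 */
int
toy_maybe_start_workers(ToyWorker *workers, size_t nworkers, ToyCounts *counts)
{
    int forgotten = 0;

    for (size_t i = 0; i < nworkers; i++)
    {
        ToyWorker *worker = &workers[i];

        if (!worker->registered)
            continue;
        if (worker->crashed_at != 0 &&
            worker->restart_time == TOY_NEVER_RESTART)
        {
            toy_forget_worker(worker, counts);
            forgotten++;
        }
    }
    return forgotten;
}
```

In this model, calling toy_shmem_init() and then
toy_maybe_start_workers() on a list holding one crashed, never-restart
parallel worker leaves register_count at 0 and terminate_count at 1,
which is the kind of mismatch a "crashed" argument could be used to
avoid.
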
I've attached a patch that adds an argument to ForgetBackgroundWorker()
to indicate a crashed situation. Based on that, we can take the
necessary actions. I've not included the Assert statement in this patch.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Attachment: 0001-Fix-parallel-worker-counts-after-a-crash_v1.patch
Description: binary/octet-stream

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers