On Wed, Nov 21, 2018 at 04:09:41PM +0900, Michael Paquier wrote:
> The checkpointer initializes a shutdown checkpoint where it tells to all
> the WAL senders to stop once all the children processes are gone, so it
> seems to me that there is little point in processing
> SyncRepReleaseWaiters() when a WAL sender is in WALSNDSTATE_STOPPING
> state as at this stage syncrep makes little sense.  It is still
> necessary to process standby messages at this stage so as the primary
> can shut down when it is sure that all the standbys have flushed the
> shutdown checkpoint record of the primary.

I just refreshed my memory with c6c33343, which is actually at the origin
of the issue, and my previous argument is flawed.  Once a WAL sender has
reached WALSNDSTATE_STOPPING, no regular backends are around, but a WAL
sender could still commit a transaction in parallel.  That commit may
need to make sure that its record is flushed and synced, which in turn
requires that waiters are correctly released.

So it is necessary to patch SyncRepGetSyncStandbysPriority and
SyncRepGetSyncStandbysQuorum as mentioned upthread.  Perhaps adding a
comment where MyWalSnd->state is checked would be appropriate as well.

Paul, would you like to write a patch?
--
Michael
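
To make the idea concrete, here is a rough, standalone sketch of the kind
of state check being discussed.  It is not PostgreSQL source and not the
patch itself: the enum only mirrors the WAL sender state names, and the
two helper functions (walsnd_state_counts_for_syncrep and
count_sync_candidates) are hypothetical names invented for illustration.
The assumption is that the standby selection functions currently skip
any WAL sender that is not streaming, and that a stopping one should be
kept as a candidate too.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the WAL sender states named in this thread. */
typedef enum WalSndState
{
	WALSNDSTATE_STARTUP,
	WALSNDSTATE_BACKUP,
	WALSNDSTATE_CATCHUP,
	WALSNDSTATE_STREAMING,
	WALSNDSTATE_STOPPING
} WalSndState;

/*
 * Hypothetical helper: should a WAL sender in this state be considered
 * when picking synchronous standbys?
 *
 * A stopping WAL sender is kept as a candidate, because a transaction
 * committed late during shutdown may still wait for its record to be
 * flushed by a standby, and its waiters have to be released.
 */
bool
walsnd_state_counts_for_syncrep(WalSndState state)
{
	return state == WALSNDSTATE_STREAMING ||
		   state == WALSNDSTATE_STOPPING;
}

/*
 * Hypothetical helper: count the candidates in an array of states, the
 * way a selection loop would skip the entries that do not qualify.
 */
size_t
count_sync_candidates(const WalSndState *states, size_t nstates)
{
	size_t		count = 0;

	for (size_t i = 0; i < nstates; i++)
	{
		/* Must be streaming or stopping. */
		if (!walsnd_state_counts_for_syncrep(states[i]))
			continue;
		count++;
	}
	return count;
}
```

With such a check, a WAL sender that has already moved to
WALSNDSTATE_STOPPING still counts, while one in catchup or startup is
skipped as before.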