On Mon, Sep 11, 2023 at 12:08 PM Michael Paquier <mich...@paquier.xyz> wrote: > > On Fri, Sep 08, 2023 at 11:50:37AM +0530, Amit Kapila wrote: > > On Fri, Sep 8, 2023 at 10:08 AM Michael Paquier <mich...@paquier.xyz> wrote: > >> > >> > >> + /* > >> + * This is used to track the last persisted confirmed_flush LSN value > >> to > >> + * detect any divergence in the in-memory and on-disk values for the > >> same. > >> + */ > >> > >> "This value tracks is the last LSN where this slot's data has been > >> flushed to disk. > >> > > > > This makes the comment vague as this sounds like we are saving a slot > > corresponding to some LSN which is not the case. If you prefer this > > tone then we can instead say: "This value tracks the last > > confirmed_flush LSN flushed which is used during a checkpoint shutdown > > to decide if a logical slot's data should be forcibly flushed or not." > > Okay, that looks like an improvement over the term "persisted". >
Changed accordingly. > >> This is used during a checkpoint shutdown to decide > >> if a logical slot's data should be forcibly flushed or not." > >> > >> Hmm. WAL senders are shut down *after* the checkpointer and *after* > >> the shutdown checkpoint is finished (see PM_SHUTDOWN and > >> PM_SHUTDOWN_2) because we want the WAL senders to acknowledge the > >> checkpoint record before shutting down the primary. > >> > > > > As per my understanding, this is not true for logical walsenders. As > > per code, while HandleCheckpointerInterrupts(), we call ShutdownXLOG() > > which sends a signal to walsender to stop and waits for it to stop. > > And only after that, did it write a shutdown checkpoint WAL record. > > After getting the InitStopping signal, walsender sets got_STOPPING > > flag. Then *logical* walsender ensures that it sends all the pending > > WAL and exits. What you have quoted is probably true for physical > > walsenders. > > Hm, reminding me about this area.. This roots down to the handling of > WalSndCaughtUp in the send_data callback for logical or physical. > This is switched to true for logical WAL senders much earlier than > physical WAL senders, aka before the shutdown checkpoint begins in the > latter. What was itching me a bit is that the postmaster logic could > be made more solid. Logical and physical WAL senders are both marked > as BACKEND_TYPE_WALSND, but we don't actually check that the WAL > senders remaining at the end of PM_SHUTDOWN_2 are *not* connected to a > database. This would require a new BACKEND_TYPE_* perhaps, or perhaps > we're fine with the current state because we'll catch up problems in > the checkpointer if any WAL is generated while the shutdown checkpoint > is running anyway. Just something I got in mind, unrelated to this > patch. > I don't know if the difference is worth inventing a new BACKEND_TYPE_* but if you think so then we can probably discuss this in a new thread. I think we may want to improve some comments as a separate patch to make this evident. > > + * We can flush dirty replication slots at regular intervals by any > + * background process like bgwriter but checkpoint is a convenient location. > > I still don't quite see a need to mention the bgwriter at all here.. > That's just unrelated. > I don't disagree with it, so changed it in the attached patch. > The comment block in CheckPointReplicationSlots() from v10 uses > "persist", but you mean "flush", I guess.. > This point is not very clear to me. Can you please quote the exact comment if you think something needs to be changed? -- With Regards, Amit Kapila.
v11-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch
Description: Binary data