On Mon, May 26, 2025 at 10:36 PM Alexander Korotkov <aekorot...@gmail.com> wrote: > > On Mon, May 26, 2025 at 2:43 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Mon, May 26, 2025 at 3:52 PM Vitaly Davydov <v.davy...@postgrespro.ru> > > wrote: > > > OTOH, if we don't want to adjust physical > > slot machinery, it seems saving the logical slots to disk immediately > > when its restart_lsn is updated is a waste of effort after your patch, > > no? If so, why are we okay with that? > > I don't think so. I think the reason why logical slots are synced to > disk immediately after update is that logical changes are not > idempotent (you can't safely apply the same change twice) unlike > physical block-level changes. This is why logical slots need to be > synced to prevent double replication of same changes, which could > lead, for example, to double insertion. >
Hmm, if this has to be true, then even in the else branch of LogicalConfirmReceivedLocation [1], we should have saved the slot. AFAIU, whether the logical changes are sent to the client is decided based on two things: (a) the replication origins, which tracks replication progress and are maintained by clients (which for built-in replication are subscriber nodes), see [2]; and (b) confirmed_flush LSN maintained in the slot by the server. Now, for each ack by the client after applying/processing changes, we update the confirmed_flush LSN of the slot but don't immediately flush it. This shouldn't let us send the changes again because even if the system crashes and restarts, the client will send the server the location to start sending the changes from based on its origin tracking. There is more to it, like there are cases when confirm_flush LSN in the slot could be ahead the origin's LSN, and we handle all such cases, but I don't think those are directly related here, so I am skipping those details for now. Note that LogicalConfirmReceivedLocation won't save the slot to disk if it updates only confirmed_flush LSN, which is used to decide whether to send the changes. > LogicalConfirmReceivedLocation() implements immediate sync for > different reasons. > I may be missing something, but let's discuss some more before we conclude this. [1]: else { SpinLockAcquire(&MyReplicationSlot->mutex); /* * Prevent moving the confirmed_flush backwards. See comments above * for the details. */ if (lsn > MyReplicationSlot->data.confirmed_flush) MyReplicationSlot->data.confirmed_flush = lsn; SpinLockRelease(&MyReplicationSlot->mutex); } [2]: https://www.postgresql.org/docs/devel/replication-origins.html -- With Regards, Amit Kapila.