On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > >
> > > > > I was thinking what if we don't advance origin explicitly in this
> > > > > case? Actually, that will be no different than the transactions where
> > > > > the apply worker doesn't apply any change because the initial sync is
> > > > > in progress (see should_apply_changes_for_rel()) or we have received
> > > > > an empty transaction. In those cases also, the origin lsn won't be
> > > > > advanced even though we acknowledge the advanced last_received
> > > > > location because of keep_alive messages. Now, it is possible after the
> > > > > restart we send the old start_lsn location because the replication
> > > > > origin was not updated before restart but we handle that case in the
> > > > > server by starting from the last confirmed location. See below code:
> > > > >
> > > > > CreateDecodingContext()
> > > > > {
> > > > > ..
> > > > > else if (start_lsn < slot->data.confirmed_flush)
> > > > > ..
> > > >
> > > > Good point. Probably one minor thing that is different from the
> > > > transaction where the apply worker applied an empty transaction is a
> > > > case where the server restarts/crashes before sending an
> > > > acknowledgment of the flush location. That is, in the case of the
> > > > empty transaction, the publisher sends an empty transaction again. On
> > > > the other hand in the case of skipping the transaction, a non-empty
> > > > transaction will be sent again but skip_xid is already changed or
> > > > cleared, therefore the user will have to specify skip_xid again. If we
> > > > write replication origin WAL record to advance the origin lsn, it
> > > > reduces the possibility of that. But I think it’s a very minor case so
> > > > we won’t need to deal with that.
> > > >
> > >
> > > Yeah, in the worst case, it will lead to conflict again and the user
> > > needs to set the xid again.
> >
> > On second thought, the same is true for other cases, for example,
> > preparing the transaction and clearing skip_xid while handling a
> > prepare message. That is, currently we don't clear skip_xid while
> > handling a prepare message but do that while handling commit/rollback
> > prepared message, in order to avoid the worst case. If we do both
> > while handling a prepare message and the server crashes between them,
> > it ends up that skip_xid is cleared and the transaction will be
> > resent, which is identical to the worst-case above.
> >
>
> How are you thinking to update the skip xid before prepare? If we do
> it in the same transaction then the changes in the catalog will be
> part of the prepared xact but won't be committed. Now, say if we do it
> after prepare, then the situation won't be the same because after
> restart the same xact won't appear again.

I was thinking of committing the catalog change first in a separate
transaction, without updating the origin LSN, and then preparing an
empty transaction that does update the origin LSN. If the server
crashes between the two steps, skip_xid is cleared but the transaction
will be resent.

Regards,

-- 
Masahiko Sawada
EDB: https://www.enterprisedb.com/
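
To make the crash window in the two-step approach concrete, here is a
toy model in C. It is only a sketch: the struct, enum, and function
names are hypothetical and are not PostgreSQL APIs, and the resend rule
is simplified to "the publisher resends anything beyond the origin LSN
recorded on the subscriber".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these types and names are hypothetical, not PostgreSQL code. */
typedef struct {
    uint32_t skip_xid;    /* 0 means "no transaction to skip" */
    uint64_t origin_lsn;  /* last position durably recorded for the origin */
} SubscriberState;

typedef enum { NO_CRASH, CRASH_BETWEEN_STEPS } CrashPoint;

/* Skip a prepared transaction in two separate durable steps. */
static void skip_prepared_xact(SubscriberState *s, uint64_t prepare_lsn,
                               CrashPoint crash)
{
    /* Step 1: commit the catalog change (clear skip_xid); origin LSN untouched. */
    s->skip_xid = 0;

    if (crash == CRASH_BETWEEN_STEPS)
        return;  /* simulated crash: step 2 never happens */

    /* Step 2: prepare an empty transaction and advance the origin LSN. */
    s->origin_lsn = prepare_lsn;
}

/* Simplified assumption: after restart, anything beyond origin_lsn is resent. */
static bool xact_is_resent(const SubscriberState *s, uint64_t prepare_lsn)
{
    return s->origin_lsn < prepare_lsn;
}

/* The resent transaction is skipped only if skip_xid still names it. */
static bool xact_would_be_skipped(const SubscriberState *s, uint32_t xid)
{
    return s->skip_xid == xid;
}
```

In this model, a crash between the two steps leaves skip_xid cleared
while the origin LSN is still behind the prepare, so the transaction
comes back and is no longer skipped; without the crash it is not resent.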