On Tue, Jan 25, 2022 at 11:35 PM David G. Johnston
<david.g.johns...@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 5:52 AM Peter Eisentraut
> <peter.eisentr...@enterprisedb.com> wrote:
>>
>> On 25.01.22 06:18, Amit Kapila wrote:
>> > I think to avoid this we can send a message to clear this (at least to
>> > clear XID in the view) after skipping the xact but there is no
>> > guarantee that it will be received by the stats collector.
>> > Additionally, the worker can periodically (say after every N (100,
>> > 500, etc) successful transaction) send a clear message after
>> > successful apply. This will ensure that eventually the error entry
>> > will be cleared.
>>
>> Well, I think we need *some* solution for now. We can't leave a footgun
>> where you say, "skip transaction 700", somehow transaction 700 doesn't
>> happen, the whole thing gets forgotten, but then 3 months later, the
>> next transaction 700 mysteriously gets dropped.
>
>
> This is indeed part of why I feel that the xid being skipped should be
> validated. As the feature is presented the user is supposed to read the xid
> from the system (the new stat view or the error log) and supply it and then
> the worker, when it goes to skip, should find that the very first transaction
> xid it encounters is the one it is being told to skip. It skips that
> transaction, clears the skipxid, and puts the system back into normal
> operating mode. If that first transaction xid isn't the one being specified
> to skip the worker should error with "skipping transaction failed, xid 123
> expected but 456 found".
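To make the validation described above concrete, here is a rough C sketch.
It is not code from the patch: SubscriptionState, SkipOutcome and
check_first_xact are hypothetical names, and TransactionId is redefined
locally as a stand-in for the PostgreSQL type. It only models the "first
transaction must match the requested xid" check, and it leaves skip_xid
untouched on a mismatch because the quoted text does not say what should
happen to it in that case.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for illustration; assumptions, not PostgreSQL headers. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef enum
{
    APPLY_NORMALLY,     /* no skip requested, apply the transaction */
    SKIPPED,            /* first xid matched, transaction is skipped */
    SKIP_MISMATCH       /* first xid differs from the requested one */
} SkipOutcome;

/* Hypothetical per-subscription state holding the xid to skip. */
typedef struct
{
    TransactionId skip_xid;
} SubscriptionState;

/*
 * Called for the first remote transaction the worker sees.
 * On a match the skip request is cleared so that a later transaction
 * with the same xid is not dropped by accident.
 */
SkipOutcome
check_first_xact(SubscriptionState *sub, TransactionId first_xid,
                 char *errbuf, size_t errlen)
{
    if (sub->skip_xid == InvalidTransactionId)
        return APPLY_NORMALLY;

    if (sub->skip_xid == first_xid)
    {
        sub->skip_xid = InvalidTransactionId;   /* back to normal mode */
        return SKIPPED;
    }

    /* Mismatch: report it; skip_xid is left as-is in this sketch. */
    snprintf(errbuf, errlen,
             "skipping transaction failed, xid %u expected but %u found",
             (unsigned) sub->skip_xid, (unsigned) first_xid);
    return SKIP_MISMATCH;
}
```
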

Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/