On 2018-07-26 20:24:00 +0530, Nikhil Sontakke wrote: > Hi, > > >> I think we even can just do something like a global > >> TransactionId check_if_transaction_is_alive = InvalidTransactionId; > >> and just set it up during decoding. And then just check it whenever it's > >> not set tot InvalidTransactionId. > >> > >> > > > > Ok. I will work on something along these lines and re-submit the set of > > patches.
> PFA, latest patchset, which completely removes the earlier > LogicalLock/LogicalUnLock implementation using groupDecode stuff and > uses the newly suggested approach of checking the currently decoded > XID for abort in systable_* API functions. Much simpler to code and > easier to test as well. So, leaving the fact that it might not actually be correct aside ;), you seem to be ok with the approach? > Out of the patchset, the specific patch which focuses on the above > systable_* API based XID checking implementation is part of > 0003-Gracefully-handle-concurrent-aborts-of-uncommitted-t.patch. So, > it might help to take a look at this patch first for any additional > feedback on this approach. K. > There's an additional test case in > 0005-Additional-test-case-to-demonstrate-decoding-rollbac.patch which > uses a sleep in the "change" plugin API to allow a concurrent rollback > on the 2PC being currently decoded. Andres generally doesn't like this > approach :-), but there are no timing/interlocking issues now, and the > sleep just helps us do a concurrent rollback, so it might be ok now, > all things considered. Anyways, it's an additional patch for now. Yea, I still don't think it's ok. The tests won't be reliable. There's ways to make this reliable, e.g. by forcing a lock to be acquired that's externally held or such. Might even be doable just with a weird custom datatype. > From 75edeb440794fff7de48082dafdecb065940bee5 Mon Sep 17 00:00:00 2001 > From: Nikhil Sontakke <nikh...@2ndquadrant.com> > Date: Thu, 26 Jul 2018 18:45:26 +0530 > Subject: [PATCH 3/5] Gracefully handle concurrent aborts of uncommitted > transactions that are being decoded alongside. > > When a transaction aborts, it's changes are considered unnecessary for > other transactions. That means the changes may be either cleaned up by > vacuum or removed from HOT chains (thus made inaccessible through > indexes), and there may be other such consequences. > > When decoding committed transactions this is not an issue, and we > never decode transactions that abort before the decoding starts. > > But for in-progress transactions - for example when decoding prepared > transactions on PREPARE (and not COMMIT PREPARED as before), this > may cause failures when the output plugin consults catalogs (both > system and user-defined). > > We handle such failures by returning ERRCODE_TRANSACTION_ROLLBACK > sqlerrcode from system table scan APIs to the backend decoding a > specific uncommitted transaction. The decoding logic on the receipt > of such an sqlerrcode aborts the ongoing decoding and returns > gracefully. > --- > src/backend/access/index/genam.c | 31 > +++++++++++++++++++++++++ > src/backend/replication/logical/reorderbuffer.c | 30 ++++++++++++++++++++---- > src/backend/utils/time/snapmgr.c | 25 ++++++++++++++++++-- > src/include/utils/snapmgr.h | 4 +++- > 4 files changed, 82 insertions(+), 8 deletions(-) > > diff --git a/src/backend/access/index/genam.c > b/src/backend/access/index/genam.c > index 9d08775687..67c5810bf7 100644 > --- a/src/backend/access/index/genam.c > +++ b/src/backend/access/index/genam.c > @@ -423,6 +423,16 @@ systable_getnext(SysScanDesc sysscan) > else > htup = heap_getnext(sysscan->scan, ForwardScanDirection); > > + /* > + * If CheckXidAlive is valid, then we check if it aborted. If it did, we > + * error out > + */ > + if (TransactionIdIsValid(CheckXidAlive) && > + TransactionIdDidAbort(CheckXidAlive)) > + ereport(ERROR, > + (errcode(ERRCODE_TRANSACTION_ROLLBACK), > + errmsg("transaction aborted during system > catalog scan"))); > + > return htup; > } Don't we have to check TransactionIdIsInProgress() first? C.f. header comments in tqual.c. Note this is also not guaranteed to be correct after a crash (where no clog entry will exist for an aborted xact), but we probably shouldn't get here in that case - but better be safe. I suspect it'd be better reformulated as TransactionIdIsValid(CheckXidAlive) && !TransactionIdIsInProgress(CheckXidAlive) && !TransactionIdDidCommit(CheckXidAlive) What do you think? I think it'd also be good to add assertions to codepaths not going through systable_* asserting that !TransactionIdIsValid(CheckXidAlive). Alternatively we could add an if (unlikely(TransactionIdIsValid(CheckXidAlive)) && ...) branch to those too. > From 80fc576bda483798919653991bef6dc198625d90 Mon Sep 17 00:00:00 2001 > From: Nikhil Sontakke <nikh...@2ndquadrant.com> > Date: Wed, 13 Jun 2018 16:31:15 +0530 > Subject: [PATCH 4/5] Teach test_decoding plugin to work with 2PC > > Includes a new option "enable_twophase". Depending on this options > value, PREPARE TRANSACTION will either be decoded or treated as > a single phase commit later. FWIW, I don't think I'm ok with doing this on a per-plugin-option basis. I think this is something that should be known to the outside of the plugin. More similar to how binary / non-binary support works. Should also be able to inquire the output plugin whether it's supported (cf previous similarity). > From 682b0de2827d1f55c4e471c3129eb687ae0825a5 Mon Sep 17 00:00:00 2001 > From: Nikhil Sontakke <nikh...@2ndquadrant.com> > Date: Wed, 13 Jun 2018 16:32:16 +0530 > Subject: [PATCH 5/5] Additional test case to demonstrate decoding/rollback > interlocking > > Introduce a decode-delay parameter in the test_decoding plugin. Based > on the value provided in the plugin, sleep for those many seconds while > inside the "decode change" plugin call. A concurrent rollback is fired > off which aborts that transaction in the meanwhile. A subsequent > systable access will error out causing the logical decoding to abort. Yea, I'm *definitely* still not on board with this. This'll just lead to a fragile or extremely slow test. Greetings, Andres Freund