Hi Tomas, > 1) There's a race condition in LogicalLockTransaction. The code does > roughly this: > > if (!BecomeDecodeGroupMember(...)) > ... bail out ... > > Assert(MyProc->decodeGroupLeader); > lwlock = LockHashPartitionLockByProc(MyProc->decodeGroupLeader); > ... > > but AFAICS there is no guarantee that the transaction does not commit > (or even abort) right after the become decode group member. In which > case LogicalDecodeRemoveTransaction might have already reset our pointer > to a leader to NULL. In which case the Assert() and lock will fail. > > I've initially thought this can be fixed by setting decodeLocked=true in > BecomeDecodeGroupMember, but that's not really true - that would fix the > race for aborts, but not commits. LogicalDecodeRemoveTransaction skips > the wait for commits entirely, and just resets the flags anyway. > > So this needs a different fix, I think. BecomeDecodeGroupMember also > needs the leader PGPROC pointer, but it does not have the issue because > it gets it as a parameter. I think the same thing would work for here > too - that is, use the AssignDecodeGroupLeader() result instead. >
That's a good catch. One of the earlier patches had a check for this (it also had an ill-placed assert above though) which we removed as part of the ongoing review. Instead of doing the above, we can just re-check if the decodeGroupLeader pointer has become NULL and if so, re-assert that the leader has indeed gone away before returning false. I propose a diff like below. /* * If we were able to add ourself, then Abort processing will - * interlock with us. + * interlock with us. If the leader was done in the meanwhile + * it could have removed us and gone away as well. */ - Assert(MyProc->decodeGroupLeader); + if (MyProc->decodeGroupLeader == NULL) + { + Assert(BackendXidGetProc(txn->xid) == NULL); + return false + } > > 2) BecomeDecodeGroupMember sets the decodeGroupLeader=NULL when the > leader does not match the parameters, despite enforcing it by Assert() > at the beginning. Let's remove that assignment. > Ok, done. > > 3) I don't quite understand why BecomeDecodeGroupMember does the > cross-check using PID. In which case would it help? > When I wrote this support, I had written it with the intention of supporting both 2PC (in which case pid is 0) and in-progress regular transactions. That's why the presence of PID in these functions. The current use case is just for 2PC, so we could remove it. > > 4) AssignDecodeGroupLeader still sets pid, which is never read. Remove. > Ok, will do. > > 5) ReorderBufferCommitInternal does elog(LOG) about interrupting the > decoding of aborted transaction only in one place. There are about three > other places where we check LogicalLockTransaction. Seems inconsistent. > Note that I have added it for the OPTIONAL test_decoding test cases (which AFAIK we don't plan to commit in that state) which demonstrate concurrent rollback interlocking with the lock/unlock APIs. The first ELOG was enough to catch the interaction. If we think these elogs should be present in the code, then yes, I can add it elsewhere as well as part of an earlier patch. > > 6) The comment before LogicalLockTransaction is somewhat inaccurate, > because it talks about adding/removing the backend to the group, but > that's not what's happening. We join the group on the first call and > then we only tweak the decodeLocked flag. > True. > > 7) I propose minor changes to a couple of comments. > Ok, I am looking at your provided patch and incorporating relevant changes from it. WIll submit a patch set soon. Regards, Nikhils -- Nikhil Sontakke http://www.2ndQuadrant.com/ PostgreSQL/Postgres-XL Development, 24x7 Support, Training & Services