On Wed, Mar 4, 2020 at 9:14 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > On Wed, Mar 4, 2020 at 3:16 AM Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: > > > > > > The first thing I realized that WAL-logging of assignments in v12 does > > both the "old" logging (using dedicated message) and "new" with > > toplevel-XID embedded in the first message. Yes, the patch was wrong, > > because it eliminated all calls to ProcArrayApplyXidAssignment() and so > > it was trivial to crash the replica due to KnownAssignedXids overflow. > > But I don't think re-introducing XLOG_XACT_ASSIGNMENT message is the > > right fix. > > > > I actually proposed doing this (having both ways to log assignments) so > > that there's no regression risk with (wal_level < logical). But IIRC > > Andres objected to it, argumenting that we should not log the same piece > > of information in two very different ways at the same time (IIRC it was > > discussed on the FOSDEM dev meeting, so I don't have a link to share). > > And I do agree with him ... > > > > The question is, why couldn't the replica use the same assignment info > > we already write for logical decoding? The main challenge is that now > > the assignment can be sent in many different xlog messages, from a bunch > > of resource managers (essentially, any xlog message with a xid can have > > embedded XID of the toplevel xact). So the handling would either need to > > happen in every rmgr, or we need to move it before we call the rmgr. > > > > For exampple, we might do this e.g. in StartupXLOG() I think, per the > > attached patch (FWIW this particular fix was written by Masahiko Sawada, > > not me). This does the trick for me - I'm no longer able to reproduce > > the KnownAssignedXids overflow. > > > > The one difference is that we used to call ProcArrayApplyXidAssignment > > for larger groups of XIDs, as sent in the assignment message. Now we > > call it for each individual assignment. I don't know if this is an > > issue, but I suppose we might introduce some sort of local caching > > (accumulate the assignments into a local array, call the function only > > when we have enough of them). > > Thanks for the pointers, I will think over these points. >
I have looked at the solution proposed and I would like to share my findings. I think calling ProcArrayApplyXidAssignment for each subtransaction is not a good idea for a couple of reasons: (a) It will just beat the purpose of maintaining KnowAssignedXids array which is to avoid looking at pg_subtrans in TransactionIdIsInProgress() on standby. Basically, if we remove it for each subXid, it will consider the KnowAssignedXids to be overflowed and check pg_subtrans frequently. (b) Calling ProcArrayApplyXidAssignment() for each subtransaction can be costly from the perspective of concurrency because it acquires ProcArrayLock in Exclusive mode, so concurrently running transactions might start blocking at this lock. Also, I see that SubTransSetParent() makes the page dirty, so it might lead to more writes if we spread out setting that by calling it separately for each sub-transaction. Apart from this, I don't see how the proposed fix is correct because as far as I can see it tries to remove the Xid before we even record it via RecordKnownAssignedTransactionIds(). It seems after patch RecordKnownAssignedTransactionIds() will be called after ProcArrayApplyXidAssignment(), how could that be correct. Thoughts? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com