On Mon, Nov 27, 2023 at 4:17 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > On 11/27/23 11:13, Amit Kapila wrote: > > On Mon, Nov 27, 2023 at 11:34 AM Amit Kapila <amit.kapil...@gmail.com> > > wrote: > >> > >> On Mon, Nov 27, 2023 at 6:41 AM Tomas Vondra > >> <tomas.von...@enterprisedb.com> wrote: > >>> > >>> While going over 0001, I realized there might be an optimization for > >>> ReorderBufferSequenceIsTransactional. As coded in 0001, it always > >>> searches through all top-level transactions, and if there's many of them > >>> that might be expensive, even if very few of them have any relfilenodes > >>> in the hash table. It's still linear search, and it needs to happen for > >>> each sequence change. > >>> > >>> But can the relfilenode even be in some other top-level transaction? How > >>> could it be - our transaction would not see it, and wouldn't be able to > >>> generate the sequence change. So we should be able to simply check *our* > >>> transaction (or if it's a subxact, the top-level transaction). Either > >>> it's there (and it's transactional change), or not (and then it's > >>> non-transactional change). > >>> > >> > >> I also think the relfilenode should be part of either the current > >> top-level xact or one of its subxact, so looking at all the top-level > >> transactions for each change doesn't seem advisable. > >> > >>> The 0004 does this. > >>> > >>> This of course hinges on when exactly the transactions get created, and > >>> assignments processed. For example if this would fire before the txn > >>> gets assigned to the top-level one, this would break. I don't think this > >>> can happen thanks to the immediate logging of assignments, but I'm too > >>> tired to think about it now. > >>> > >> > >> This needs some thought because I think we can't guarantee the > >> association till we reach the point where we can actually decode the > >> xact. See comments in AssertTXNLsnOrder() [1]. > >> > > I suppose you mean the comment before the SnapBuildXactNeedsSkip call, > which says: > > /* > * Skip the verification if we don't reach the LSN at which we start > * decoding the contents of transactions yet because until we reach > * the LSN, we could have transactions that don't have the association > * between the top-level transaction and subtransaction yet and > * consequently have the same LSN. We don't guarantee this > * association until we try to decode the actual contents of > * transaction. The ordering of the records prior to the > * start_decoding_at LSN should have been checked before the restart. > */ > > But doesn't this say that after we actually start decoding / stop > skipping, we should have seen the assignment? We're already decoding > transaction contents (because sequence change *is* part of xact, even if > we decide to replay it in the non-transactional way). >
It means to say that the assignment is decided after start_decoding_at point. We haven't decided that we are past start_decoding_at by the time the patch is computing the transactional flag. > > > > I am wondering that instead of building the infrastructure to know > > whether a particular change is transactional on the decoding side, > > can't we have some flag in the WAL record to note whether the change > > is transactional or not? I have discussed this point with my colleague > > Kuroda-San and we thought that it may be worth exploring whether we > > can use rd_createSubid/rd_newRelfilelocatorSubid in RelationData to > > determine if the sequence is created/changed in the current > > subtransaction and then record that in WAL record. By this, we need to > > have additional information in the WAL record like XLOG_SEQ_LOG but we > > can probably do it only with wal_level as logical. > > > > I may not understand the proposal exactly, but it's not enough to know > if it was created in the same subxact. It might have been created in > some earlier subxact in the same top-level xact. > We should be able to detect even some earlier subxact or top-level xact based on rd_createSubid/rd_newRelfilelocatorSubid. > FWIW I think one of the earlier patch versions did something like this, > by adding a "created" flag in the xlog record. And we concluded doing > this on the decoding side is a better solution. > oh, I thought it would be much simpler than what we are doing on the decoding-side. Can you please point me to the email discussion where this is concluded or share the reason? -- With Regards, Amit Kapila.