On Thu, 21 Oct 2021 at 05:01, Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > > At Wed, 20 Oct 2021 08:55:12 -0700, Stan Hu <sta...@gmail.com> wrote in > > On Wed, Oct 20, 2021 at 4:00 AM Andrey Borodin <x4...@yandex-team.ru> wrote: > > > > > > > > > > > > 17 окт. 2021 г., в 21:55, Dmitry Dolgov <9erthali...@gmail.com> > > > написал(а): > > > > I wonder what would be side > > > > effects of clearing it when the snapshot is not suboverfloved anymore? > > > > > > I think we should just invalidate lastOverflowedXid on every > > > XLOG_RUNNING_XACTS if subxid_overflow == false. I can't find a reason not > > > to do so.
I believe that to be an incorrect fix, but so very nearly correct. There is a documented race condition in the generation of a XLOG_RUNNING_XACTS that means there could be a new overflow event after the snapshot was taken but before it was logged. > > On a replica, I think it's possible for lastOverflowedXid to be set even if > > subxid_overflow is false on the primary and secondary ( > > https://github.com/postgres/postgres/blob/dc899146dbf0e1d23fb24155a5155826ddce34c9/src/backend/storage/ipc/procarray.c#L1327). > > I thought subxid_overflow only gets set if there are more than > > PGPROC_MAX_CACHED_SUBXIDS (64) used in a given transaction. > > > > Should the replica be invalidating lastOverflowedXid if subxcnt goes to > > zero in XLOG_RUNNING_XACTS? But if there's an outstanding snapshot with an > > xmin that precedes lastOverflowedXid we might violate MVCC if we invalidate > > this, so I wonder if we also need to check the snapshot with the lowest > > xmin? > > lastOverflowedXid is the smallest subxid that possibly exists but > possiblly not known to the standby. So if all top-level transactions > older than lastOverflowedXid end, that means that all the > subtransactions in doubt are known to have been ended. Agreed > XLOG_RUNNING_XACTS reports oldestRunningXid, which is the oldest > running top-transaction. Standby expires xids in KnownAssignedXids > array that precede to the oldestRunningXid. We are sure that all > possiblly-overflown subtransactions are gone as well if the oldest xid > is newer than the first overflowed subtransaction. Agreed > As a cross check, the following existing code in GetSnapshotData means > that no overflow is not happening if the smallest xid in the known > assigned list is larger than lastOverflowedXid, which agrees to the > consideration above. > > procaray.c:2428 > > subcount = KnownAssignedXidsGetAndSetXmin(snapshot->subxip, > > &xmin, > > > > xmax); > > > > if (TransactionIdPrecedesOrEquals(xmin, > > procArray->lastOverflowedXid)) > > suboverflowed = true; > > > If the discussion so far is correct, the following diff will fix the > issue. > > diff --git a/src/backend/storage/ipc/procarray.c > b/src/backend/storage/ipc/procarray.c > index bd3c7a47fe..19682b73ec 100644 > --- a/src/backend/storage/ipc/procarray.c > +++ b/src/backend/storage/ipc/procarray.c > @@ -4463,6 +4463,12 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid) > { > LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); > KnownAssignedXidsRemovePreceding(xid); > + /* > + * reset lastOverflowedXid if we know transactions that have been > possiblly > + * running are being gone. > + */ > + if (TransactionIdPrecedes(procArray->lastOverflowedXid, xid)) > + procArray->lastOverflowedXid = InvalidTransactionId; > LWLockRelease(ProcArrayLock); > } So I agree with this fix. It is however, an undocumented modularity violation. I think that is acceptable because of the ProcArrayLock traffic, but needs to have a comment to explain this at the call to ExpireOldKnownAssignedTransactionIds() i.e. " and potentially reset lastOverflowedXid", as well as a comment on the ExpireOldKnownAssignedTransactionIds() function. -- Simon Riggs http://www.EnterpriseDB.com/