Hi, On 2022-02-12 13:25:58 +0000, Simon Riggs wrote: > On Fri, 11 Feb 2022 at 19:08, Andres Freund <and...@anarazel.de> wrote: > > > On 2022-02-11 13:48:59 +0000, Simon Riggs wrote: > > > On Fri, 11 Feb 2022 at 08:48, Simon Riggs <simon.ri...@enterprisedb.com> > > > wrote: > > > > > > > > On Fri, 11 Feb 2022 at 06:11, Andres Freund <and...@anarazel.de> wrote: > > > > > > > > > Looks lik syncrep will make this a lot worse, because it can > > > > > drastically > > > > > increase the window between the TransactionIdCommitTree() and > > > > > ProcArrayEndTransaction() due to the SyncRepWaitForLSN() inbetween. > > > > > But at > > > > > least it might make it easier to write tests exercising this > > > > > scenario... > > > > > > > > Agreed > > > > > > > > TransactionIdIsKnownCompleted(xid) is only broken because the single > > > > item cache is set too early in some cases. The single item cache is > > > > important for performance, so we just need to be more careful about > > > > setting the cache. > > > > > > Something like this... fix_cachedFetchXid.v1.patch prevents the cache > > > being set, but this fails! Not worked out why, yet. > > > > I don't think it's safe to check !TransactionIdKnownNotInProgress() in > > TransactionLogFetch(), given that TransactionIdKnownNotInProgress() needs to > > do TransactionLogFetch() internally. > > That's not correct because you're confusing > TransactionIdKnownNotInProgress(), which is a new function introduced > by the patch, with the existing function > TransactionIdIsKnownCompleted().
I don't think so. This call of the new TransactionIdKnownNotInProgress() > --- a/src/backend/access/transam/transam.c > +++ b/src/backend/access/transam/transam.c > @@ -73,6 +73,14 @@ TransactionLogFetch(TransactionId transactionId) > return TRANSACTION_STATUS_ABORTED; > } > > + /* > + * Safeguard that we have called TransactionIsIdInProgress() before > + * checking commit log manager, to ensure that we do not cache the > + * result until the xid is no longer in the procarray at eoxact. > + */ > + if (!TransactionIdKnownNotInProgress(transactionId)) > + return TRANSACTION_STATUS_IN_PROGRESS; > + > /* > * Get the transaction status. > */ isn't right. Consider what happens to TransactionIdIsInProgress(x)'s call to TransactionIdDidAbort(x) at "step 4". TransactionIdDidAbort(x) will call TransactionLogFetch(x). cachedXidIsNotInProgress isn't yet set to x, so TransactionLogFetch() returns TRANSACTION_STATUS_IN_PROGRESS. Even if the (sub)transaction aborted. Which explains why your first patch doesn't work. > > > just_remove_TransactionIdIsKnownCompleted_call.v1.patch > > > > I think this might contain a different diff than what the name suggests? > > Not at all, please don't be confused by the name. The patch removes > the call to TransactionIdIsKnownCompleted() from > TransactionIdIsInProgress(). I guess for me "just remove" doesn't include adding a new cache... > I'm not sure it is possible to remove TransactionIdIsKnownCompleted() > in backbranches. I think it'd be fine if we needed to. Or we could just make it always return false or such. > > > just removes the known offending call, passes make check, but IMHO > > > leaves the same error just as likely by other callers. > > > > Which other callers are you referring to? > > I don't know of any, but we can't just remove a function in a > backbranch, can we? We have in the past, if it's a "sufficiently internal" function. Greetings, Andres Freund