On Wed, Aug 14, 2024 at 10:20 AM Heikki Linnakangas <hlinn...@iki.fi> wrote: > On 14/08/2024 04:51, Alexander Korotkov wrote: > > On Tue, Aug 13, 2024 at 10:15 PM Alexander Korotkov > > <aekorot...@gmail.com> wrote: > >> On Tue, Aug 13, 2024 at 9:39 PM Heikki Linnakangas <hlinn...@iki.fi> wrote: > >>> > >>> This causes an assertion failure when executed in a hot standby server: > >>> > >>> select * from pg_check_visible('pg_database'); > >>> > >>> TRAP: failed Assert("!RecoveryInProgress()"), File: > >>> "../src/backend/storage/ipc/procarray.c", Line: 2710, PID: 1142572 > >>> > >>> GetStrictOldestNonRemovableTransactionId does this: > >>> > >>>> if (rel == NULL || rel->rd_rel->relisshared || > >>>> RecoveryInProgress()) > >>>> { > >>>> /* Shared relation: take into account all running xids */ > >>>> runningTransactions = GetRunningTransactionData(); > >>>> LWLockRelease(ProcArrayLock); > >>>> LWLockRelease(XidGenLock); > >>>> return runningTransactions->oldestRunningXid; > >>>> } > >>> > >>> And GetRunningTransactionData() has this: > >>> > >>>> Assert(!RecoveryInProgress()); > >>> > >>> So it's easy to see that you will hit that assertion. > >> > >> Oh, thank you! > >> I'll fix this and add a test for recovery! > > > > Attached patch fixes the problem and adds the corresponding test. I > > would appreciate if you take a look at it. > > The code changes seem fine. I think the "Ignore KnownAssignedXids" > comment above the function could be made more clear. It's not wrong, but > I think it doesn't explain the reasoning very well: > > * We are now doing no effectively no checking in a standby, because we > always just use nextXid. It's better than nothing, I suppose it will > catch very broken cases where an XID is in the future, but that's all. > > * We *could* use KnownAssignedXids for shared catalogs, because with > shared catalogs, the global horizon is used, not a database-aware one. > > * Then again, there might be rare corner cases that a transaction has > crashed in the primary without writing a commit/abort record, and hence > it looks like it's still running in the standby but has already ended in > the primary. So I think it's good we ignore KnownAssignedXids for shared > catalogs anyway.
Thank you for the detailed explanation. I've updated the GetStrictOldestNonRemovableTransactionId() header comment accordingly. I'm going to push this if no objections. ------ Regards, Alexander Korotkov Supabase
v2-0001-Fix-GetStrictOldestNonRemovableTransactionId-on-s.patch
Description: Binary data