On Thu, Nov 28, 2024 at 09:04:36PM +0100, Michail Nikolaev wrote:
> I discovered that a similar issue was previously addressed by Heikki in
> commit [0], where installcheck was disabled for injection point tests.
> However, in the meson build configuration, this was only applied to
> regression tests - the isolation and TAP tests are still running during
> installcheck.

I fail to see how this is related?  The original issue was that this
was impossible to run safely concurrently, but now we have the
facilities able to do so.  There are a few cases where using a wait
point has limits, for example outside a transaction context for some
of the processes, but that has not really been an issue up to now.

> As demonstrated in the previously shared reproducer [1], even *local*
> injection points can cause backend crashes through unexpected side effects.
> Therefore, I propose extending the installcheck disable to cover both TAP
> and isolation tests as well.
> 
> I've attached a patch implementing these changes.

@@ -426,6 +427,7 @@ InvalidateCatalogSnapshot(void)
         pairingheap_remove(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
         CatalogSnapshot = NULL;
         SnapshotResetXmin();
+        INJECTION_POINT("invalidate_catalog_snapshot_end");
[...]
+step s4_attach_locally { SELECT
injection_points_attach('invalidate_catalog_snapshot_end', 'wait'); }
[...]
#4  0x0000563f09d22f39 in ExceptionalCondition
(conditionName=0x563f0a072d00 
"TransactionIdIsValid(proc->xid)", fileName=0x563f0a072a4a
"procarray.c", lineNumber=677)     at assert.c:66
#5  0x0000563f096f0684 in ProcArrayEndTransaction
(proc=0x7fbd4d083ac0, latestXid=750) at procarray.c:677
#6  0x0000563f088c54f3 in AbortTransaction () at xact.c:2946
#7  0x0000563f088c758d in AbortCurrentTransactionInternal () at
#xact.c:3531
#8  0x0000563f088c72a6 in AbortCurrentTransaction () at xact.c:3449
#9  0x0000563f0979c0f7 in PostgresMain (dbname=0x563f0f128100
"isolation_regression", username=0x563f0f1280e8 "ioltas") at
postgres.c:4524
#10 0x0000563f0978a5e5 in BackendMain (startup_data=0x7ffdcf50cfa8 "",
startup_data_len=4) at backend_startup.c:107
#11 0x0000563f094d8613 in postmaster_child_launch
(child_type=B_BACKEND, child_slot=1,
startup_data=0x7ffdcf50cfa8 "", startup_data_len=4,
client_sock=0x7ffdcf50cfe0) 

Isn't that pointing to an actual bug with serializable transactions?

What you are telling here is that there is a race condition where it
is possible to trigger an assertion failure when finishing a session
while another one is waiting on an invalidation, if there's in the mix
a read/write dependency error.  Disabling the test hides the problem,
it does not fix it.  And we should do the latter, not the former.
--
Michael

Attachment: signature.asc
Description: PGP signature

Reply via email to