Thanks for reviewing! On 06/11/2025 12:24, Álvaro Herrera wrote:
I've been studying the code for freezing of items at vacuum-truncate time, and one thing that jumps at me is that perhaps we ought to be more proactive in freezing items immediately as they are processed. We can do that for any transactions that precede RecentXmin, because by that point, every snapshot in the system should have that committed transaction as no longer in progress anyway. Something likediff --git a/src/backend/commands/async.c b/src/backend/commands/async.c index 8ac7d989641..88d5ed2b461 100644 --- a/src/backend/commands/async.c +++ b/src/backend/commands/async.c @@ -2038,6 +2038,11 @@ asyncQueueProcessPageEntries(volatile QueuePosition *current, } else if (TransactionIdDidCommit(qe->xid)) { + if (TransactionIdPrecedes(RecentXmin, qe->xid)) + { + elog(WARNING, "freezing an entry for transaction %u", qe->xid); + qe->xid = FrozenTransactionId; + } memcpy(local_buf_end, qe, qe->length); local_buf_end += qe->length; } If we do this, then the chances that we need to freeze items at vacuum time are much lower, and we're not creating any danger that TransactionIdDidCommit() errors any more than we have in the current code. Is there any reason this wouldn't work?
We could do that and it would be good for performance anyway. This is the "hint bit" idea that's been discussed in this thread. I think it would be better to do it with additional hint bits rather than by overwriting 'xid'. Having a separate COMMIITTED/ABORTED and FROZEN bits would allow setting the COMMITTED hint bit even when the xid is newer than RecentXmin.
Would need to mark the SLRU page dirty if we do that. Not sure if that requires holding an exclusive lock. And we should also replace aborted xids with InvalidTransactionId.
I feel that that should be a separate patch, and not backpatched. We'd still need all the other changes from this patch series even if we did that, it would be just an optimization.
I noticed that async-notify-test-3 doesn't actually freeze any items by itself ... you need to do "ALTER TABLE template0 allow_connections" and then do vacuumdb -a in a loop in order for this to happen (or something similar, I guess). Otherwise the new code in AsyncNotifyFreezeXids is never executed.
Right, async-notify-test-3 was just a performance test to verify that the holding the SLRU lock longer doesn't degrade performance. I haven't tried performance testing CLOG truncation itself, but it happens so rarely that I'm not too worried about it. See src/test/modules/xid_wraparound/t/004_notify_freeze.pl for a correctness test.
- Heikki
