Re: Avoiding unnecessary clog lookups while freezing

Andres Freund Wed, 28 Dec 2022 16:20:31 -0800

Hi,

On 2022-12-28 15:24:28 -0800, Peter Geoghegan wrote:
> I took another look at the code coverage situation around freezing
> following pushing the page-level freezing patch earlier today. I
> spotted an issue that I'd missed up until now: certain sanity checks
> in heap_prepare_freeze_tuple() call TransactionIdDidCommit() more
> often than really seems necessary.
> 
> Theoretically this is an old issue that dates back to commit
> 699bf7d05c, as opposed to an issue in the page-level freezing patch.
> But that's not really true in a real practical sense. In practice the
> calls to TransactionIdDidCommit() will happen far more frequently
> following today's commit 1de58df4fe (since we're using OldestXmin as
> the cutoff that gates performing freeze_xmin/freeze_xmax processing --
> not FreezeLimit).


Hm. But we still only do the check when we actually freeze, rather than just
during the pre-check in heap_tuple_should_freeze(). So we'll only incur the
increased overhead when we also do more WAL logging etc. Correct?


> I took another look at the code coverage situation around freezing
> I see no reason why we can't just condition the XID sanity check calls
> to TransactionIdDidCommit() (for both the freeze_xmin and the
> freeze_xmax callsites) on a cheap tuple hint bit precheck not being
> enough. We only need an expensive call to TransactionIdDidCommit()
> when the precheck doesn't immediately indicate that the tuple xmin
> looks committed when that's what the sanity check expects to see (or
> that the tuple's xmax looks aborted, in the case of the callsite where
> that's what we expect to see).

Hm. I dimply recall that we had repeated cases where the hint bits were set
wrongly due to some of the multixact related bugs. I think I was trying to be
paranoid about not freezing stuff in those situations, since it can lead to
reviving dead tuples, which obviously is bad.


> Attached patch shows what I mean. A quick run of the standard
> regression tests (with custom instrumentation) shows that this patch
> eliminates 100% of all relevant calls to TransactionIdDidCommit(), for
> both the freeze_xmin and the freeze_xmax callsites.

There's practically no tuple-level concurrent activity in a normal regression
test. So I'm a bit doubtful that is an interesting metric. It'd be a tad more
interesting to run tests, including isolationtester and pgbench, against a
running cluster.

Greetings,

Andres Freund

Re: Avoiding unnecessary clog lookups while freezing

Reply via email to