Re: New criteria for autovacuum

Konstantin Knizhnik Fri, 04 Apr 2025 12:28:05 -0700


On 04/04/2025 7:10 pm, Melanie Plageman wrote:

On Fri, Apr 4, 2025 at 1:53 AM Konstantin Knizhnik<[email protected]> wrote:

What is needed to reproduce the problem?
1. Table with populated data
2. Presence of transaction with assigned XID which prevents vacuum from
marking pages of this table as all visible
3. Vacuum or autovacuum processed this table (to eliminate dead tuple
and reset number of inserted tuples since last vacuum).


After this 3 steps autovacuum will never be called for this table (at
least until forced vacuum caused by wraparound).
And IOS will not be used or be very inefficient fot this table.

ISTM, this is more an issue of ins_since_vacuum being reset to 0 in
pstat_report_vacuum() even though those inserted tuples weren't
necessarily frozen and their pages not set all-visible. I don't know
exactly how we could modify that logic, but insert-triggered vacuums
are meant to set pages all-visible and freeze tuples, and if they
don't do that, it doesn't quite make sense to zero out the counter
that could trigger another one.

That being said, long-running transactions are a problem for
autovacuum in general. Even if you track this stat you are proposing
about heap fetches by index only scans, you won't know if the long
running transaction is over and thus if it makes sense to try and
trigger an autovacuum for that table again anyway.

- Melanie

From logical point of view I agree with you: taken in account number ofinserted tuples makes sense if it allows to mark page as all-visible.So `ins_since_vacuum` should be better renamed to`ins_all_visible_since_vacuum` and count only all-visible tuples. Ifnewly inserted tuple is not visible to all, then it should not beaccounted in statistic and trigger autovacuum. But I have completely noidea of how to efficiently maintain such counter: we should keep trackof xids of all recently inserted tuples and on each transaction commitdetermine which one of them become all-visible.

And your suggestion just to not reset `ins_since_vacuum` until all ofthem becomes all-visible may be easier to implement, but under permanentworkload it can lead to situation when `ins_since_vacuum` is never resetand at each vacuum iteration cause vacuuming of the table. Which maycause significant degrade of performance. Even without long-livingtransactions.

So I still think that maintaining count of heap visibility check is thebest alternative. It quite easy to implement, adds almost no overheadand this information indicates efficiency of index-only scan. So itseems to be useful even if not used by autovacuum.

Yes, long-living transactions and vacuum are "antagonists". If there islong-living transaction, then forcing autovacuum because of number ofvisibility checks criteria can also (as in case of not reseting`ins_since_vacuum` counter) cause degrade of performance because of toofrequent and useless autovacuum runs for the table. But there is bigdifference: using `checks_since_vacuum` criteria we trigger autovacuumnext time only when this counter exceeds threshold. Which should nothappen fast because this counter is reset after each vacuum. Unlike`ins_since_vacuum` counter which you suggested not to reset until pagesare marked as all-visible by vacuum. In the last case autovacuum will beinvoked for the table each `autovacum_naptime` seconds.

Re: New criteria for autovacuum

Reply via email to