On 04/04/2025 7:10 pm, Melanie Plageman wrote:
On Fri, Apr 4, 2025 at 1:53 AM Konstantin Knizhnik<knizh...@garret.ru> wrote:
What is needed to reproduce the problem?
1. Table with populated data
2. Presence of transaction with assigned XID which prevents vacuum from
marking pages of this table as all visible
3. Vacuum or autovacuum processed this table (to eliminate dead tuple
and reset number of inserted tuples since last vacuum).
After this 3 steps autovacuum will never be called for this table (at
least until forced vacuum caused by wraparound).
And IOS will not be used or be very inefficient fot this table.
ISTM, this is more an issue of ins_since_vacuum being reset to 0 in
pstat_report_vacuum() even though those inserted tuples weren't
necessarily frozen and their pages not set all-visible. I don't know
exactly how we could modify that logic, but insert-triggered vacuums
are meant to set pages all-visible and freeze tuples, and if they
don't do that, it doesn't quite make sense to zero out the counter
that could trigger another one.
That being said, long-running transactions are a problem for
autovacuum in general. Even if you track this stat you are proposing
about heap fetches by index only scans, you won't know if the long
running transaction is over and thus if it makes sense to try and
trigger an autovacuum for that table again anyway.
- Melanie
From logical point of view I agree with you: taken in account number of
inserted tuples makes sense if it allows to mark page as all-visible.
So `ins_since_vacuum` should be better renamed to
`ins_all_visible_since_vacuum` and count only all-visible tuples. If
newly inserted tuple is not visible to all, then it should not be
accounted in statistic and trigger autovacuum. But I have completely no
idea of how to efficiently maintain such counter: we should keep track
of xids of all recently inserted tuples and on each transaction commit
determine which one of them become all-visible.
And your suggestion just to not reset `ins_since_vacuum` until all of
them becomes all-visible may be easier to implement, but under permanent
workload it can lead to situation when `ins_since_vacuum` is never reset
and at each vacuum iteration cause vacuuming of the table. Which may
cause significant degrade of performance. Even without long-living
transactions.
So I still think that maintaining count of heap visibility check is the
best alternative. It quite easy to implement, adds almost no overhead
and this information indicates efficiency of index-only scan. So it
seems to be useful even if not used by autovacuum.
Yes, long-living transactions and vacuum are "antagonists". If there is
long-living transaction, then forcing autovacuum because of number of
visibility checks criteria can also (as in case of not reseting
`ins_since_vacuum` counter) cause degrade of performance because of too
frequent and useless autovacuum runs for the table. But there is big
difference: using `checks_since_vacuum` criteria we trigger autovacuum
next time only when this counter exceeds threshold. Which should not
happen fast because this counter is reset after each vacuum. Unlike
`ins_since_vacuum` counter which you suggested not to reset until pages
are marked as all-visible by vacuum. In the last case autovacuum will be
invoked for the table each `autovacum_naptime` seconds.