Re: Vacuum statistics

Andrei Zubkov Mon, 13 Jan 2025 00:54:58 -0800

Hi Sami,

Thank you for your attention to our patch and for your own work.

On Sun, 2025-01-05 at 20:00 -0600, Sami Imseih wrote:
> 
> You make valid points. I now think because track_vacuum_statistics is
> optional, we should track total_time in 2 places. First place in the
> new
> view being proposed here and the second place is in
> pg_stat_all_tables
> as being proposed here [3]. This way if track_vacuum_statistics is
> off, the
> total_time of vacuum could still be tracked by pg_stat_all_tables.

I think that field total_time in pg_stat_all_tables is redundant at
least if it will be the only field we want to add there. Yes, we have
vacuum counts in pg_stat_all_tables, but those are not related to the
vacuum workload actually. When we think we see unusual numbers there,
we can answer the question "why" - we know the conditions causing
autovacuum to launch a vacuum on every particular table, we have tuple
statistics on this table, and we can detect anomalies here. For
example, when vacuum process should be launched 5 times, but was
launched only twice.

The total_time field is workload metric. Yes, we can calculate the
mean time of vacuum operation on every particular table but there is
nothing we can do with it. We don't know what this time should be for
this table now. We only can compare this metric to its values in the
past. But once we see this time raising we will immediately face the
question "why?". And we have nothing to say about it. Where the time
was spent: vacuuming heap, vacuuming indexes, sleeping in the delay
point or performing IO operations, is there actual workload performed
by vacuum increased with total_time, or now we are spending more time
for the same workload? I think if we are adding workload statistics to
the Cumulative Statistics System we should do it as complete as
possible.

-- 
Regards, Andrei Zubkov
Postgres Professional

Re: Vacuum statistics

Reply via email to