On Nov 10, 2024, at 2:09 PM, Alena Rybakina <a.rybak...@postgrespro.ru> wrote:

On 08.11.2024 22:34, Jim Nasby wrote:

On Nov 2, 2024, at 7:22 AM, Alena Rybakina <a.rybak...@postgrespro.ru> wrote:

The second is the interrupts field. It is needed for monitoring to know
do we have them or not, so tracking them on the database level will do
the trick. Interrupt is quite rare event, so once the monitoring system
will catch one the DBA can go to the server log for the details.
Just to confirm… by “interrupt” you mean vacuum encountered an error?
Yes it is.
In that case I feel rather strongly that we should label that as “errors”. “Interrupt” could mean a few different things, but “error” is very clear.

I updated patches. I excluded system and user time statistics and save number of interrupts only for database. I removed the ability to get statistics for all tables, now they can only be obtained for an oid table [0], as suggested here. I also renamed the statistics from pg_stat_vacuum_tables to pg_stat_get_vacuum_tables and similarly for indexes and databases. I noticed that that’s what they’re mostly called. Ready for discussion.

I think it’s better that the views follow the existing naming conventions (which don’t include “_get_”; only the functions have that in their names). Assuming that, the only question becomes pg_stat_vacuum_* vs pg_stat_*_vacuum. Given the existing precedent of pg_statio_*, I’m inclined to go with pg_stat_vacuum_*.
I have fixed it.

I’ve reviewed and made some cosmetic changes to patch 1, though of note it looks like an effort has been made to keep stat_reset_timestamp at the end of PgStat_StatDBEntry, so I re-arranged that. I also removed some obviously dead code. It appears that pgstat_update_snapshot(), InitSnapshotIterator() and ScanStatSnapshot() are also dead, but I’ve left it in incase I’m missing something. The tests are also failing for me because a number of psql variables aren’t set.

I do think we should separate out the counts for deleted but still visible tuples vs tuples where we couldn’t get a cleanup lock (in other words, recently_dead_tuples and missed_dead_tuples from LVRelState). I realize that’s a departure from how some of the existing reporting works, but IMO combining them together isn’t a pattern we should be repeating since they mean completely different things. Towards that end I did remove missed_dead_tuples from the reporting, and renamed ExtVacReport.dead_tuples to recently_dead_tuples, but I stopped short of creating a separate entry for missed_dead_tuples. Note that while recently_dead_tuples is really a global thing (so only needs to be reported at a global (or at most per-database) level, but missed_dead_tuples should really be at a per-table level.

Updated 0001-v13 attached, as well as the diff between v12 and v13.

Attachment: vacuum_stats_0001_v12_v13.patch
Description: Binary data

Attachment: vacuum_stats_0001_v13.patch
Description: Binary data


Reply via email to