Hi hackers, I would like to propose a patch that enhances the pg_stat_progress_vacuum view by adding a mode column. The patch is attached.
Although it is possible to identify an anti-wraparound VACUUM through the process title (to prevent wraparound) or specific log entries, it would be significantly more convenient for monitoring purposes to have this status clearly indicated in the pg_stat_progress_vacuum view. This would enable DBAs to immediately understand the urgency of the vacuum process without needing to check separate logs or system processes. This patch introduces a mode column to provide this visibility. The possible values are: - normal: A standard, user-initiated VACUUM or a regular autovacuum run. - anti-wraparound: An autovacuum run launched specifically to prevent transaction ID wraparound. - failsafe: A vacuum that has entered failsafe mode to prevent imminent transaction ID wraparound. This will allow administrators to better understand the context and urgency of vacuum operations, which is crucial for monitoring and troubleshooting. Design Considerations: When defining the scope of the anti-wraparound mode, I considered including manual commands like VACUUM (FREEZE) or VACUUM (DISABLE_PAGE_SKIPPING). However, I decided against this to keep the meaning of the mode clear and simple. These options can be used for various purposes, and overloading the anti-wraparound mode with too many meanings could become confusing. Therefore, the current implementation limits this mode to autovacuum runs that are explicitly launched for wraparound prevention. Regarding Testing: I was able to manually verify the failsafe mode's behavior by using the existing test script at src/test/modules/xid_wraparound/t/001_emergency_vacuum.pl. This script successfully triggered the failsafe condition and the view reported the correct mode. However, I found this test to be somewhat flaky in my environment and decided not to add it to the patch to avoid introducing a potentially unstable test into the tree. Thought? -- Best regards, Shinya Kato NTT OSS Center
From e06ac211652a9706d5e6511083f6777260e041b5 Mon Sep 17 00:00:00 2001 From: Shinya Kato <shinya11.kato@gmail.com> Date: Thu, 14 Aug 2025 15:28:39 +0900 Subject: [PATCH v1] Add mode to pg_stat_progress_vacuum This commit enhances the pg_stat_progress_vacuum view by adding a mode column, which provides visibility into the nature of the running vacuum process. The mode can be one of the following: - normal: A standard, user-initiated VACUUM or a regular autovacuum run. - anti-wraparound: An autovacuum run for anti-wraparound purposes. - failsafe: A failsafe vacuum activated to prevent transaction ID wraparound. This allows administrators to better understand the context and urgency of vacuum operations. --- doc/src/sgml/monitoring.sgml | 32 ++++++++++++++++++++++++++++ src/backend/access/heap/vacuumlazy.c | 13 +++++++---- src/backend/catalog/system_views.sql | 6 +++++- src/include/commands/progress.h | 6 ++++++ src/test/regress/expected/rules.out | 8 ++++++- 5 files changed, 59 insertions(+), 6 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 3f4a27a736e..3ca994f73bb 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -6607,6 +6607,38 @@ FROM pg_stat_get_backend_idset() AS backendid; stale. </para></entry> </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>mode</structfield> <type>text</type> + </para> + <para> + The mode of the current vacuum operation. Possible values are: + <itemizedlist> + <listitem> + <para> + <literal>normal</literal>: A standard vacuum operation, including + user-initiated <command>VACUUM</command> command and autovacuum runs + not triggered for anti-wraparound purposes. + </para> + </listitem> + <listitem> + <para> + <literal>anti-wraparound</literal>: An autovacuum run specifically to + prevent transaction ID or multixact ID wraparound. + </para> + </listitem> + <listitem> + <para> + <literal>failsafe</literal>: A vacuum operation that entered failsafe + mode when the system was at risk of transaction ID or multixact ID + wraparound (see <xref linkend="guc-vacuum-failsafe-age"/> and + <xref linkend="guc-vacuum-multixact-failsafe-age"/>). + </para> + </listitem> + </itemizedlist> + </para></entry> + </row> </tbody> </tgroup> </table> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 14036c27e87..f77dbed7899 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -652,6 +652,10 @@ heap_vacuum_rel(Relation rel, const VacuumParams params, pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM, RelationGetRelid(rel)); + pgstat_progress_update_param(PROGRESS_VACUUM_MODE, + params.is_wraparound + ? PROGRESS_VACUUM_MODE_ANTI_WRAPAROUND + : PROGRESS_VACUUM_MODE_NORMAL); /* * Setup error traceback support for ereport() first. The idea is to set @@ -2957,9 +2961,10 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel) { const int progress_index[] = { PROGRESS_VACUUM_INDEXES_TOTAL, - PROGRESS_VACUUM_INDEXES_PROCESSED + PROGRESS_VACUUM_INDEXES_PROCESSED, + PROGRESS_VACUUM_MODE }; - int64 progress_val[2] = {0, 0}; + int64 progress_val[3] = {0, 0, PROGRESS_VACUUM_MODE_FAILSAFE}; VacuumFailsafeActive = true; @@ -2975,8 +2980,8 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel) vacrel->do_index_cleanup = false; vacrel->do_rel_truncate = false; - /* Reset the progress counters */ - pgstat_progress_update_multi_param(2, progress_index, progress_val); + /* Reset the progress counters and the mode */ + pgstat_progress_update_multi_param(3, progress_index, progress_val); ereport(WARNING, (errmsg("bypassing nonessential maintenance of table \"%s.%s.%s\" as a failsafe after %d index scans", diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 1b3c5a55882..685086edd8f 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1248,7 +1248,11 @@ CREATE VIEW pg_stat_progress_vacuum AS S.param6 AS max_dead_tuple_bytes, S.param7 AS dead_tuple_bytes, S.param8 AS num_dead_item_ids, S.param9 AS indexes_total, S.param10 AS indexes_processed, - S.param11 / 1000000::double precision AS delay_time + S.param11 / 1000000::double precision AS delay_time, + CASE S.param12 WHEN 1 THEN 'normal' + WHEN 2 THEN 'anti-wraparound' + WHEN 3 THEN 'failsafe' + END AS mode FROM pg_stat_get_progress_info('VACUUM') AS S LEFT JOIN pg_database D ON S.datid = D.oid; diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h index 1cde4bd9bcf..73f267a002f 100644 --- a/src/include/commands/progress.h +++ b/src/include/commands/progress.h @@ -29,6 +29,7 @@ #define PROGRESS_VACUUM_INDEXES_TOTAL 8 #define PROGRESS_VACUUM_INDEXES_PROCESSED 9 #define PROGRESS_VACUUM_DELAY_TIME 10 +#define PROGRESS_VACUUM_MODE 11 /* Phases of vacuum (as advertised via PROGRESS_VACUUM_PHASE) */ #define PROGRESS_VACUUM_PHASE_SCAN_HEAP 1 @@ -38,6 +39,11 @@ #define PROGRESS_VACUUM_PHASE_TRUNCATE 5 #define PROGRESS_VACUUM_PHASE_FINAL_CLEANUP 6 +/* Modes of vacuum (as advertised via PROGRESS_VACUUM_MODE) */ +#define PROGRESS_VACUUM_MODE_NORMAL 1 +#define PROGRESS_VACUUM_MODE_ANTI_WRAPAROUND 2 +#define PROGRESS_VACUUM_MODE_FAILSAFE 3 + /* Progress parameters for analyze */ #define PROGRESS_ANALYZE_PHASE 0 #define PROGRESS_ANALYZE_BLOCKS_TOTAL 1 diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 35e8aad7701..60b5d87a130 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -2094,7 +2094,13 @@ pg_stat_progress_vacuum| SELECT s.pid, s.param8 AS num_dead_item_ids, s.param9 AS indexes_total, s.param10 AS indexes_processed, - ((s.param11)::double precision / (1000000)::double precision) AS delay_time + ((s.param11)::double precision / (1000000)::double precision) AS delay_time, + CASE s.param12 + WHEN 1 THEN 'normal'::text + WHEN 2 THEN 'anti-wraparound'::text + WHEN 3 THEN 'failsafe'::text + ELSE NULL::text + END AS mode FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20) LEFT JOIN pg_database d ON ((s.datid = d.oid))); pg_stat_recovery_prefetch| SELECT stats_reset, -- 2.47.3