Re: [PATCH v1] Add pg_stat_multixact view for multixact membership usage monitoring

Naga Appani Tue, 10 Jun 2025 08:48:01 -0700

On Tue, Jun 10, 2025 at 9:40 AM Andrew Johnson <[email protected]> wrote:
>
> Hello hackers,
>
> I'd like to propose adding a new view named "pg_stat_multixact" to
> expose multixact member usage. This addresses a major monitoring gap
> that ultimately led to a production outage at Metronome [1].
>
> Problem
> Multixact membership exhaustion is an edge case that can cause write
> lockouts, but there's no visibility into membership space usage.
> Without any direct telemetry from the database, we're essentially
> flying blind. It is possible to estimate multixact membership usage
> through scanning the filesystem, but there are several drawbacks to
> that method that Naga Appani outlined in a previous thread [2].
>
> This complements Peter Geoghegan's recent thread about vacuum failsafe
> improvements [3], where Sami Imseih noted "exposing the members
> count... will be a good idea as well" [4].
>
> Solution
> - New view (pg_stat_multixact) with the columns "members" (bigint) and
> "update_timestamp" (timestamptz).
> - Updates member count and timestamp during multixact allocation and
> freeze threshold checks.
>
> I've attached a patch that:
> - Implements this view using pgstat patterns.
> - Includes isolation tests.
> - Includes documentation changes to monitoring.sgml.
>


Hi Andrew,

Thanks for referencing my earlier proposal and for working to improve
observability around MultiXact usage. It’s great to see more attention
on this area.

After quickly reviewing your patch, I wanted to share a few thoughts
on the overall approach.

I shared a patch [0] that adds a SQL-callable function exposing the
same counters via ReadMultiXactCounts(), without any additional
machinery. Since these values are global, not aggregatable per backend
or over time, and not meaningfully resettable, introducing new
statistics infrastructure may be more than what’s needed, unless
there's an additional use case I’m overlooking.

A lightweight function seems better aligned with the nature of these
metrics and the operational use cases they serve, particularly for
historical/ongoing diagnostics and periodic monitoring.
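
As a rough illustration of the periodic-monitoring use case (a sketch
only, not part of either patch): a monitoring job could turn a sampled
member count into a usage fraction. The 2**32 member-space size assumes
32-bit member offsets, and the function name and the warn_at threshold
are hypothetical choices made for this example.

```python
# Assumption: member space is addressed by a 32-bit offset, so it
# holds at most 2**32 members.
MEMBER_SPACE = 2**32


def member_usage(members, warn_at=0.5):
    """Return (fraction_used, should_warn) for a sampled member count.

    `members` is the count reported by the database; `warn_at` is a
    hypothetical alerting threshold, not a PostgreSQL setting.
    """
    if not 0 <= members <= MEMBER_SPACE:
        raise ValueError("member count out of range")
    fraction = members / MEMBER_SPACE
    return fraction, fraction >= warn_at
```

Sampling this on a schedule and storing the result would give the
historical trend without any server-side statistics state.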

[0] https://www.postgresql.org/message-id/CA%2BQeY%2BDTggHskCXOa39nag2sFds9BD-7k__zPbvL-_VVyJw7Sg%40mail.gmail.com

Best regards,
Naga Appani


> I have also:
> - Tested initdb works
> - Ran make check-world with --enable-tap-tests to ensure all tests pass
>
> I'm aiming to get this into the upcoming CommitFest. I would
> appreciate your thoughts on this proposal and attached patch.
>
> [1] https://metronome.com/blog/root-cause-analysis-postgresql-multixact-member-exhaustion-incidents-may-2025
> [2] https://www.postgresql.org/message-id/flat/caldsspi3gh08ntccn44uveuaygot74su6uei_06quta5rmk...@mail.gmail.com#bfd9ae766ef42f7599258183aa8ddb3b
> [3] https://www.postgresql.org/message-id/cah2-wzmlpwjk3gbaxy8dhy+a-juz_6ugwfe6dke8b5-dtdv...@mail.gmail.com
> [4] https://www.postgresql.org/message-id/CAA5RZ0u43s4YbR%3D0mJ0_k3VGWjchJHhYnCoaZVzeLd3ccZtwhQ%40mail.gmail.com
>
> --
> Respectfully,
>
> Andrew Johnson
> Software Engineer
> Metronome, Inc.
