Hello hackers,

I'd like to propose adding a new view named "pg_stat_multixact" to expose multixact member usage. This addresses a major monitoring gap that ultimately led to a production outage at Metronome [1].

Problem

Multixact membership exhaustion is an edge case that can cause write lockouts, but there's no visibility into membership space usage. Without any direct telemetry from the database, we're essentially flying blind. It is possible to estimate multixact membership usage through scanning the filesystem, but there are several drawbacks to that method that Naga Appani outlined in a previous thread [2].

This complements Peter Geoghegan's recent thread about vacuum failsafe improvements [3], where Sami Imseih noted "exposing the members count... will be a good idea as well" [4].

Solution

- New view (pg_stat_multixact) with the columns "members" (bigint) and "update_timestamp" (timestamptz).
- Updates member count and timestamp during multixact allocation and freeze threshold checks.

I've attached a patch that:

- Implements this view using pgstat patterns.
- Includes isolation tests.
- Includes documentation changes to monitoring.sgml.

I have also:

- Tested initdb works
- Ran make check-world with --enable-tap-tests to ensure all tests pass

I'm aiming to get this into the upcoming CommitFest. I would appreciate your thoughts on this proposal and attached patch.

[1] https://metronome.com/blog/root-cause-analysis-postgresql-multixact-member-exhaustion-incidents-may-2025
[2] https://www.postgresql.org/message-id/flat/caldsspi3gh08ntccn44uveuaygot74su6uei_06quta5rmk...@mail.gmail.com#bfd9ae766ef42f7599258183aa8ddb3b
[3] https://www.postgresql.org/message-id/cah2-wzmlpwjk3gbaxy8dhy+a-juz_6ugwfe6dke8b5-dtdv...@mail.gmail.com
[4] https://www.postgresql.org/message-id/CAA5RZ0u43s4YbR%3D0mJ0_k3VGWjchJHhYnCoaZVzeLd3ccZtwhQ%40mail.gmail.com

--
Respectfully,

Andrew Johnson
Software Engineer
Metronome, Inc.
v1-0001-Adding-pg_stat_muiltixact-view-to-allow-membershi.patch
Description: Binary data