Re: Flush pgstats file during checkpoints

Michael Paquier Mon, 02 Sep 2024 18:09:10 -0700

On Mon, Sep 02, 2024 at 05:08:03PM +0300, Heikki Linnakangas wrote:
> Hmm, I'm a bit disappointed this doesn't address replication. It makes sense
> that scans are counted separately on a standby, but it would be nice if
> stats like last_vacuum were propagated from primary to standbys.  I guess
> that can be handled separately later.


Yes, it's not something that I'm planning to tackle for this thread.
Speaking of which, the design that I got in mind for this area was not
"that" complicated:
- Add a new RMGR for all the stats.
- Add a first callback for stats kinds for WAL inserts, giving to each
stats the possibility to pass down data inserted to the record, as we
want to replicate a portion of the data depending on the kind dealt
with.
- Add a second callback for recovery, called depending on the kind ID.

I have not looked into the details yet, but stats to replicate should
be grouped in a single record on transaction commit or depending on
the flush timing for fixed-numbered stats.  Or we should just add them
in commit records?

> Reviewing v7-0001-Flush-pgstats-file-during-checkpoints.patch:
> 
> There are various race conditions where a stats entry can be leaked in the
> pgstats file. I.e. relation is dropped, but its stats entry is retained in
> the stats file after crash. In the worst case, suck leaked entries can
> accumulate until the stats file is manually removed, which resets all stats
> again. Perhaps that's acceptable - it's still possible leak the actual
> relation file for a new relation on crash, after all, which is much worse
> (I'm glad Horiguchi-san is working on that [1]).

Yeah, that's not an easy issue.  We don't really have a protection
regarding that as well now.  Backends can also refer to stats entries
in their shutdown callback that have been dropped concurrently.  See
some details about that at https://commitfest.postgresql.org/49/5045/.

> Until 5891c7a8ed, there was a mechanism to garbage collect such orphaned
> entries (pgstat_vacuum()). How bad would it be to re-introduce that? Or can
> we make it more watertight so that there are no leaks?

Not sure about this part, TBH.  Doing that again in autovacuum does
not excite me much as it has a cost.

> I think it's failing to flush the stats file at the end of recovery
> checkpoint.

Missed that, oops.  I'll double-check this area.
--
Michael

signature.asc
Description: PGP signature

Re: Flush pgstats file during checkpoints

Reply via email to