On Thu, Dec 12, 2024 at 02:02:38PM +0000, Bertrand Drouvot wrote:
> On Thu, Dec 12, 2024 at 01:52:03PM +0900, Michael Paquier wrote:
> Yeah, but it's needed in pg_stat_get_backend_io() for the stats filtering (to
> display only those linked to this backend type), later in the function here:
>
> +	/*
> +	 * Some combinations of BackendType, IOObject, and IOContext are
> +	 * not valid for any type of IOOp.  In such cases, omit the entire
> +	 * row from the view.
> +	 */
> +	if (!pgstat_tracks_io_object(bktype, io_obj, io_context))
> +		continue;
>
> OTOH, now that we get rid of the AuxiliaryPidGetProc() call in
> pg_stat_reset_single_backend_io_counters() we can also remove the need to
> look for the backend type in this function.
Ah, you need that for the pgstat_tracks_io_object() filtering.  I'm
wondering if we should think about making that cheaper at some point.
As an O(N^2) when coupling a call of pg_stat_get_backend_io() with a
scan of pg_stat_activity, perhaps it does not matter much anyway..

Anyway, isn't it possible that this lookup loop finishes by finding
nothing, depending on concurrent updates of other beentries?  It
sounds to me that this warrants an early exit in the function.

> BTW, not related to this particular patch, but I realized that
> pgstat_flush_io() is called for the walwriter.  Indeed, it's coming
> from the pgstat_report_wal() call in WalWriterMain().  That cannot
> report any I/O stats activity (as the walwriter is not part of the
> I/O stats tracking, see pgstat_tracks_io_bktype()).
>
> So it looks like we could move pgstat_flush_io() outside of
> pgstat_report_wal() and add the pgstat_flush_io() calls only where
> they need to be made (and so, not in WalWriterMain()).

Perhaps, yes.  pgstat_tracks_io_bktype() has always discarded
walwriters since pgstat_io.c exists.

> But I'm not sure that's worth it until we have a need to add more
> pending stats per-backend, thoughts?

Okay with your argument here.

>> +{ oid => '8806', descr => 'statistics: per backend IO statistics',
>> +  proname => 'pg_stat_get_backend_io', prorows => '5', proretset => 't',
>>
>> Similarly, s/pg_stat_get_backend_io/pg_stat_get_backend_stats/?
>
> I think that's fine to keep pg_stat_get_backend_io().  If we add more
> per-backend stats in the future then we could add "dedicated" get
> functions too, and a generic one retrieving all of them.

Okay about this layer, discarding my remark.  You have a point about
that: other stats associated with a single backend may be OK if not
returned as a SRF, and they'll most likely return a different set of
attributes.
>> + descr => 'statistics: reset collected IO statistics for a single backend',
>> + proname => 'pg_stat_reset_single_backend_io_counters', provolatile => 'v',
>>
>> And here, pg_stat_reset_backend_stats?
>
> Same as above, we could imagine that in the future the backend would
> get multiple stats, and that one would want to reset only the I/O
> ones, for example.

Disagreed about this part.  It is slightly simpler to do a full reset
of the stats in a single entry.  If another subset of stats is added
to the backend-level entries, we could always introduce a new function
that has more control over what subset of a single backend entry is
reset.  And I'm pretty sure that we are going to need the function
that does the full reset anyway.

>> I would just add a note in the docs with a query showing how to use
>> it with pg_stat_activity.  An example with LATERAL, doing the same
>> work:
>>   select a.pid, s.* from pg_stat_activity as a,
>>     lateral pg_stat_get_backend_io(a.pid) as s
>>     where pid = pg_backend_pid();
>
> I'm not sure it's worth it.  I think it's clear that to get our own
> stats we need to provide our own backend pid.  For example,
> pg_stat_get_activity() does not provide such an example using
> pg_stat_activity or using something like
> pg_stat_get_activity(pg_backend_pid()).

Okay.

As far as I can see, the patch relies entirely on write_to_file to
prevent any entries from being flushed out.  It means that we leave in
the dshash entries that may sit idle for as long as the server is up,
once a pgproc slot is used at least once.  This scales depending on
max_connections.  It also means that we skip the sanity check about
dropped entries at shutdown, which may be a good thing to do because
we don't need to loop through them when writing the stats file.  Hmm.
Could it be better to be more aggressive with the handling of these
stats, marking them as dropped when their backend exits and cleaning
up the dshash, without relying on the write flag to make sure that all
the entries are discarded at shutdown?  The point is that we do
shutdown in a controlled manner, with all backends exiting before the
checkpointer writes the stats file after the shutdown checkpoint is
completed.

The patch handles things so that entries are reset when a procnum is
reused, leaving past stats around until that happens.  We should
perhaps aim for more consistency with the way beentry is refreshed,
and be more proactive with the backend entry drop or reset at backend
shutdown (pgstat_beshutdown_hook?), so that what is in the dshash
reflects exactly what's in shared memory for each PGPROC and beentry.

Not sure that the "_per_" added in the various references of the patch
is good to keep, as in pgstat_tracks_per_backend_bktype.  These could
be removed, I guess, doing also a PGSTAT_KIND_PER_BACKEND =>
PGSTAT_KIND_BACKEND?
--
Michael