Hi, On Wed, Jul 10, 2024 at 03:02:34PM +0900, Michael Paquier wrote: > On Sat, May 25, 2024 at 07:52:02AM +0000, Bertrand Drouvot wrote: > > But I think that it is in a state that can be used to discuss the approach > > it > > is implementing (so that we can agree or not on it) before moving > > forward. > > I have read through the patch to get an idea of how things are done,
Thanks! > and I am troubled by the approach taken (mentioned down by you), but > that's invasive compared to how pgstats wants to be transparent with > its stats kinds. > > + Oid objoid; /* object ID, either table or function > or tablespace. */ > + RelFileNumber relfile; /* relfilenumber for RelFileLocator. */ > } PgStat_HashKey; > > This adds a relfilenode component to the central hash key used for the > dshash of pgstats, which is something most stats types don't care > about. That's right but that's an existing behavior without the patch as: PGSTAT_KIND_DATABASE does not care care about the objoid PGSTAT_KIND_REPLSLOT does not care care about the dboid PGSTAT_KIND_SUBSCRIPTION does not care care about the dboid That's 3 kinds out of the 5 non fixed stats kind. Not saying it's good, just saying that's an existing behavior. > That looks like the incorrect thing to do to me, particularly > seeing a couple of lines down that a stats kind is assigned so the > HashKey uniqueness is ensured by the KindInfo: > + [PGSTAT_KIND_RELFILENODE] = { > + .name = "relfilenode", You mean, just rely on kind, dboid and relfile to ensure uniqueness? I'm not sure that would work as there is this comment in relfilelocator.h: " * Notice that relNumber is only unique within a database in a particular * tablespace. " So, I think it makes sense to link the hashkey to all the RelFileLocator fields, means: dboid (linked to RelFileLocator's dbOid) objoid (linked to RelFileLocator's spcOid) relfile (linked to RelFileLocator's relNumber) > FWIW, I have on my stack of patches something to switch the objoid to > 8 bytes, actually, which is something that would be required for > pg_stat_statements as query IDs are wider than that and affect all > databases, FWIW. Relfilenodes are 4 bytes, okay still Robert has > proposed a couple of years ago a patch set to bump that to 56 bits, > change reverted in a448e49bcbe4. Right, but it really looks like this extra field is needed to ensure uniqueness (see above). > What you would be looking instead is to use the relfilenode as an > objoid Not sure that works, as it looks like uniqueness won't be ensured (see above). > and keep track of the OID of the original relation in each > PgStat_StatRelFileNodeEntry so as it is possible to know where a past > relfilenode was used? That makes looking back at the past relation's > elfilenodes stats more complicated as it would be necessary to keep a > list of the past relfilenodes for a relation, as well. Perhaps with > some kind of cache that maintains a mapping between the relation and > its relfilenode history? Yeah, I also thought about keeping a list of "previous" relfilenodes stats for a relation but that would lead to: 1. Keep previous relfilnode stats 2. A more complicated way to look at relation stats (as you said) 3. Extra memory usage I think the only reason "previous" relfilenode stats are needed is to provide accurate stats for the relation. Outside of this need, I don't think we would want to retrieve "individual" previous relfilenode stats in the past. That's why the POC patch "simply" copies the stats to the relation during a rewrite (before getting rid of the "previous" relfilenode stats). What do you think? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com