Thanks for the nice list. At Wed, 19 Oct 2022 12:37:30 -0700, Peter Geoghegan <p...@bowt.ie> wrote in > On Wed, Oct 19, 2022 at 11:19 AM Andres Freund <and...@anarazel.de> wrote: > > We e.g. currently can't track the number of blocks written out in a > > relation, > > because we don't have a Relation at that point. Nor can't we really get hold > > of one, as the writeback can happen in a different database without access > > to > > pg_class. Which is also the reason why the per-relation IO stats aren't > > populated by the startup process, even though it'd obviously sometimes be > > helpful to know where the most IO time is spent on a standby. > > > > There's also quite a bit of contortions of the bufmgr interface related to > > this. > > This seems related to the difficulty with distinguishing between > internal pages and leaf pages (or some generalized AM-agnostic > definition) in views like pg_statio_*_indexes. > > Differentiating between leaf pages and internal pages would definitely > be a big improvement, but it's kind of an awkward thing to implement > [1] because you have to somehow invent the general concept of multiple > distinct kinds of buffers/pages within a relation. A lot of code would > need to be taught about that. > > This work would be more likely to actually happen if it was tied to > some bigger project that promised other benefits.
Stickier buffers for index pages seems to be related. I haven't see it even get started, though. But this might be able be an additional reason for starting it. > > 2) Split index and table statistics into different types of stats > > > This e.g. would allow us keep track of the number of index entries killed > > via > > the killtuples mechanism, which in turn would allow us to more intelligently > > decide whether we should vacuum indexes (often the most expensive part of > > vacuum). In a lot of workload killtuples takes care of most of the cleanup, > > but in others it doesn't do much. > > While I do agree that it would be nice to record information about the > number of deletion operations per index, that information will still > be tricky to interpret and act upon relative to other kinds of > information. As a general rule, we should prefer to focus on signals > that show things really aren't going well in some specific and > unambiguous way. Signals about things that are going well seem harder > to work with -- they don't generalize well. I think some statistics can be pure-internal purpose. We can maintain some statistics hidden from users, if we want. (However, I think people will request for the numbers to be revealed, finally..) > What I really mean here is this: I think that page split stuff is > going to be much more interesting than index deletion stuff. Index > deletion exists to prevent page splits. So it's natural to ask > questions about where that seems like it ought to have happened, but > didn't actually happen. This likely requires bucketing page splits > into different categories (since most individual page splits aren't > like that at all). Then it becomes much easier to (say) compare > indexes on the same table -- the user can follow a procedure that is > likely to generalize well to many different kinds of situations. > > It's not completely clear how the bucketization would work. We ought > to remember how many page splits were caused by INSERT statements > rather than non-HOT UPDATEs, though -- that much seems likely to be > very useful and actionable. The DBA can probably consume this > information in a low context way by looking at the proportions of one > kind of split to the other at the level of each index. > > One type of split is mostly just a "cost of doing business" for B-Tree > indexing. The other type really isn't. > > > 3) Maintain more historical statistics about vacuuming > > > However, none of that allows the user to identify which relations are > > causing > > autovacuum to not keep up. Even just keeping track of the the total time > > autovacuum has spent on certain relations would be a significant > > improvement, > > with more easily imaginable (total IO [time], autovacuum delay time, xid > > age). > > With VACUUM in particular the picture over time can be far easier to > work with than any given snapshot, from any single VACUUM operation. > Focusing on how things seem to be changing can make it a lot easier to > spot concerning trends, especially if you're a non-expert. Agreed. It seem like a kind of easy (low-hanging) one. I'll give it a try. There should be some other numbers that timeseries stats are useful. > I would also expect a similar focus on the picture over time to be > useful with the indexing stuff, for roughly the same underlying > reasons. > > [1] > https://postgr.es/m/CAA8Fd-pB=mr42YQuoaLPO_o2=xo9yjnjq23cyjdfwc8sxgm...@mail.gmail.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center