On Fri, Oct 30, 2020 at 6:46 PM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> Yeah. The behavior is technically correct, but it's not very useful for
> practical purposes. And most people don't even realize it behaves like
> this :-( It's possible to compensate for this effect and estimate the
> actually "interesting" hit rate, but if we could have it directly that
> would be great.

It's important that the information we provide in system views (and
other instrumentation) reflect reality, even when the underlying
mechanisms are not well understood by most users. DBAs often observe
correlations and arrive at useful conclusions without truly
understanding what's happening.

Individual hackers have occasionally expressed skepticism of exposing
the internals of the system through instrumentation; they object on
the grounds that users are unlikely to understand what they see
anyway. It seems to me that this completely misses the point. You
don't necessarily have to truly understand what's going on to have
mechanical sympathy for the system. You don't need to be a physicist
to do folk physics.

To my mind the best example of this is wait events, which first
appeared in proprietary database systems. Wait events expose
information about mechanisms that couldn't possibly be fully
understood by the end consumer. Because technically the details were
trade secrets. That didn't stop them from being very useful in
practice.

> It seems to me this should not be a particularly difficult patch in
> principle, so suitable for new contributors. The main challenge would be
> passing information about what page we're dealing with (internal/leaf)
> to the place actually calling pgstat_count_buffer_(read|hit). That
> happens in ReadBufferExtended, which just has no idea what page it's
> dealing with. Not sure how to do that cleanly ...

It would be a bit messy to pass down a flag like that, but it could be
done. I think the idea of generalized definitions of internal pages
and leaf pages ("metadata pages and record pages") could work well,
but would require a little thought in some cases. I'm thinking of GIN.
I doubt it would really matter what the final determination is about
(say) which particular generalized page bucket GIN pending list pages
get placed in. It will be a little arbitrary in a few corner cases,
but it hardly matters at all.

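To make the flag idea concrete, here is a rough standalone sketch of
per-class counting. Everything in it is hypothetical: PageClass,
BufferStats, count_buffer_access() and hit_rate() are names invented
for illustration, and none of this is existing PostgreSQL code or a
proposed patch. It only shows the caller handing a generalized page
class to the place that bumps the counters, so that a separate hit
rate falls out for each class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical generalized page classes (not PostgreSQL identifiers). */
typedef enum PageClass
{
    PAGE_CLASS_METADATA,    /* e.g. B-tree internal pages */
    PAGE_CLASS_RECORD,      /* e.g. B-tree leaf pages */
    PAGE_CLASS_COUNT
} PageClass;

/* Hypothetical per-relation counters, kept separately per page class. */
typedef struct BufferStats
{
    uint64_t    requests[PAGE_CLASS_COUNT]; /* every buffer request */
    uint64_t    hits[PAGE_CLASS_COUNT];     /* requests found in cache */
} BufferStats;

/*
 * Stand-in for the counting call site. The caller knows what kind of
 * page it asked for and passes that down as 'cls'.
 */
static void
count_buffer_access(BufferStats *stats, PageClass cls, bool found)
{
    assert((unsigned) cls < PAGE_CLASS_COUNT);

    stats->requests[cls]++;
    if (found)
        stats->hits[cls]++;
}

/* Hit rate for one page class; 0.0 when that class was never requested. */
static double
hit_rate(const BufferStats *stats, PageClass cls)
{
    assert((unsigned) cls < PAGE_CLASS_COUNT);

    if (stats->requests[cls] == 0)
        return 0.0;
    return (double) stats->hits[cls] / (double) stats->requests[cls];
}
```

With, say, three internal-page requests that all hit and four
leaf-page requests of which one hits, the combined figure is 4/7,
while the per-class figures are 1.0 and 0.25. The second number is
the "interesting" one.
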
Right now we have something that's technically correct but also
practically useless.

--
Peter Geoghegan