On Wed, Jan 18, 2023 at 3:15 PM Peter Geoghegan <p...@bowt.ie> wrote:
> Suppose that we notice that its new
> estimate for live_tuples approximately matches what the stats
> subsystem already thought about live_tuples, while dead_tuples is far
> far lower. We shouldn't be so credulous as to believe the new
> dead_tuples estimate at that point.
>
> Another angle of attack is the PD_ALL_VISIBLE page-level bit, which
> acquire_sample_rows() could pay attention to -- especially in larger
> tables, where the difference between all pages and just the
> all-visible subset of pages is most likely to matter. The more sampled
> pages that had PD_ALL_VISIBLE set, the less credible the new
> dead_tuples estimate will be (relative to existing information), and
> so pgstat_report_analyze() should prefer the new estimate over the old
> one in proportion to that.
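For concreteness, the weighting described above might look something
like this toy sketch (the function name and the linear blend are my
invention for illustration, not actual pgstat code):

```python
# Toy sketch only -- blend_dead_tuples and the linear blending rule are
# illustrative inventions, not actual PostgreSQL/pgstat code.
def blend_dead_tuples(old_estimate: float, new_estimate: float,
                      all_visible_frac: float) -> float:
    """Blend ANALYZE's new dead_tuples estimate with the stored one.

    all_visible_frac is the fraction of sampled pages that had
    PD_ALL_VISIBLE set.  The more all-visible pages the sample saw, the
    less credible a new, much-lower dead_tuples figure is, so the old
    value keeps proportionally more weight.
    """
    credibility = 1.0 - all_visible_frac
    return old_estimate * (1.0 - credibility) + new_estimate * credibility

# With 90% of sampled pages all-visible, a new estimate of 100 only
# moves a stored estimate of 1000 down to 910:
print(blend_dead_tuples(1000.0, 100.0, 0.9))
```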
I don't know enough about the specifics of how this works to have an
intelligent opinion about how likely these particular ideas are to
work out. However, I think it's risky to look at estimates and try to
infer whether they are reliable. It's too easy to be wrong. What we
really want to do is anchor our estimates to some data source that we
know we can trust absolutely. If you trust possibly-bad data less, it
screws up your estimates more slowly, but it still screws them up.

If Andres is correct that what really matters is the number of pages
we're going to have to dirty, we could abandon counting dead tuples
altogether and just count not-all-visible pages in the visibility map.
That would be cheap enough to recompute periodically. However, it
would also be a big behavior change from the way we do things now, so
I'm not sure it's a good idea. Still, a quantity that we can be
certain we're measuring accurately is better than one we can't measure
accurately, even if it's a somewhat worse proxy for the thing we
really care about. There's a ton of value in not being completely and
totally wrong.

> FWIW, part of my mental model with VACUUM is that the rules kind of
> change in the case of a big table. We're far more vulnerable to issues
> such as (say) waiting for cleanup locks because the overall cadence
> used by autovacuum is so infrequently relative to everything else.
> There are more opportunities for things to go wrong, worse
> consequences when they do go wrong, and greater potential for the
> problems to compound.

Yes. A lot of parts of PostgreSQL, including this one, were developed
a long time ago, when PostgreSQL databases were a lot smaller than
they often are today.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
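P.S. On "cheap enough to recompute periodically": a back-of-envelope
calculation (my numbers, assuming the default 8 kB block size and the
two visibility-map bits kept per heap page) suggests the entire VM for
even a 1 TB table is only about 32 MB, so rescanning it is tiny
compared to vacuuming the table itself:

```python
# Back-of-envelope arithmetic (illustrative numbers, not from the
# thread): how much visibility map would we have to scan?  The VM
# stores 2 bits per heap page (all-visible and all-frozen), and heap
# pages default to 8 kB.
BLCKSZ = 8192                       # default PostgreSQL block size
TABLE_BYTES = 1 << 40               # a 1 TB table
heap_pages = TABLE_BYTES // BLCKSZ  # number of 8 kB heap pages
vm_bytes = heap_pages * 2 // 8      # 2 VM bits per heap page
print(heap_pages)                   # 134217728
print(vm_bytes)                     # 33554432 bytes, i.e. ~32 MB
```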