David Gould <da...@sonic.net> writes:
> On Sun, 4 Mar 2018 07:49:46 -0800
> Jeff Janes <jeff.ja...@gmail.com> wrote:
>> I don't see how it could have caused the problem in the first place. In
>> your demonstration case, you had to turn off autovac in order to get it to
>> happen, and then when autovac is turned back on, it is all primed for an
>> autovac to launch, go through, touch almost all of the pages, and fix it
>> for you. How did your original table get into a state where this wouldn't
>> happen?
> One more way for this to happen, vacuum was including the dead tuples in the
> estimate in addition to the live tuples.

FWIW, I've been continuing to think about this and poke at your example,
and I am having the same difficulty as Jeff. While it's clear that if
you managed to get into a state with wildly inflated reltuples, ANALYZE
would fail to get out of it, it's not clear how you could get to such a
state without additional contributing factors. This ANALYZE behavior
only seems to result in an incremental increase in reltuples per run,
and so that shouldn't prevent autovacuum from happening and fixing the
estimate --- maybe not as quickly as it should happen, but it'd happen.

The reasons I'm harping on this are (a) if there are additional bugs
contributing to the problem, we need to identify them and fix them;
(b) we need to understand what the triggering conditions are in some
detail, so that we can decide whether this bug is bad enough to justify
back-patching a behavioral change. I remain concerned that the proposed
fix is too simplistic and will have some unforeseen side-effects, so I'd
really rather just put it in HEAD and let it go through a beta test
cycle before it gets loosed on the world.

Another issue I have after thinking more is that we need to consider
what should happen during a combined VACUUM+ANALYZE. In this situation,
with the proposed patch, we'd overwrite VACUUM's result with an estimate
derived from ANALYZE's sample ... even though VACUUM's result might've
come from a full-table scan and hence be exact. In the existing code a
small ANALYZE sample can't do a lot of damage to VACUUM's result, but
that would no longer be true with this. I'm inclined to feel that we
should trust VACUUM's result for reltuples more than ANALYZE's, on the
grounds that if there actually was any large change in reltuples, VACUUM
would have looked at most of the pages and hence have a more reliable
number. Hence I think we should skip the pg_class update for ANALYZE if
it's in a combined VACUUM+ANALYZE, at least unless ANALYZE looked at all
(most of?) the pages. That could be mechanized with something like

-	if (!inh)
+	if (!inh && !(options & VACOPT_VACUUM))

controlling do_analyze_rel's call to vac_update_relstats, maybe with a
check on scanned_pages vs total_pages. Perhaps the call to
pgstat_report_analyze needs to be filtered similarly (or maybe we still
want to report that an analyze happened, but somehow tell the stats
collector not to change its counts?)

Also, as a stylistic matter, I'd be inclined not to touch
vac_estimate_reltuples' behavior. The place where the rubber is meeting
the road is

	*totalrows = vac_estimate_reltuples(onerel, true,
										totalblocks,
										bs.m,
										liverows);
	if (bs.m > 0)
		*totaldeadrows = floor((deadrows / bs.m) * totalblocks + 0.5);
	else
		*totaldeadrows = 0.0;

and it seems to me it'd make more sense to abandon the use of
vac_estimate_reltuples entirely, and just calculate totalrows in a
fashion exactly parallel to totaldeadrows. (I think that's how the code
used to look here ...) In HEAD, we could then drop
vac_estimate_reltuples' is_analyze argument altogether, though that
would be unwise in the back branches (if we back-patch) since we have no
idea whether any third party code is calling this function.

			regards, tom lane
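PS: To put a little arithmetic behind the "incremental increase per
run" claim above: as I recall, the ANALYZE path of
vac_estimate_reltuples boils down to the following (paraphrased from
memory, so consult vacuum.c before quoting me):

	/*
	 * Moving-average update, as I remember it: the new density
	 * estimate gets blended with the old one in proportion to the
	 * fraction of the table's pages that ANALYZE actually sampled.
	 */
	old_density = old_rel_tuples / old_rel_pages;
	new_density = scanned_tuples / scanned_pages;
	multiplier = (double) scanned_pages / (double) total_pages;
	updated_density = old_density + (new_density - old_density) * multiplier;
	reltuples = floor(updated_density * total_pages + 0.5);

With the default statistics target, scanned_pages is capped at 30000 or
so, which means that on a million-page table a single ANALYZE can move
reltuples only about 3% of the way towards whatever density its sample
saw. To reach a wildly inflated value you'd need either a great many
consecutive runs or some other contributing factor.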
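Likewise, to be concrete about the combined-VACUUM check (purely a
sketch: scanned_pages and total_pages are stand-in names here, since
those counts aren't currently available at that point in do_analyze_rel
and would have to be plumbed through):

-	if (!inh)
+	if (!inh &&
+		(!(options & VACOPT_VACUUM) || scanned_pages >= total_pages))

with the ">=" perhaps relaxed to "covered most of the table", per the
above.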
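And the vac_estimate_reltuples-free version of the acquire_sample_rows
code quoted above would presumably come out about like this (untested;
it's just the existing totaldeadrows arithmetic applied to liverows):

	if (bs.m > 0)
	{
		/* extrapolate sampled live and dead counts alike */
		*totalrows = floor((liverows / bs.m) * totalblocks + 0.5);
		*totaldeadrows = floor((deadrows / bs.m) * totalblocks + 0.5);
	}
	else
	{
		*totalrows = 0.0;
		*totaldeadrows = 0.0;
	}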