On Fri, Jun 11, 2021 at 02:46:20PM -0700, Peter Geoghegan wrote: > On Thu, Jun 3, 2021 at 11:15 PM Michael Paquier <mich...@paquier.xyz> wrote: >> I have read through the patch, and I am surprised to see that this >> only makes possible to control the optimization at relation level. >> The origin of the complaints is that this index cleanup optimization >> has been introduced as a new rule that gets enforced at *system* >> level, so I think that we should have an equivalent with a GUC to >> control the behavior for the whole system. > > *Why* does it have to work at the system level? I don't understand > what you mean about the system level.
I mean that you lack a GUC that allows to enforce to *not* use this optimization for all relations, for all processes. > As Masahiko pointed out, adding a GUC isn't what we've done in other > similar cases -- that's how DISABLE_PAGE_SKIPPING works, which was a > defensive option that seems similar enough to what we want to add now. > To give another example, the TRUNCATE VACUUM option (or the related > reloption) can be used to disable relation truncation, a behavior that > sometimes causes *big* issues in production. The truncate behavior is > determined dynamically in most situations -- which is another > similarity to the optimization we've added here. > Why is this fundamentally different to those two things? Because the situation looks completely different to me here. TRUNCATE is thought as a option to be able to avoid an exclusive lock when truncating the relation file size at the end of VACUUM. More importantly the default of TRUNCATE is *false*, meaning that we are never going to skip the truncation unless one specifies it at the relation level. Here, what we have is a decision that is enforced to happen by default, all the time, with the user not knowing about it. If there is a bug of an issue with it, users, based on your proposal, would be forced to change it for each *relation*. If they miss some of those relations, they may still run into problems without knowing about it. The change of default behavior and having no way to control it in a simple way look incompatible to me. >> With what you are >> presenting here, one could only disable the optimization for each >> relation, one-by-one. If this optimization proves to be a problem, >> it's just going to be harder to users to go through all the relations >> and re-tune autovacuum. Am I missing something? > > Why would you expect autovacuum to run even when the optimization is > unavailable (e.g. with Postgres 13)? After all, the specifics of when > the bypass optimization kicks in make it very unlikely that ANALYZE > will ever be able to notice enough dead tuples to trigger an > autovacuum (barring antiwraparound and insert-driven autovacuums). > > There will probably be very few LP_DEAD items remaining. Occasionally > there will be somewhat more LP_DEAD items, that happen to be > concentrated in less than 2% of the table's blocks -- but block-based > sampling by ANALYZE is likely to miss most of them and underestimate > the total number. The sampling approach taken by acquire_sample_rows() > ensures this with larger tables. With small tables the chances of the > optimization kicking in are very low, unless perhaps fillfactor has > been tuned very aggressively. > > There has never been a guarantee that autovacuum will be triggered > (and do index vacuuming) in cases that have very few LP_DEAD items, no > matter how the system has been tuned. The main reason why making the > optimization behavior controllable is for the VACUUM command. > Principally for hackers. I can imagine myself using the VACUUM option > to disable the optimization when I was interested in testing VACUUM or > space utilization in some specific, narrow way. > > Of course it's true that there is still some uncertainty about the > optimization harming production workloads -- that is to be expected > with an enhancement like this one. But there is still no actual > example or test case that shows the optimization doing the wrong > thing, or anything like it. Anything is possible, but I am not > expecting there to be even one user complaint about the feature. > Naturally I don't want to add something as heavyweight as a GUC, given > all that. Perhaps. What I am really scared about is that you are assuming that enforcing this decision will be *always* fine. What I am trying to say here is that it *may not* be fine for everybody, and that there should be an easy way to turn it off if that proves to be a problem. I don't quite see how that's an implementation problem, we have already many reloptions that are controlled with GUCs if the reloptions have no default. I think that a more careful choice implementation would have been to turn this optimization off by default, while having an option to allow one to turn it on at will. -- Michael
signature.asc
Description: PGP signature