On 25/04/2024 at 21:21, Nathan Bossart wrote:
On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote:
What does surprise me is that Frédéric suggests a default value of
500,000. If half a million tuples (proposed default) is 20% of your
table (default value of autovacuum_vacuum_scale_factor) then your
table has 2.5 million tuples. Unless those tuples are very wide, that
table isn't even 1GB in size. I'm not aware that there's any problem
at all with the current formula on a table of that size, or even ten
times that size. I think you need to have tables that are hundreds of
gigabytes in size at least before this starts to become a serious
problem. Looking at this from another angle, in existing releases, the
maximum usable amount of autovacuum_work_mem is 1GB, which means we
can store one-sixth of a billion dead TIDs, or roughly 166 million.
And that limit has been a source of occasional complaints for years.
So we have those complaints on the one hand, suggesting that 166
million is not enough, and then we have this proposal, saying that
more than half a million is too much. That's really strange; my
initial hunch is that the value should be 100-500x higher than what
Frédéric proposed.

Agreed, the default should probably be on the order of 100-200M minimum.


I'm not sure... 500,000 comes from the table given in a previous message, and it may not be large enough. But vacuum also updates the visibility map, and a few hundred thousand heap fetches can already hurt the performance of an index-only scan, even if most of the blocks are read from cache.
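To make the arithmetic above concrete, here is a small standalone sketch of the current trigger formula (simplified from relation_needs_vacanalyze() in autovacuum.c, with the default GUC values plugged in; not the exact source):

    #include <stdio.h>

    int
    main(void)
    {
        double vac_base_thresh = 50;      /* autovacuum_vacuum_threshold */
        double vac_scale_factor = 0.2;    /* autovacuum_vacuum_scale_factor */
        double reltuples = 2500000;       /* Robert's 2.5M-tuple example */

        /* current formula: the threshold grows linearly with table size */
        double vacthresh = vac_base_thresh + vac_scale_factor * reltuples;

        printf("vacuum after %.0f dead tuples\n", vacthresh);   /* 500050 */
        return 0;
    }

On a 100M-tuple table the same formula requires 20 million dead tuples before autovacuum wakes up, which is the end of the scale a limit is aimed at.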

The original proposal also seems to introduce one parameter that would
affect all three of autovacuum_vacuum_threshold,
autovacuum_vacuum_insert_threshold, and autovacuum_analyze_threshold.  Is
that okay?  Or do we need to introduce a "limit" GUC for each?  I guess the
question is whether we anticipate any need to have different values for
these limits, which might be unlikely.


I agree with you, it seems unlikely. This also answers Melanie's question about the name of the GUC: I deliberately left out the second "vacuum" because I thought we only needed one parameter for these three thresholds.
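To illustrate why a single parameter seemed sufficient, here is a toy sketch (the base thresholds and scale factors are the stock defaults; the shared cap, its name, and its value are assumptions of mine, not committed code):

    #include <stdio.h>

    /* Illustrative only: "max_threshold" stands for the single proposed
     * limit GUC; 100M echoes the order of magnitude suggested above. */
    static double
    capped(double base, double scale, double reltuples, double max_threshold)
    {
        double thresh = base + scale * reltuples;

        return thresh < max_threshold ? thresh : max_threshold;
    }

    int
    main(void)
    {
        double reltuples = 2e9;          /* a 2-billion-row table */
        double max_threshold = 1e8;      /* hypothetical shared limit: 100M */

        /* one limit GUC capping all three thresholds */
        printf("vacuum:  %.0f\n", capped(50, 0.2, reltuples, max_threshold));
        printf("insert:  %.0f\n", capped(1000, 0.2, reltuples, max_threshold));
        printf("analyze: %.0f\n", capped(50, 0.1, reltuples, max_threshold));
        return 0;
    }

All three thresholds go through the same cap, so separate limit GUCs would only be warranted if there were a case where they should diverge.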

Now I have just read Robert's new message, and I understand his point. But is there a real problem with triggering an analyze after every 500,000 (or more) modifications to the table anyway?

