On Thu, 2020-03-19 at 23:20 -0700, Andres Freund wrote:
> > I am not sure about b). In my mind, the objective is not to prevent
> > anti-wraparound vacuums, but to see that they have less work to do,
> > because previous autovacuum runs already have frozen anything older than
> > vacuum_freeze_min_age. So, assuming linear growth, the number of tuples
> > to freeze during any run would be at most one fourth of today's number
> > when we hit autovacuum_freeze_max_age.
>
> Based on two IM conversations I think it might be worth emphasizing how
> vacuum_cleanup_index_scale_factor works:
>
> For btree, even if there is not a single deleted tuple, we can *still*
> end up doing a full index scan at the end of vacuum. As the docs describe
> vacuum_cleanup_index_scale_factor:
>
>        <para>
>         Specifies the fraction of the total number of heap tuples counted in
>         the previous statistics collection that can be inserted without
>         incurring an index scan at the <command>VACUUM</command> cleanup
>         stage.
>         This setting currently applies to B-tree indexes only.
>        </para>
>
> I.e. with the default settings we will perform a whole-index scan
> (without visibility map or such) after every 10% growth of the
> table. Which means that, even if the visibility map prevents repeated
> table accesses, increasing the rate of vacuuming for insert-only tables
> can cause a lot more whole index scans. Which means that vacuuming an
> insert-only workload frequently *will* increase the total amount of IO,
> even if there is not a single dead tuple. Rather than just spreading the
> same amount of IO over more vacuums.
>
> And both gin and gist just always do a full index scan, regardless of
> vacuum_cleanup_index_scale_factor (either during a bulk delete, or
> during the cleanup). Thus more frequent vacuuming for insert-only
> tables can cause a *lot* of pain (even an approx quadratic increase of
> IO? O(increased_frequency * peak_index_size)?) if you have large
> indexes - which is very common for gin/gist.

Ok, ok. Thanks for the explanation.

In the light of that, I agree that we should increase the scale_factor.

Yours,
Laurenz Albe
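
PS: To make the IO argument above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not PostgreSQL's actual logic: the table is insert-only and grows linearly, index size is taken to be proportional to the tuple count, a btree cleanup scan is assumed to happen once the table has grown by more than vacuum_cleanup_index_scale_factor since the last scan, and gin/gist are assumed to scan the whole index on every vacuum. The function name and its parameters are hypothetical.

```python
def simulate_index_scan_io(final_tuples, vacuum_every,
                           cleanup_scale_factor=0.1,
                           always_scan=False,
                           start_tuples=1_000_000):
    """Toy model of whole-index scans on an insert-only table.

    Returns (vacuums, index_scans, index_tuples_read).
    always_scan=True models the gin/gist behaviour described above;
    always_scan=False models the simplified btree threshold.
    """
    tuples = start_tuples
    tuples_at_last_scan = start_tuples
    vacuums = 0
    index_scans = 0
    index_tuples_read = 0

    # One vacuum runs after every `vacuum_every` inserted tuples.
    while tuples + vacuum_every <= final_tuples:
        tuples += vacuum_every
        vacuums += 1

        # Assumed btree rule: scan only after more than
        # cleanup_scale_factor growth since the last index scan.
        grown_enough = tuples > (1 + cleanup_scale_factor) * tuples_at_last_scan

        if always_scan or grown_enough:
            index_scans += 1
            # Assumption: index size is proportional to table size.
            index_tuples_read += tuples
            tuples_at_last_scan = tuples

    return vacuums, index_scans, index_tuples_read


if __name__ == "__main__":
    for label, always in (("btree (threshold)", False), ("gin/gist (always)", True)):
        for step in (1_000_000, 500_000):
            result = simulate_index_scan_io(10_000_000, step, always_scan=always)
            print(label, "vacuum every", step, "tuples ->", result)
```

In this model, halving the interval between vacuums roughly doubles the index tuples read in the gin/gist case, while in the btree case the increase is bounded by the scale factor threshold. A larger scale factor for insert-triggered vacuums reduces both.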