On Wed, 11 Mar 2020 at 04:17, Laurenz Albe <laurenz.a...@cybertec.at> wrote:
>
> On Tue, 2020-03-10 at 18:14 +0900, Masahiko Sawada wrote:
> > Thanks for the review and your thoughts!
>
> > FYI actually vacuum could perform index cleanup phase (i.e.
> > PROGRESS_VACUUM_PHASE_INDEX_CLEANUP phase) on a table even if it's a
> > truly INSERT-only table, depending on
> > vacuum_cleanup_index_scale_factor. Anyway, I also agree with not
> > disabling index cleanup in the insert-only vacuum case, because it could
> > become not only a cause of index bloat but also a big performance
> > issue. For example, if autovacuum on a table always runs without index
> > cleanup, a gin index on that table will accumulate inserted tuples in
> > its pending list, and the list will be cleaned up by a backend process
> > while inserting a new tuple, not by an autovacuum process. We can
> > disable index vacuum per table with the index_cleanup storage
> > parameter, so it would be better to defer these settings to users.
>
> Thanks for the confirmation.
>
> > I have one question about this patch from an architectural perspective:
> > have you considered using autovacuum_vacuum_threshold and
> > autovacuum_vacuum_scale_factor also for this purpose? That is, we
> > compare the threshold computed from these values not only to the number
> > of dead tuples but also to the number of inserted tuples. If the number
> > of dead tuples exceeds the threshold, we trigger autovacuum as usual.
> > On the other hand, if the number of inserted tuples exceeds it, we
> > trigger autovacuum with vacuum_freeze_min_age = 0. I'm concerned about
> > how users would reason about the settings of the two newly added
> > parameters. We will have 4 parameters in total. Amit was also concerned
> > about that[1].
> >
> > I think this idea also works fine. In the insert-only table case, since
> > only the number of inserted tuples increases, only one threshold
> > (that is, the threshold computed from autovacuum_vacuum_threshold and
> > autovacuum_vacuum_scale_factor) is enough to trigger autovacuum. And
> > in the mostly-insert table case, we can already trigger autovacuum
> > in current PostgreSQL, since we have some dead tuples.
> > But if we want to trigger autovacuum more frequently based on the
> > number of newly inserted tuples, we can set that threshold lower while
> > considering only the number of inserted tuples.
>
> I am torn.
>
> On the one hand it would be wonderful not to have to add yet more GUCs
> to the already complicated autovacuum configuration. It already confuses
> too many users.
>
> On the other hand that will lead to unnecessary vacuums for small
> tables.
> Worse, the progression caused by the comparatively large scale
> factor may make it vacuum large tables too seldom.
>

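To make the single-threshold idea quoted above concrete, here is a minimal
sketch of the decision it describes. It is not code from the patch: the
names vacuum_threshold, decide and VacuumDecision are hypothetical, and only
the formula (base threshold plus scale factor times the table's tuple count)
and the two default values come from the PostgreSQL documentation.

```python
from dataclasses import dataclass
from typing import Optional

# Documented PostgreSQL defaults for the two existing settings.
AUTOVACUUM_VACUUM_THRESHOLD = 50
AUTOVACUUM_VACUUM_SCALE_FACTOR = 0.2


@dataclass
class VacuumDecision:
    """Hypothetical result type, used only for this illustration."""
    reason: str                    # "dead_tuples" or "inserted_tuples"
    freeze_min_age: Optional[int]  # None means "use the configured value"


def vacuum_threshold(reltuples: float,
                     base: int = AUTOVACUUM_VACUUM_THRESHOLD,
                     scale: float = AUTOVACUUM_VACUUM_SCALE_FACTOR) -> float:
    """Threshold = base threshold + scale factor * number of tuples."""
    return base + scale * reltuples


def decide(reltuples: float, dead_tuples: int,
           inserted_tuples: int) -> Optional[VacuumDecision]:
    """Compare one shared threshold against both counters."""
    limit = vacuum_threshold(reltuples)
    if dead_tuples > limit:
        # Ordinary autovacuum, triggered as it is today.
        return VacuumDecision("dead_tuples", None)
    if inserted_tuples > limit:
        # Proposed insert-driven vacuum, run with vacuum_freeze_min_age = 0.
        return VacuumDecision("inserted_tuples", 0)
    return None


if __name__ == "__main__":
    # Show how the shared threshold scales with table size.
    for rows in (100, 1_000_000, 1_000_000_000):
        print(rows, vacuum_threshold(rows))
```

With the default settings, the loop at the end shows how the shared threshold
grows with table size, which may help when discussing how often small and
large tables would be vacuumed under this scheme.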
I might be missing your point, but could you elaborate on what kind of
case you think this would lead to unnecessary vacuums?

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services