On Mon, Jun 18, 2018 at 1:56 PM, Alexander Korotkov <a.korot...@postgrespro.ru> wrote: > Hi! > > On Sat, Jun 16, 2018 at 11:23 PM Darafei "Komяpa" Praliaskouski > <m...@komzpa.net> wrote: >> It is cool to see this in Postgres 11. However: >> >>> >>> 4) vacuum_cleanup_index_scale_factor can be set either by GUC or reloption. >>> Default value is 0.1. So, by default cleanup scan is triggered after >>> increasing of >>> table size by 10%. >> >> >> vacuum_cleanup_index_scale_factor can be set to the maximum of 100. >> I imagine that on a large append-only table with IOPS storage system budget >> it may happen that I would want to never perform a full scan on index. >> Roughly, with parameter set to 100, if we vacuum the table first time with 1 >> tuple and 130 byte wide rows, we'll have a full scan at 130 bytes, 12 >> kbytes, 1.2MB, 123MB, 12 GB, 1.2TB. >> >> If we happen to perform the first vacuum when there are 4 tuples in the >> table, it becomes 52kb, 5MB, 495MB, 48GB - and both 12GB and 48GB will >> exhaust any storage spike IOPS budget, slowing everything down rather >> suddenly. >> >> Can the upper limit for this GUC be lifted, or have a value for "never"? > > I have some further exploration of how statistics obtained by B-tree > index vacuum cleanup is used. > > 1) Collected pages and tuples numbers are not directly used, but used > for an estimation of tuples density per page, while current number of > page is estimated using smgr (see btcostestimate()). So, unless > density of tuples significantly changes, having index statistics > stalled doesn't affect query plans. > 2) Our optimization for skipping B-tree index vacuum cleanup works > only in case when use manually vacuums table in order to update > visibility map. Autovacuum is not triggered for append-only tables. > So, if user doesn't have special care about append-only tables, > they're not vacuumed until "autovacuum to prevent wraparound". Thus, > index statistics could be very stalled. And I don't think we have > many occurrences of issues with stalled index statistics. > 3) We have very safe defaul of vacuum_cleanup_index_scale_factor equal > to 1.1. But as Darafei claimed, 100 maximum value is probably too low > for advanced users, who really need benefits of this optimization.
Thank you for explanation. > So, I'm proposing to raise maximum valus of > vacuum_cleanup_index_scale_factor to DBL_MAX. Any objections? > I agree to expand the maximum value. But if users don't want index cleanup it would be helpful if we have an option (e.g. setting to -1) to disable index cleanup while documenting a risk of disabling index cleanup. It seems to me that setting very high values means the same purpose. Also, your patch lacks documentation update. BTW, I realized that postgresql.conf.sample doesn't have vacuum_cleanup_index_scale_factor option. Attached patch fixes it. Regards, -- Masahiko Sawada
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index fa3c8a7905..bf1fbca532 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3253,7 +3253,7 @@ static struct config_real ConfigureNamesReal[] = NULL }, &vacuum_cleanup_index_scale_factor, - 0.1, 0.0, 100.0, + 0.1, 0.0, DBL_MAX, NULL, NULL, NULL },