On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> Attached the updated patches. I scaled back the scope of this patch.
> The patch now includes only feature (a), that is, it executes both index
> vacuum and cleanup index in parallel. It also doesn't include
> autovacuum support for now.
>
> The PARALLEL option works almost the same as before the patch. In the
> VACUUM command, we can specify the 'PARALLEL n' option where n is the
> number of parallel workers to request. If n is omitted, the number of
> parallel workers would be # of indexes - 1.
>
I think for now this is okay, but I guess in the future, when we make heap
scans also parallel or maybe allow more than one worker per index vacuum,
this won't hold good.  So, I am not sure the below text in the docs is the
most appropriate.

+    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
+    <listitem>
+     <para>
+      Execute index vacuum and cleanup index in parallel with
+      <replaceable class="parameter">N</replaceable> background workers. If the parallel
+      degree <replaceable class="parameter">N</replaceable> is omitted,
+      <command>VACUUM</command> requests the number of indexes - 1 processes, which is the
+      maximum number of parallel vacuum workers since individual indexes is processed by
+      one process. The actual number of parallel vacuum workers may be less due to the
+      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
+      This option can not use with <literal>FULL</literal> option.

It might be better to use some generic language in the docs, something like
"If the parallel degree N is omitted, then vacuum decides the number of
workers based on the number of indexes on the relation, which is further
limited by max-parallel-workers-maintenance".  I think you also need to
mention in some way that the storage option parallel_workers is taken into
account.

A few assorted comments:

1.
+lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
{
..
+
+	LaunchParallelWorkers(lvstate->pcxt);
+
+	/*
+	 * if no workers launched, we vacuum all indexes by the leader process
+	 * alone. Since there is hope that we can launch workers in the next
+	 * execution time we don't want to end the parallel mode yet.
+	 */
+	if (lvstate->pcxt->nworkers_launched == 0)
+		return;

It is quite possible that the workers are not launched because we fail to
allocate memory, basically when pcxt->nworkers is zero.  I think in such
cases there is no use in being in parallel mode.  You can even detect that
before calling lazy_begin_parallel_vacuum_index.

2.
 static void
+lazy_vacuum_all_indexes_for_leader(LVState *lvstate, IndexBulkDeleteResult **stats,
+								   LVTidMap *dead_tuples, bool do_parallel,
+								   bool for_cleanup)
{
..
+	if (do_parallel)
+		lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
+
+	for (;;)
+	{
+		IndexBulkDeleteResult *r = NULL;
+
+		/*
+		 * Get the next index number to vacuum and set index statistics. In parallel
+		 * lazy vacuum, index bulk-deletion results are stored in the shared memory
+		 * segment. If it's already updated we use it rather than setting it to NULL.
+		 * In single vacuum, we can always use an element of the 'stats'.
+		 */
+		if (do_parallel)
+		{
+			idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
+
+			if (lvshared->indstats[idx].updated)
+				r = &(lvshared->indstats[idx].stats);
+		}

It is quite possible that we are not able to launch any workers in
lazy_begin_parallel_vacuum_index; in such cases, we should not use the
parallel mode path.  Basically, as written, we can't rely on the
'do_parallel' flag.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
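P.S. To make points 1 and 2 a bit more concrete, here is a minimal sketch of
the kind of check I have in mind.  This is not code from the patch: the
function name and the caller-side fallback are hypothetical; only
ParallelContext, its nworkers / nworkers_launched fields, and
LaunchParallelWorkers() are existing APIs.

/*
 * Sketch only (hypothetical function name): decide whether to take the
 * parallel path from the parallel context itself, rather than from a
 * precomputed do_parallel flag.  Assumes the ParallelContext has already
 * been created and InitializeParallelDSM() has been called.
 */
#include "postgres.h"
#include "access/parallel.h"

static bool
try_begin_parallel_index_vacuum(ParallelContext *pcxt)
{
	/*
	 * Point 1: if we could not allocate any workers (pcxt->nworkers is
	 * zero), there is no use in being in parallel mode; the caller can
	 * detect this and take the serial path without ever calling here.
	 */
	if (pcxt == NULL || pcxt->nworkers == 0)
		return false;

	LaunchParallelWorkers(pcxt);

	/*
	 * Point 2: even if workers were requested, none may have started.
	 * Re-check nworkers_launched instead of trusting a do_parallel flag
	 * that was computed before the launch attempt.
	 */
	if (pcxt->nworkers_launched == 0)
		return false;	/* caller falls back to vacuuming indexes serially */

	return true;		/* leader joins the parallel index vacuum */
}

The caller would then branch on the return value (or on the context fields
directly) instead of carrying the do_parallel flag through the index loop.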