On Fri, Dec 3, 2021 at 6:06 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 6:03 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Thu, Dec 2, 2021 at 6:01 PM Masahiko Sawada <sawada.m...@gmail.com>
> > wrote:
> > >
> > > I've attached updated patches.
> > >
> >
> > I have a few comments on v4-0001.
>
> Thank you for the comments!
>
> > 1.
> > In parallel_vacuum_process_all_indexes(), we can combine the two
> > checks for vacuum/cleanup at the beginning of the function
>
> Agreed.
>
> > and I think
> > it is better to keep the variable name as bulkdel or cleanup instead
> > of vacuum as that is more specific and clear.
>
> I was thinking to use the terms "bulkdel" and "cleanup" instead of
> "vacuum" and "cleanup" for the same reason. That way, we can probably
> use "bulkdel" and "cleanup" when doing index bulk-deletion (i.e.,
> calling ambulkdelete) and index cleanup (i.e., calling
> amvacuumcleanup), respectively, and use "vacuum" when processing an
> index, i.e., doing either bulk-deletion or cleanup, instead of using
> just "processing". But we already use "vacuum" and "cleanup" in many
> places, e.g., lazy_vacuum_index() and lazy_cleanup_index(). If we want
> to use "bulkdel" instead of "vacuum", I think it would be better to
> change the terminology in vacuumlazy.c thoroughly, probably in a
> separate patch.
>
Okay.

> > 2. The patch seems to be calling parallel_vacuum_should_skip_index
> > thrice even before starting parallel vacuum. It has a call to find the
> > number of blocks, which has to be performed for each index. I
> > understand it might not be too costly to call this, but it seems better
> > to remember this info like we are doing in the current code.
>
> Yes, we can bring the will_vacuum_parallel array back to the code. That
> way, we can remove the call to parallel_vacuum_should_skip_index() in
> parallel_vacuum_begin().
>
> > We can
> > probably set parallel_workers_can_process in parallel_vacuum_begin and
> > then again update it in parallel_vacuum_process_all_indexes. Won't doing
> > something like that be better?
>
> parallel_workers_can_process can vary depending on whether it is
> bulk-deletion, the first cleanup, or a second (or later) cleanup. What
> can we set parallel_workers_can_process based on in
> parallel_vacuum_begin()?
>

I was thinking to set the results of will_vacuum_parallel in
parallel_vacuum_begin().

> >
> > 3.
> > /*
> >  * Copy the index bulk-deletion result returned from ambulkdelete and
> > @@ -2940,19 +2935,20 @@ parallel_process_one_index(Relation indrel,
> >  * Since all vacuum workers write the bulk-deletion result at different
> >  * slots we can write them without locking.
> >  */
> > - if (shared_istat && !shared_istat->updated && istat_res != NULL)
> > + if (!pindstats->istat_updated && istat_res != NULL)
> > {
> > - memcpy(&shared_istat->istat, istat_res, sizeof(IndexBulkDeleteResult));
> > - shared_istat->updated = true;
> > + memcpy(&(pindstats->istat), istat_res, sizeof(IndexBulkDeleteResult));
> > + pindstats->istat_updated = true;
> >
> > /* Free the locally-allocated bulk-deletion result */
> > pfree(istat_res);
> > -
> > - /* return the pointer to the result from shared memory */
> > - return &shared_istat->istat;
> > }
> >
> > I think here now we copy the results both for local and parallel
> > cleanup. Isn't it better to write something about that in comments as
> > it is not clear from the current comments?
>
> What do you mean by "local cleanup"?
>

Clean-up performed by the leader backend.

> > 4.
> > + /* Estimate size for shared information -- PARALLEL_VACUUM_KEY_SHARED */
> > + est_shared_len = MAXALIGN(sizeof(LVShared));
> > shm_toc_estimate_chunk(&pcxt->estimator, est_shared_len);
> >
> > Do we need MAXALIGN here? I think previously we required it here
> > because immediately after this we were writing index stats, but now
> > those are allocated separately. Normally, shm_toc_estimate_chunk()
> > aligns the size, but sometimes we need to do it so as to avoid
> > unaligned access of shared memory. I am really not sure whether we
> > require it for dead_items; do you remember why it is there in the
> > first place?
>
> Indeed. I don't remember that. Probably it's an oversight.
>

Yeah, I also think so.

--
With Regards,
Amit Kapila.