Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmh...@gmail.com> wrote:
> >
> > I think it might be a good idea to change what we expect index AMs to
> > do rather than trying to make anything that they happen to be doing
> > right now work, no matter how crazy. In particular, suppose we say
> > that you CAN'T add data on to the end of IndexBulkDeleteResult any
> > more, and that instead the extra data is passed through a separate
> > parameter. And then you add an estimate method that gives the size of
> > the space provided by that parameter (and if the estimate method isn't
> > defined then the extra parameter is passed as NULL) and document that
> > the data stored there might get flat-copied.
> >
>
> I think this is a good idea and serves the purpose we are trying to
> achieve currently. However, if there are any index AMs that use the
> current way to pass stats with additional information, they would
> need to change even if they don't want to use the parallel vacuum
> functionality (say because their indexes are too small, or for
> whatever other reason). I think this is a reasonable trade-off and
> the changes on their end won't be that big. So, we should do this.
>
> > Now, you've taken the
> > onus off of parallel vacuum to cope with any crazy thing a
> > hypothetical AM might be doing, and instead you've defined the
> > behavior of that hypothetical AM as wrong. If somebody really needs
> > that, it's now their job to modify the index AM machinery further
> > instead of your job to somehow cope.
> >
>
> Makes sense.
>
> > > Here, we have a need to reduce the number of workers. Index vacuum
> > > has two different phases (index vacuum and index cleanup) which use
> > > the same parallel context/DSM, but both could have different
> > > requirements for workers.
> > > The second phase (cleanup) would normally
> > > need fewer workers: if the work is done in the first phase, the
> > > second wouldn't need it. But we have exceptions like gin indexes,
> > > where we need workers for the second phase as well because gin
> > > passes over the index again even if we have cleaned the index in
> > > the first phase. Now, consider the case where we have 3 btree
> > > indexes and 2 gin indexes: we would need 5 workers for the index
> > > vacuum phase and 2 workers for the index cleanup phase. There are
> > > other cases too.
> > >
> > > We also considered having a separate DSM for each phase, but that
> > > appeared to have overhead without much benefit.
> >
> > How about adding an additional argument to ReinitializeParallelDSM()
> > that allows the number of workers to be reduced? That seems like it
> > would be less confusing than what you have now, and would involve
> > modifying code in a lot fewer places.
> >
>
> Yeah, we can do that. We can maintain some information in
> LVParallelState which indicates whether we need to reinitialize the
> DSM before launching workers. Sawada-San, do you see any problem with
> this idea?
I think the number of workers could also increase in the cleanup
phase. For example, if we have 1 brin index and 2 gin indexes, then in
the bulkdelete phase we need only 1 worker, but in cleanup we need 2
workers.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services