Sorry for the late reply.

On Fri, 6 Dec 2019 at 14:20, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Dec 5, 2019 at 7:44 PM Robert Haas <robertmh...@gmail.com> wrote:
> >
> > I think it might be a good idea to change what we expect index AMs to
> > do rather than trying to make anything that they happen to be doing
> > right now work, no matter how crazy. In particular, suppose we say
> > that you CAN'T add data on to the end of IndexBulkDeleteResult any
> > more, and that instead the extra data is passed through a separate
> > parameter. And then you add an estimate method that gives the size of
> > the space provided by that parameter (and if the estimate method isn't
> > defined then the extra parameter is passed as NULL) and document that
> > the data stored there might get flat-copied.
> >
>
> I think this is a good idea and serves the purpose we are trying to
> achieve currently. However, if there are any index AMs that use the
> current way to pass stats with additional information, they would
> need to change even if they don't want to use the parallel vacuum
> functionality (say because their indexes are too small, or for
> whatever other reason). I think this is a reasonable trade-off and
> the changes on their end won't be that big. So, we should do this.
>
> > Now, you've taken the
> > onus off of parallel vacuum to cope with any crazy thing a
> > hypothetical AM might be doing, and instead you've defined the
> > behavior of that hypothetical AM as wrong. If somebody really needs
> > that, it's now their job to modify the index AM machinery further
> > instead of your job to somehow cope.
> >
>
> Makes sense.
>
> > > Here, we have a need to reduce the number of workers. Index vacuum
> > > has two different phases (index vacuum and index cleanup) which use
> > > the same parallel context/DSM, but both could have different
> > > requirements for workers.
> > > The second phase (cleanup) would normally
> > > need fewer workers: if the work is done in the first phase, the
> > > second wouldn't need it. But we have exceptions like gin indexes,
> > > where we need workers for the second phase as well because gin
> > > passes over the index again even if we have cleaned the index in
> > > the first phase. Now, consider the case where we have 3 btree
> > > indexes and 2 gin indexes: we would need 5 workers for the index
> > > vacuum phase and 2 workers for the index cleanup phase. There are
> > > other cases too.
> > >
> > > We also considered having a separate DSM for each phase, but that
> > > appeared to have overhead without much benefit.
> >
> > How about adding an additional argument to ReinitializeParallelDSM()
> > that allows the number of workers to be reduced? That seems like it
> > would be less confusing than what you have now, and would involve
> > modifying code in a lot fewer places.
> >
>
> Yeah, we can do that. We can maintain some information in
> LVParallelState which indicates whether we need to reinitialize the
> DSM before launching workers. Sawada-San, do you see any problem with
> this idea?
I think the number of workers could also increase in the cleanup
phase. For example, if we have 1 brin index and 2 gin indexes, then in
the bulkdelete phase we need only 1 worker, but in cleanup we need 2
workers.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services