On Fri, Oct 4, 2019 at 7:34 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >>
> >> I'd also prefer to use maintenance_work_mem at max during parallel
> >> vacuum regardless of the number of parallel workers. This is the
> >> current implementation. In lazy vacuum, maintenance_work_mem is used
> >> to record the item pointers of dead tuples. This is done by the
> >> leader process, and worker processes just refer to them when
> >> vacuuming dead index tuples. Even if the user sets a small amount of
> >> maintenance_work_mem, parallel vacuum would still be helpful, since
> >> index vacuuming would still take time. So I thought we should cap
> >> the number of parallel workers by the number of indexes rather than
> >> by maintenance_work_mem.
> >>
> >
> > Isn't that true only if we never use maintenance_work_mem during
> > index cleanup? However, I think we do use it during index cleanup;
> > see, for example, ginInsertCleanup. I think before reaching any
> > conclusion about what to do about this, first we need to establish
> > whether this is a problem. If I am correct, then only some of the
> > index cleanups (like the gin index) use maintenance_work_mem, so we
> > need to consider that point while designing a solution for this.
> >
>
> I got your point. Currently, a single-process lazy vacuum could
> consume at most (maintenance_work_mem * 2) memory, because we do index
> cleanup while holding the dead tuple space, as you mentioned. And
> ginInsertCleanup is also called at the beginning of ginbulkdelete. In
> the current parallel lazy vacuum, each parallel vacuum worker could
> consume additional memory, apart from the memory used by the heap
> scan, depending on the implementation of the target index AM. Given
> the current single and parallel vacuum implementations, it would be
> better to control the total amount of memory rather than the number of
> parallel workers. So one approach I came up with is to make all vacuum
> workers use (maintenance_work_mem / # of participants) as their new
> maintenance_work_mem.

Yeah, we can do something like that, but I am not clear whether the
current memory usage for Gin indexes is correct. I have started a new
thread [1]; let's discuss it there.

[1] - https://www.postgresql.org/message-id/CAA4eK1LmcD5aPogzwim5Nn58Ki%2B74a6Edghx4Wd8hAskvHaq5A%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
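
For readers following the proposal quoted above, here is a minimal
illustrative sketch of the arithmetic being discussed: splitting the
configured maintenance_work_mem evenly among the parallel vacuum
participants. It is not taken from any actual patch; the function name
and arguments are hypothetical and do not correspond to PostgreSQL's
real API.

/*
 * Hypothetical helper (illustration only): divide the configured
 * maintenance_work_mem budget, in kB, evenly among the parallel
 * vacuum participants, as proposed in the quoted mail.
 */
static int
compute_participant_work_mem(int maintenance_work_mem_kb, int nparticipants)
{
	/* Count the leader as a participant and guard against division by zero. */
	if (nparticipants < 1)
		nparticipants = 1;

	return maintenance_work_mem_kb / nparticipants;
}

For example, with maintenance_work_mem set to 1GB and a leader plus
three workers, each participant would get a 256MB budget, so the total
stays at the configured limit instead of growing with the number of
workers.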