On Tue, Nov 5, 2019 at 8:49 PM Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On November 5, 2019 7:16:41 AM PST, Dilip Kumar <dilipbal...@gmail.com> wrote:
> > On Tue, Nov 5, 2019 at 2:40 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Mon, Nov 4, 2019 at 11:58 PM Andres Freund <and...@anarazel.de> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On 2019-11-04 12:59:02 -0500, Jeff Janes wrote:
> > > > > On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > >
> > > > > > For parallel vacuum [1], we were discussing what is the best way to
> > > > > > divide the cost among parallel workers, but we didn't get many
> > > > > > inputs apart from people who are very actively involved in the patch
> > > > > > development. I feel that we need some more inputs before we finalize
> > > > > > anything, so I am starting a new thread.
> > > > >
> > > > > Maybe I just don't have experience with the type of system that
> > > > > parallel vacuum is needed for, but if there is any meaningful IO
> > > > > throttling active, then what is the point of doing the vacuum in
> > > > > parallel in the first place?
> > > >
> > > > I am wondering the same - but to be fair, it's pretty easy to run into
> > > > cases where VACUUM is CPU bound, e.g. because most pages are in
> > > > shared_buffers and, compared to the size of the indexes, the number of
> > > > tids that need to be pruned is fairly small (also [1]). That means a
> > > > lot of pages need to be scanned without a whole lot of IO going on. The
> > > > problem with that is just that the defaults for vacuum throttling will
> > > > also apply here; I've never seen anybody tune vacuum_cost_page_hit = 0,
> > > > vacuum_cost_page_dirty = 0 or such (in contrast, the latter is
> > > > currently the highest cost). Nor do we reduce the cost of
> > > > vacuum_cost_page_dirty for unlogged tables.
> > > >
> > > > So while it doesn't seem unreasonable to want to use cost limiting to
> > > > protect against vacuum unexpectedly causing too much IO, especially
> > > > read IO, I'm doubtful it has current practical relevance.
> > >
> > > IIUC, you mean to say that it is not of much practical use to do a
> > > parallel vacuum if I/O throttling is enabled for an operation, is that
> > > right?
> > >
> > > > I'm wondering how much of the benefit of parallel vacuum really is
> > > > just working around the vacuum ring buffers often massively hurting
> > > > performance (see e.g. [2]).
> > >
> > > Yeah, it is a good thing to check, but if anything, I think a parallel
> > > vacuum will further improve performance with larger ring buffers, as
> > > that will make it more CPU bound.
> >
> > I have tested the same, and the results show that increasing the ring
> > buffer size does yield a performance gain. And the gain is much larger
> > with parallel vacuum.
> >
> > Test case:
> > create table test(a int, b int, c int, d int, e int, f int, g int, h int);
> > create index idx1 on test(a);
> > create index idx2 on test(b);
> > create index idx3 on test(c);
> > create index idx4 on test(d);
> > create index idx5 on test(e);
> > create index idx6 on test(f);
> > create index idx7 on test(g);
> > create index idx8 on test(h);
> > insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
> > delete from test where a < 300000;
> >
> > (I have tested both the parallel and the non-parallel vacuum with
> > different ring buffer sizes.)
>
> Thanks!
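As a side note, for anyone who wants to reproduce these numbers with the throttling discussed above taken out of the picture: cost-based throttling is off by default for a manual VACUUM (vacuum_cost_delay defaults to 0), and the per-page charges Andres mentions can also be zeroed explicitly. A minimal sketch using the stock GUCs, assuming default settings everywhere else:

set vacuum_cost_delay = 0;       -- 0 disables cost-based throttling entirely
-- or keep a non-zero delay but zero out the per-page charges:
set vacuum_cost_page_hit = 0;
set vacuum_cost_page_dirty = 0;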
> > 8 indexes:
> > ring buffer size 246kB -> non-parallel: 7.6 seconds, parallel (2 workers): 3.9 seconds
> > ring buffer size 256MB -> non-parallel: 6.1 seconds, parallel (2 workers): 3.2 seconds
> >
> > 4 indexes:
> > ring buffer size 246kB -> non-parallel: 4.8 seconds, parallel (2 workers): 3.2 seconds
> > ring buffer size 256MB -> non-parallel: 3.8 seconds, parallel (2 workers): 2.6 seconds
>
> What about the case of just disabling the ring buffer logic?

I repeated the same test with the ring buffer logic disabled. The results are almost the same as with the increased ring buffer size.
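(For reference, the parallel timings in these runs use the option syntax proposed by the patch; the exact spelling may vary between patch versions, so treat this as illustrative rather than the final syntax:

vacuum (parallel 2) test;

The non-parallel numbers are from a plain "vacuum test;".)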
Tested with 4GB shared buffers:

8 indexes: use shared buffers -> non-parallel: 6.2 seconds, parallel (2 workers): 3.3 seconds
4 indexes: use shared buffers -> non-parallel: 3.8 seconds, parallel (2 workers): 2.7 seconds

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com