On Thu, Oct 24, 2019 at 4:33 PM Dilip Kumar <dilipbal...@gmail.com> wrote: > > On Thu, Oct 24, 2019 at 4:21 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Thu, Oct 24, 2019 at 11:51 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > > > > > On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbal...@gmail.com> > > > wrote: > > > > > > > > On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapil...@gmail.com> > > > > wrote: > > > > > > > > > > I am thinking if we can write the patch for both the approaches (a. > > > > > compute shared costs and try to delay based on that, b. try to divide > > > > > the I/O cost among workers as described in the email above[1]) and do > > > > > some tests to see the behavior of throttling, that might help us in > > > > > deciding what is the best strategy to solve this problem, if any. > > > > > What do you think? > > > > > > > > I agree with this idea. I can come up with a POC patch for approach > > > > (b). Meanwhile, if someone is interested to quickly hack with the > > > > approach (a) then we can do some testing and compare. Sawada-san, > > > > by any chance will you be interested to write POC with approach (a)? > > > > Otherwise, I will try to write it after finishing the first one > > > > (approach b). > > > > > > > I have come up with the POC for approach (a).
> > Can we compute the overall throttling (sleep time) in the operation > > separately for heap and index, then divide the index's sleep_time with > > a number of workers and add it to heap's sleep time? Then, it will be > > a bit easier to compare the data between parallel and non-parallel > > case. I have come up with a patch to compute the total delay during the vacuum. So the idea of computing the total cost delay is Total cost delay = Total dealy of heap scan + Total dealy of index/worker; Patch is attached for the same. I have prepared this patch on the latest patch of the parallel vacuum[1]. I have also rebased the patch for the approach [b] for dividing the vacuum cost limit and done some testing for computing the I/O throttling. Attached patches 0001-POC-compute-total-cost-delay and 0002-POC-divide-vacuum-cost-limit can be applied on top of v31-0005-Add-paralell-P-option-to-vacuumdb-command.patch. I haven't rebased on top of v31-0006, because v31-0006 is implementing the I/O throttling with one approach and 0002-POC-divide-vacuum-cost-limit is doing the same with another approach. But, 0001-POC-compute-total-cost-delay can be applied on top of v31-0006 as well (just 1-2 lines conflict). Testing: I have performed 2 tests, one with the same size indexes and second with the different size indexes and measured total I/O delay with the attached patch. Setup: VacuumCostDelay=10ms VacuumCostLimit=2000 Test1 (Same size index): create table test(a int, b varchar, c varchar); create index idx1 on test(a); create index idx2 on test(b); create index idx3 on test(c); insert into test select i, repeat('a',30)||i, repeat('a',20)||i from generate_series(1,500000) as i; delete from test where a < 200000; Vacuum (Head) Parallel Vacuum Vacuum Cost Divide Patch Total Delay 1784 (ms) 1398(ms) 1938(ms) Test2 (Variable size dead tuple in index) create table test(a int, b varchar, c varchar); create index idx1 on test(a); create index idx2 on test(b) where a > 100000; create index idx3 on test(c) where a > 150000; insert into test select i, repeat('a',30)||i, repeat('a',20)||i from generate_series(1,500000) as i; delete from test where a < 200000; Vacuum (Head) Parallel Vacuum Vacuum Cost Divide Patch Total Delay 1438 (ms) 1029(ms) 1529(ms) Conclusion: 1. The tests prove that the total I/O delay is significantly less with the parallel vacuum. 2. With the vacuum cost divide the problem is solved but the delay bit more compared to the non-parallel version. The reason could be the problem discussed at[2], but it needs further investigation. Next, I will test with the v31-0006 (shared vacuum cost) patch. I will also try to test different types of indexes. [1] https://www.postgresql.org/message-id/CAD21AoBMo9dr_QmhT%3DdKh7fmiq7tpx%2ByLHR8nw9i5NZ-SgtaVg%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAA4eK1%2BPeiFLdTuwrE6CvbNdx80E-O%3DZxCuWB2maREKFD-RaCA%40mail.gmail.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
0001-POC-compute-total-cost-delay.patch
Description: Binary data
0002-POC-divide-vacuum-cost-limit.patch
Description: Binary data