On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Fri, Oct 18, 2019 at 12:18 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Another point in this regard is that the user anyway has an option to
> > > > > > > > turn off the cost-based vacuum. By default, it is anyway disabled.
> > > > > > > > So, if the user enables it we have to provide some sensible behavior.
> > > > > > > > If we can't come up with anything, then, in the end, we might want to
> > > > > > > > turn it off for a parallel vacuum and mention the same in the docs,
> > > > > > > > but I think we should try to come up with a solution for it.
> > > > > > >
> > > > > > > I finally got your point and now understand the need. And the idea I
> > > > > > > proposed doesn't work well for it.
> > > > > > >
> > > > > > > So you meant that all workers share the cost count, and if a parallel
> > > > > > > vacuum worker increases the cost and it reaches the limit, only that
> > > > > > > one worker sleeps? Is that okay even though other parallel workers
> > > > > > > are still running, in which case the sleep might not help?
> > > > > >
> > > > > Remember that the other running workers will also increase
> > > > > VacuumCostBalance, and whichever worker finds that it has become
> > > > > greater than VacuumCostLimit will reset its value and sleep. So,
> > > > > won't this make sure that the overall throttling works the same?
> > > > >
> > > > > > I agree with this point. There is a possibility that some of the
> > > > > > workers who are doing heavy I/O continue to work while, OTOH, other
> > > > > > workers who are doing very little I/O become the victims and have
> > > > > > their operations unnecessarily delayed.
> > > > >
> > > > > Sure, but will it impact the overall I/O? I mean to say that the rate
> > > > > limit we want to provide for the overall vacuum operation will still
> > > > > be the same. Also, doesn't a similar thing happen now as well, where
> > > > > the heap might have done a major portion of the I/O but, soon after
> > > > > we start vacuuming the index, we hit the limit and sleep?
> > > >
> > > > Actually, what I meant is that the worker performing actual I/O might
> > > > not go for the delay, while another worker which has done only CPU
> > > > operations might pay the penalty. So basically, the worker doing
> > > > CPU-intensive operations might go for the delay and pay the penalty,
> > > > while the worker performing actual I/O continues to work and does
> > > > further I/O. Do you think this is not a practical problem?
> > >
> > > I don't know.
> > > Generally, we try to delay (if required) before processing
> > > (read/write) one page, which means it will happen for I/O-intensive
> > > operations, so I am not sure the point you are making is completely
> > > correct.
> >
> > Ok, I agree with the point that we are checking it only when we are
> > doing the I/O operation. But we also need to consider that each I/O
> > operation has a different weightage. So even if we have a delay point
> > at the I/O operation, there is a possibility that we delay a worker
> > which is just performing a buffer read with a page hit
> > (VacuumCostPageHit), while the other worker, which is actually dirtying
> > pages (VacuumCostPageDirty = 20), continues to work and does more I/O.
> >
> > > > Stepping back a bit, OTOH, I think that we cannot guarantee that the
> > > > one worker who has done more I/O will continue to do further I/O, and
> > > > that the one which has not done much I/O will not perform more I/O in
> > > > the future. So it might not be too bad if we compute shared costs as
> > > > you suggested above.
> > >
> > > I am thinking that if we can write patches for both approaches
> > > (a. compute shared costs and delay based on that; b. divide the I/O
> > > cost among the workers as described in the email above [1]) and do
> > > some tests to see the throttling behavior, that might help us decide
> > > which is the best strategy to solve this problem, if any. What do
> > > you think?
> >
> > I agree with this idea. I can come up with a POC patch for approach
> > (b). Meanwhile, if someone is interested in quickly hacking on
> > approach (a), then we can do some testing and compare. Sawada-san, by
> > any chance would you be interested in writing a POC for approach (a)?
> > Otherwise, I will try to write it after finishing the first one
> > (approach (b)).
>
> I have come up with the POC for approach (b).
>
> The idea is:
> 1) Before launching the workers, divide the current VacuumCostBalance
> among them so that each worker starts accumulating its balance from
> that point.
> 2) Also, divide the VacuumCostLimit among the workers.
> 3) Once the workers are done with the index vacuum, they send back
> their remaining balance to the leader.
> 4) The leader sums all the returned balances, adds that to its current
> VacuumCostBalance, and starts accumulating its balance from this point.
>
> I was trying to test the behaviour of the vacuum I/O limit, but I
> could not find an easy way to do that, so I just put a tracepoint in
> the code and checked at what point we are giving the delay. I also
> printed the cost balance at various points to see after how much I/O
> accumulation we are hitting the delay. Please feel free to suggest a
> better way to test this.
>
> I have printed these logs for the parallel vacuum patch (v30) vs. v30
> plus the patch for dividing the I/O limit (attached with the mail).
>
> Note: The patch and the test results are attached.

Thank you!
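Just to check that I'm reading the divided-limit idea right, I think
steps 1) to 4) come down to something like the sketch below. This is
only an illustration, not code from your patch: the LVSharedCost struct
and the helper names are made up here, while VacuumCostBalance,
VacuumCostLimit, and the spinlock macros are the usual PostgreSQL ones.

/* Sketch only: LVSharedCost and the helper names are invented. */
#include "postgres.h"

#include "miscadmin.h"          /* VacuumCostBalance, VacuumCostLimit */
#include "storage/spin.h"

/* Hypothetical piece of the parallel-vacuum DSM area. */
typedef struct LVSharedCost
{
    slock_t     mutex;              /* protects returned_balance */
    int         cost_limit_worker;  /* per-participant slice of the limit */
    int         cost_balance_worker;    /* per-participant starting balance */
    int         returned_balance;   /* balances handed back on worker exit */
} LVSharedCost;

/* Leader, before launching nworkers parallel workers (steps 1 and 2). */
static void
divide_vacuum_cost(LVSharedCost *shared, int nworkers)
{
    int         nparticipants = nworkers + 1;   /* workers plus the leader */

    SpinLockInit(&shared->mutex);
    shared->cost_limit_worker = VacuumCostLimit / nparticipants;
    shared->cost_balance_worker = VacuumCostBalance / nparticipants;
    shared->returned_balance = 0;

    /* The leader itself continues with only its own slice. */
    VacuumCostLimit = shared->cost_limit_worker;
    VacuumCostBalance = shared->cost_balance_worker;
}

/* Worker, on startup: adopt its slice of the limit and the balance. */
static void
adopt_vacuum_cost_slice(LVSharedCost *shared)
{
    VacuumCostLimit = shared->cost_limit_worker;
    VacuumCostBalance = shared->cost_balance_worker;
}

/* Worker, after finishing its index vacuum: hand back what is left (step 3). */
static void
return_vacuum_cost_balance(LVSharedCost *shared)
{
    SpinLockAcquire(&shared->mutex);
    shared->returned_balance += VacuumCostBalance;
    SpinLockRelease(&shared->mutex);
}

/* Leader, after all workers have exited (step 4). */
static void
collect_vacuum_cost_balance(LVSharedCost *shared)
{
    VacuumCostBalance += shared->returned_balance;
}

(The spinlock is needed only for the hand-back in step 3, since workers
can finish their indexes at different times.)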
For approach (a) the basic idea I've come up with is that we have a
shared balance value in DSM, and each worker, including the leader
process, adds its local balance value to it in vacuum_delay_point;
workers then sleep based on that shared value. Roughly, the delay point
would become something like the sketch below (again just an
illustration with invented names, not the actual patch):
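/* Sketch only: the struct and function names here are invented. */
#include "postgres.h"

#include "miscadmin.h"          /* VacuumCost* globals, InterruptPending */
#include "port/atomics.h"

typedef struct LVSharedCostBalance
{
    /* initialized with pg_atomic_init_u32() when the DSM segment is built */
    pg_atomic_uint32 cost_balance;  /* balance shared by all participants */
} LVSharedCostBalance;

/*
 * Called where vacuum_delay_point() is called today, by the leader and by
 * every parallel worker.  Each participant publishes the cost it has
 * accumulated locally since the last call, then decides whether to sleep
 * based on the shared total rather than on its local balance.
 */
static void
parallel_vacuum_delay_point(LVSharedCostBalance *shared)
{
    uint32      total;

    if (!VacuumCostActive || InterruptPending)
        return;

    /* Add our local balance to the shared one, and reset the local copy. */
    total = pg_atomic_add_fetch_u32(&shared->cost_balance,
                                    (int32) VacuumCostBalance);
    VacuumCostBalance = 0;

    if (total >= (uint32) VacuumCostLimit)
    {
        /*
         * We are the participant that crossed the limit, so we pay the
         * delay.  Subtract only what we saw, so costs added concurrently
         * by other workers are preserved.
         */
        pg_atomic_sub_fetch_u32(&shared->cost_balance, (int32) total);
        pg_usleep((long) (VacuumCostDelay * 1000));     /* msec -> usec */
    }
}

I'll submit that patch with other updates.

Regards,

--
Masahiko Sawada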