On Fri, Oct 18, 2019 at 3:48 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Fri, Oct 18, 2019 at 11:25 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Fri, Oct 18, 2019 at 8:45 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 4:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 17, 2019 at 3:25 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 17, 2019 at 2:12 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 17, 2019 at 5:30 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > > > > >
> > > > > > > Another point in this regard is that the user anyway has an option to
> > > > > > > turn off the cost-based vacuum. By default, it is disabled anyway.
> > > > > > > So, if the user enables it, we have to provide some sensible behavior.
> > > > > > > If we can't come up with anything, then, in the end, we might want to
> > > > > > > turn it off for a parallel vacuum and mention that in the docs, but I
> > > > > > > think we should try to come up with a solution for it.
> > > > > >
> > > > > > I finally got your point and now understand the need. And the idea I
> > > > > > proposed doesn't work well.
> > > > > >
> > > > > > So you mean that all workers share the cost count, and if a parallel
> > > > > > vacuum worker increases the cost and it reaches the limit, only that
> > > > > > one worker sleeps? Is that okay even though the other parallel workers
> > > > > > are still running, so the sleep might not help?
> > > > >
> > > > Remember that the other running workers will also increase
> > > > VacuumCostBalance, and whichever worker finds that it has become greater
> > > > than VacuumCostLimit will reset its value and sleep. So, won't this
> > > > make sure that the overall throttling works the same?
> > > >
> > > > > I agree with this point. There is a possibility that some of the
> > > > > workers who are doing heavy I/O continue to work while, OTOH, other
> > > > > workers who are doing very little I/O might become the victims and
> > > > > unnecessarily delay their operations.
> > > > >
> > > > Sure, but will it impact the overall I/O? I mean to say that the rate
> > > > limit we want to provide for the overall vacuum operation will still be
> > > > the same. Also, isn't a similar thing happening now as well, where the
> > > > heap might have done a major portion of the I/O, but soon after we start
> > > > vacuuming the index we hit the limit and sleep?
> > >
> > > Actually, what I meant is that the worker performing the actual I/O
> > > might not go for the delay, while another worker which has done only CPU
> > > work might pay the penalty. So basically the worker doing CPU-intensive
> > > operations might go for the delay and pay the penalty, while the worker
> > > performing the actual I/O continues to work and does further I/O. Do you
> > > think this is not a practical problem?
> > >
> > I don't know. Generally, we try to delay (if required) before
> > processing (reading/writing) one page, which means it will happen for
> > I/O-intensive operations, so I am not sure the point you are making is
> > completely correct.
>
> Ok, I agree with the point that we are checking it only when we are
> doing the I/O operation. But we also need to consider that each I/O
> operation has a different weight. So even if we have a delay point at
> the I/O operation, there is a possibility that we delay a worker which
> is just reading a buffer with a page hit (VacuumCostPageHit), while
> another worker which is actually dirtying pages (VacuumCostPageDirty = 20)
> continues its work and does more I/O.
> > >
> > > Stepping back a bit, OTOH, I think that we cannot guarantee that the
> > > one worker who has done more I/O will continue to do further I/O and
> > > that the one which has not done much I/O will not perform more I/O in
> > > the future. So it might not be too bad if we compute shared costs as you
> > > suggested above.
> > >
> > I am thinking that if we can write patches for both approaches (a.
> > compute shared costs and try to delay based on that, b. try to divide
> > the I/O cost among workers as described in the email above[1]) and do
> > some tests to see the behavior of the throttling, that might help us in
> > deciding what is the best strategy to solve this problem, if any.
> > What do you think?
>
> I agree with this idea. I can come up with a POC patch for approach
> (b). Meanwhile, if someone is interested in quickly hacking on
> approach (a), then we can do some testing and compare. Sawada-san,
> would you by any chance be interested in writing a POC for approach (a)?
Yes, I will try to write the PoC patch with approach (a).
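To be a little more concrete, what I have in mind for (a) is roughly the
sketch below. This is only an outline to show the idea, not the actual
patch: LVSharedCostState and parallel_vacuum_cost_delay are placeholder
names I made up here, and in the real patch the shared balance would
presumably live in the shared memory area that the parallel vacuum leader
sets up. The point is that every worker folds its local cost into one
shared counter, and whichever worker pushes that counter past
VacuumCostLimit is the one that sleeps, so the throttling decision is
driven by the group's combined cost rather than by any single worker's
balance.

#include "postgres.h"

#include "miscadmin.h"
#include "port/atomics.h"

/*
 * Hypothetical shared state for approach (a).  In the real patch this
 * would live in the shared memory set up for parallel vacuum; the struct
 * and function names are placeholders for illustration only.
 */
typedef struct LVSharedCostState
{
	pg_atomic_uint32	cost_balance;	/* combined cost of all workers */
} LVSharedCostState;

/*
 * Called from each worker's vacuum delay point instead of checking only
 * the process-local VacuumCostBalance.
 */
static void
parallel_vacuum_cost_delay(LVSharedCostState *cstate)
{
	uint32		balance;

	if (!VacuumCostActive)
		return;

	/* Fold the locally accumulated cost into the shared balance. */
	balance = pg_atomic_add_fetch_u32(&cstate->cost_balance,
									  VacuumCostBalance);
	VacuumCostBalance = 0;

	if (balance >= (uint32) VacuumCostLimit)
	{
		/*
		 * This worker pushed the combined balance over the limit, so it
		 * pays the delay.  Consume one limit's worth of the balance so
		 * the next sleep happens only after the group as a whole has
		 * accrued another VacuumCostLimit of cost.
		 */
		pg_atomic_sub_fetch_u32(&cstate->cost_balance, VacuumCostLimit);
		pg_usleep((long) (VacuumCostDelay * 1000));
	}
}

Whether the sleep time should also be scaled by how much the balance
overshot the limit, the way vacuum_delay_point currently does for a
single backend, is something I'd like to experiment with while testing,
since it affects how the throttling compares with approach (b).

Regards,

--
Masahiko Sawada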