On Thu, Oct 31, 2019 at 3:45 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Thu, Oct 31, 2019 at 11:33 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Tue, Oct 29, 2019 at 1:59 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > > Actually after increased shared_buffer I got expected results:
> > >
> > > * Test1 (after increased shared_buffers)
> > > normal    : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
> > > 2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
> > > 1 worker  : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)
> > >
> > > I updated the patch that computes the total cost delay shared by
> > > Dilip[1] so that it collects the number of buffer hits and so on, and
> > > have attached it. It can be applied on top of my latest patch set[1].
>
> While reading your modified patch (PoC-delay-stats.patch), I have
> noticed that in my patch I used below formulae to compute the total
> delay
> total delay = delay in heap scan + (total delay of index scan / nworkers).
> But, in your patch, I can see that it is just total sum of
> all delay. IMHO, the total sleep time during the index vacuum phase
> must be divided by the number of workers, because even if at some
> point, all the workers go for sleep (e.g. 10 msec) then the delay in
> I/O will be only for 10msec not 30 msec. I think the same is
> discussed upthread[1]
>
I think the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the vacuum
were performed by a single process, whereas in approach (b) the vacuum
delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200 and the sleep time is 5 ms. In a
single-process vacuum the total sleep time is 2,500 ms
(= (10,000 * 10 / 200) * 5). Approach (a) gives the same 2,500 ms,
because all parallel vacuum workers use the shared balance value and a
worker sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, the limit of each worker is 40
(e.g. with a parallel degree of 5). Supposing each worker processes the
same number of blocks, the total sleep time of all workers is 12,500 ms
(= (2,000 * 10 / 40) * 5 * 5). I think that's why we can compute the
sleep time of approach (b) by dividing the total value by the number of
parallel workers.

IOW, approach (b) makes parallel vacuum delay much more than normal
vacuum and parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect? I thought the vacuum delay for
parallel vacuum should work as if it were a single-process vacuum, as we
did for memory usage. I might be missing something. If we prefer
approach (b), I should change the patch so that the leader process
divides the cost limit evenly.

Regards,

--
Masahiko Sawada
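
For reference, the arithmetic in the example above can be reproduced
with a rough Python sketch. This is only a simplified model, not the
actual vacuum delay code: it assumes a process naps exactly once each
time its accumulated cost reaches the limit, and the function and
variable names are made up for illustration.

```python
def total_sleep_ms(nblocks, cost_per_block, cost_limit, sleep_ms):
    """Total sleep of one process under the simplified model:
    one nap of sleep_ms each time accumulated cost reaches cost_limit."""
    naps = (nblocks * cost_per_block) // cost_limit
    return naps * sleep_ms


# Numbers taken from the example in this mail.
NBLOCKS = 10_000
COST_PER_BLOCK = 10
COST_LIMIT = 200
SLEEP_MS = 5
NWORKERS = 5

# Single-process vacuum.
single = total_sleep_ms(NBLOCKS, COST_PER_BLOCK, COST_LIMIT, SLEEP_MS)

# Approach (a): all workers share one balance and one limit, so the
# number of naps over the whole vacuum matches the single-process case.
approach_a = total_sleep_ms(NBLOCKS, COST_PER_BLOCK, COST_LIMIT, SLEEP_MS)

# Approach (b): each worker gets cost_limit / nworkers and is assumed to
# process nblocks / nworkers blocks; sleeps are summed over all workers.
per_worker_b = total_sleep_ms(NBLOCKS // NWORKERS, COST_PER_BLOCK,
                              COST_LIMIT // NWORKERS, SLEEP_MS)
approach_b = per_worker_b * NWORKERS

print("single process :", single, "ms")
print("approach (a)   :", approach_a, "ms")
print("approach (b)   :", approach_b, "ms in total,",
      approach_b // NWORKERS, "ms when divided by nworkers")
```

Under these assumptions the summed sleep time of approach (b) is
nworkers times that of approach (a), and dividing it by the number of
workers gives back the single-process value.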