On Mon, Nov 26, 2018 at 2:08 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.m...@gmail.com>
> > > wrote:
> > > >
> > > > Thank you for the comment.
> > > >
> > > I could see that you have put a lot of effort on this patch and still
> > > we are not able to make much progress mainly I guess because of
> > > relation extension lock problem. I think we can park that problem for
> > > some time (as already we have invested quite some time on it), discuss
> > > a bit about actual parallel vacuum patch and then come back to it.
> > >
> >
> > Today, I was reading this and previous related thread [1] and it seems
> > to me multiple people Andres [2], Simon [3] have pointed out that
> > parallelization for index portion is more valuable. Also, some of the
> > results [4] indicate the same. Now, when there are no indexes,
> > parallelizing heap scans also have benefit, but I think in practice we
> > will see more cases where the user wants to vacuum tables with
> > indexes. So how about if we break this problem in the following way
> > where each piece give the benefit of its own:
> > (a) Parallelize index scans wherein the workers will be launched only
> > to vacuum indexes. Only one worker per index will be spawned.
> > (b) Parallelize per-index vacuum. Each index can be vacuumed by
> > multiple workers.
> > (c) Parallelize heap scans where multiple workers will scan the heap,
> > collect dead TIDs and then launch multiple workers for indexes.
> >
> > I think if we break this problem into multiple patches, it will reduce
> > the scope of each patch and help us in making progress. Now, it's
> > been more than 2 years that we are trying to solve this problem, but
> > still didn't make much progress.
> > I understand there are various
> > genuine reasons and all of that work will help us in solving all the
> > problems in this area. How about if we first target problem (a) and
> > once we are done with that we can see which of (b) or (c) we want to
> > do first?
>
> Thank you for suggestion. It seems good to me. We would get a nice
> performance scalability even by only (a), and vacuum will get more
> powerful by (b) or (c). Also, (a) would not require to resolve the
> relation extension lock issue IIUC.
>
Yes, I also think so. We do acquire 'relation extension lock' during
index vacuum, but as part of (a), we are talking about one worker per
index, so there shouldn't be a problem with respect to deadlocks.

> I'll change the patch and submit
> to the next CF.
>

Okay.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
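
As an illustration of option (a) from the quoted proposal, here is a minimal Python sketch; it is not PostgreSQL code and does not reflect the actual patch. It only shows the shape of the idea: a single pass produces the dead TIDs, and each index is then cleaned by exactly one worker, so no two workers ever operate on the same index. The function names `vacuum_one_index` and `parallel_index_vacuum`, the use of a thread pool, and the representation of an index as a list of (block, offset) TIDs are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def vacuum_one_index(index_entries, dead_tids):
    """Hypothetical stand-in for a per-index vacuum.

    Returns the index entries that do not point at a dead TID.
    """
    return [tid for tid in index_entries if tid not in dead_tids]


def parallel_index_vacuum(indexes, dead_tids):
    """Sketch of option (a): one worker per index.

    `indexes` maps an index name to the list of heap TIDs it references.
    """
    if not indexes:
        return {}
    dead = frozenset(dead_tids)
    # One worker per index: each index is submitted exactly once, so no
    # two workers ever touch the same index.
    with ThreadPoolExecutor(max_workers=len(indexes)) as pool:
        futures = {
            name: pool.submit(vacuum_one_index, entries, dead)
            for name, entries in indexes.items()
        }
        return {name: fut.result() for name, fut in futures.items()}


# Toy data: two indexes on the same table, TIDs as (block, offset) pairs.
indexes = {
    "idx_a": [(0, 1), (0, 2), (1, 1)],
    "idx_b": [(0, 2), (1, 1), (1, 3)],
}
dead_tids = [(0, 2), (1, 3)]
cleaned = parallel_index_vacuum(indexes, dead_tids)
```

In this toy model the workers share only the read-only set of dead TIDs; each index is modified by a single worker, which is the property the reply above relies on when it says one worker per index should not cause deadlock problems.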