On Sat, Nov 5, 2022 at 6:23 PM John Naylor <john.nay...@enterprisedb.com> wrote:
>
> On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > For parallel heap pruning, multiple workers will insert key-value
> > pairs to the radix tree concurrently. The simplest solution would be a
> > single lock to protect writes but the performance will not be good.
> > Another solution would be that we can divide the tables into multiple
> > ranges so that keys derived from TIDs are not conflicted with each
> > other and have parallel workers process one or more ranges. That way,
> > parallel vacuum workers can build *sub-trees* and the leader process
> > can merge them. In use cases of lazy vacuum, since the write phase and
> > read phase are separated the readers don't need to worry about
> > concurrent updates.
>
> It's a good idea to use ranges for a different reason -- readahead. See
> commit 56788d2156fc3, which aimed to improve readahead for sequential scans.
> It might work to use that as a model: Each worker prunes a range of 64 pages,
> keeping the dead tids in a local array. At the end of the range: lock the tid
> store, enter the tids into the store, unlock, free the local array, and get
> the next range from the leader. It's possible contention won't be too bad,
> and I suspect using small local arrays as-we-go would be faster and use less
> memory than merging multiple sub-trees at the end.
That seems like a promising idea. I think it could work well even in the
current parallel vacuum (i.e., with a single writer). In other words, the
first version can use a single lwlock for the shared case, and if the
overhead of acquiring that lwlock for every key-value insertion turns out
not to be negligible, we can try this batching idea. A rough sketch of the
per-worker loop I have in mind follows.
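Just to make the idea concrete, here is a minimal sketch, assuming the
leader hands out 64-block ranges as in commit 56788d2156fc3. TidStore,
leader_get_block_range(), prune_one_page(), and tidstore_add_tids() are
hypothetical placeholders, not names from the patch set; only the lwlock
calls and the basic types are existing PostgreSQL APIs:

#include "postgres.h"
#include "access/htup_details.h"        /* MaxHeapTuplesPerPage */
#include "storage/itemptr.h"
#include "storage/lwlock.h"
#include "utils/rel.h"

#define PRUNE_RANGE_SIZE 64             /* blocks per chunk, per 56788d2156fc3 */

/* Hypothetical shared TID store protected by a single lwlock. */
typedef struct TidStore
{
    LWLock      lock;
    /* ... radix tree in shared memory ... */
} TidStore;

/* Hypothetical helpers; stand-ins for whatever the patches provide. */
extern bool leader_get_block_range(BlockNumber *start, BlockNumber *end);
extern int  prune_one_page(Relation rel, BlockNumber blkno,
                           ItemPointerData *dead_tids);
extern void tidstore_add_tids(TidStore *store,
                              const ItemPointerData *tids, int ntids);

static void
parallel_prune_worker(Relation rel, TidStore *store)
{
    BlockNumber start;
    BlockNumber end;

    /* Keep asking the leader for the next range of blocks to prune. */
    while (leader_get_block_range(&start, &end))
    {
        ItemPointerData *dead_tids;
        BlockNumber blkno;
        int         ndead = 0;

        /* Local array big enough for the worst case in this range. */
        dead_tids = palloc(sizeof(ItemPointerData) *
                           PRUNE_RANGE_SIZE * MaxHeapTuplesPerPage);

        /* Prune each page, collecting dead TIDs locally, without locking. */
        for (blkno = start; blkno < end; blkno++)
            ndead += prune_one_page(rel, blkno, &dead_tids[ndead]);

        /* One lock acquisition per range, not per key-value insertion. */
        LWLockAcquire(&store->lock, LW_EXCLUSIVE);
        tidstore_add_tids(store, dead_tids, ndead);
        LWLockRelease(&store->lock);

        pfree(dead_tids);
    }
}

With one acquisition per 64 pages' worth of dead TIDs rather than one per
TID, contention on the single lwlock may well be a non-issue, as you
suspect, and if it does show up we can reduce the locking granularity later
without changing the loop structure.

Apart from that, I'm going to incorporate the comments on the 0004 patch
and try pointer tagging.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com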