On Fri, Feb 14, 2025 at 2:21 PM Melanie Plageman
<melanieplage...@gmail.com> wrote:
>
> On Wed, Feb 12, 2025 at 5:37 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > Since we introduced the eager vacuum scan (052026c9b9), I need to
> > update the parallel heap vacuum patch. After thinking of how to
> > integrate these two features, I find some complexities. The region
> > size used by eager vacuum scan and the chunk size used by parallel
> > table scan are different. While the region is fixed size the chunk
> > becomes smaller as we scan the table. A chunk of the table that a
> > parallel vacuum worker took could be across different regions or be
> > within one region, and different parallel heap vacuum workers might
> > scan the same region. And parallel heap vacuum workers could be
> > scanning different regions of the table simultaneously.
Thank you for your feedback.

> Ah, I see. What are the chunk size ranges? I picked a 32 MB region
> size after a little testing and mostly because it seemed reasonable. I
> think it would be fine to use different region size. Parallel workers
> could just consider the chunk they get an eager scan region (unless it
> is too small or too large -- then it might not make sense).

The maximum chunk size is 8192 blocks (64MB). As we scan the table, the
chunk size ramps down and eventually becomes 1.

> > During eager vacuum scan, we reset the eager_scan_remaining_fails
> > counter when we start to scan the new region. So if we want to make
> > parallel heap vacuum behaves exactly the same way as the
> > single-progress vacuum in terms of the eager vacuum scan, we would
> > need to have the eager_scan_remaining_fails counters for each region
> > so that the workers can decrement it corresponding to the region of
> > the block that the worker is scanning. But I'm concerned that it makes
> > the logic very complex. I'd like to avoid making newly introduced
> > codes more complex by adding yet another new code on top of that.
>
> I don't think it would have to behave exactly the same. I think we
> just don't want to add a lot of complexity or make it hard to reason
> about.
>
> Since the failure rate is defined as a percent, couldn't we just have
> parallel workers set eager_scan_remaining_fails when they get their
> chunk assignment (as a percentage of their chunk size)? (I haven't
> looked at the code, so maybe this doesn't make sense).

IIUC, since the chunk size eventually becomes 1, we cannot simply have
each parallel worker set eager_scan_remaining_fails as a percentage of
its assigned chunk; for such small chunks the budget would just round
down to zero.

> For the success cap, we could have whoever hits it first disable eager
> scanning for all future assigned chunks.

Agreed. (A rough sketch of this and the per-chunk counter idea is
included below.)

> > Another idea is to disable the eager vacuum scan when parallel heap
> > vacuum is enabled. It might look like just avoiding difficult things
> > but it could make sense in a sense. The eager vacuum scan is aimed to
> > amortize the aggressive vacuum by incrementally freezing pages that
> > are potentially frozen by the next aggressive vacuum. On the other
> > hand, parallel heap vacuum is available only in manual VACUUM and
> > would be used to remove garbage on a large table as soon as possible
> > or to freeze the entire table to avoid reaching the XID limit. So I
> > think it might make sense to disable the eager vacuum scan when
> > parallel vacuum.
>
> Do we only do parallelism in manual vacuum because we don't want to
> use up too many parallel workers for a maintenance subsystem? I never
> really tried to find out why parallel index vacuuming is only in
> manual vacuum. I assume you made the same choice they did for the same
> reasons.
>
> If the idea is to never allow parallelism in vacuum, then I think
> disabling eager scanning during manual parallel vacuum seems
> reasonable. People could use vacuum freeze if they want more freezing.

IIUC the purpose of parallel vacuum is incompatible with the purpose of
autovacuum. The former aims to execute the vacuum as fast as possible
by using more resources, whereas the latter aims to execute the vacuum
without affecting foreground transaction processing. It's probably
worth considering enabling parallel vacuum even for autovacuum in a
wraparound situation, but the purpose would remain the same.
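Coming back to the per-chunk failure budget and the success cap, here
is the rough standalone sketch mentioned above. To be clear, this is
not code from the patch: all of the type, field, function, and macro
names below are made up for illustration, and plain C11 atomics stand
in for the actual shared-memory machinery. It also shows where the
percentage-based budget degenerates once the chunk size has ramped
down to 1.

#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state shared by all parallel heap vacuum workers. */
typedef struct SharedEagerScanState
{
    atomic_bool eager_scan_disabled;    /* set once the success cap is hit */
    atomic_long remaining_successes;    /* table-wide success cap */
} SharedEagerScanState;

/* Hypothetical per-worker state, reset at every chunk assignment. */
typedef struct WorkerEagerScanState
{
    long        remaining_fails;        /* failure budget for this chunk */
} WorkerEagerScanState;

/* made-up failure rate: 10% of the blocks in the assigned chunk */
#define EAGER_SCAN_MAX_FAILS_PCT 0.10

/*
 * Called when a worker is assigned a new chunk of 'chunk_blocks' blocks.
 * The failure budget is re-derived as a percentage of the chunk size.
 * Once the parallel scan has ramped the chunk size down to 1, the budget
 * truncates to 0 and the worker never scans eagerly, which is the problem
 * with applying the percentage directly to the assigned chunk.
 */
static void
worker_begin_chunk(SharedEagerScanState *shared,
                   WorkerEagerScanState *worker,
                   long chunk_blocks)
{
    if (atomic_load(&shared->eager_scan_disabled))
    {
        worker->remaining_fails = 0;    /* eager scanning is already off */
        return;
    }

    worker->remaining_fails = (long) (chunk_blocks * EAGER_SCAN_MAX_FAILS_PCT);
}

/*
 * Called when a worker successfully freezes an eagerly scanned page.
 * Whichever worker consumes the last success disables eager scanning for
 * all future chunk assignments of every worker.
 */
static void
worker_record_eager_success(SharedEagerScanState *shared)
{
    if (atomic_fetch_sub(&shared->remaining_successes, 1) <= 1)
        atomic_store(&shared->eager_scan_disabled, true);
}

Again, this is only meant to show the shape of the idea; the real patch
would have to integrate with the eager-scan bookkeeping introduced by
052026c9b9.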
> Also, if you start with only doing parallelism for the third phase of
> heap vacuuming (second pass over the heap), this wouldn't be a problem
> because eager scanning only impacts the first phase.

Right. I'm inclined to support only the second heap pass as the first
step. Parallelizing only the second pass cannot help speed up freezing
the entire table in emergency situations, but it would be beneficial
for cases where a big table has a large amount of garbage spread across
it. In any case, I'm going to reorganize the patch set to support
parallelism for the second heap pass first, and then for the first heap
pass.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com