On Sat, Dec 14, 2024 at 1:24 PM Tomas Vondra <to...@vondra.me> wrote:
>
> On 12/13/24 00:04, Tomas Vondra wrote:
> > ...
> >
> > The main difference is here:
> >
> > master / no parallel workers:
> >
> >   pages: 0 removed, 221239 remain, 221239 scanned (100.00% of total)
> >
> > 1 parallel worker:
> >
> >   pages: 0 removed, 221239 remain, 10001 scanned (4.52% of total)
> >
> > Clearly, with parallel vacuum we scan only a tiny fraction of the pages,
> > essentially just those with deleted tuples, which is ~1/20 of pages.
> > That's close to the 15x speedup.
> >
> > This effect is clearest without indexes, but it does affect even runs
> > with indexes - having to scan the indexes makes it much less pronounced,
> > though. However, these indexes are pretty massive (about the same size
> > as the table) - multiple times larger than the table. Chances are it'd
> > be clearer on realistic data sets.
> >
> > So the question is - is this correct? And if yes, why doesn't the
> > regular (serial) vacuum do that?
> >
> > There are some more strange things, though. For example, how come the
> > avg read rate is 0.000 MB/s?
> >
> >   avg read rate: 0.000 MB/s, avg write rate: 525.533 MB/s
> >
> > It scanned 10k pages, i.e. ~80MB of data, in 0.15 seconds. Surely that's
> > not 0.000 MB/s? I guess it's calculated from buffer misses, and all the
> > pages are in shared buffers (thanks to the DELETE earlier in that
> > session).
> >
>
> OK, after looking into this a bit more, I think the reason is rather
> simple - SKIP_PAGES_THRESHOLD.
>
> With serial runs we end up scanning all pages, because even with an
> update every 5000 tuples, that's still only ~25 pages apart, well within
> the 32-page window. So we end up skipping no pages, and scan and vacuum
> everything.
>
> But parallel runs have this skipping logic disabled, or rather the logic
> that switches to sequential scans if the gap is less than 32 pages.
>
> IMHO this raises two questions:
>
> 1) Shouldn't parallel runs use SKIP_PAGES_THRESHOLD too, i.e. switch to
> sequential scans if the pages are close enough? Maybe there is a reason
> for this difference? Workers can reduce the difference between random
> and sequential I/O, similarly to prefetching. But that just means the
> workers should use a lower threshold, e.g.
>
>   SKIP_PAGES_THRESHOLD / nworkers
>
> or something like that? I don't see this discussed in this thread.

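Just to make sure we're talking about the same rule, below is a simplified,
self-contained sketch of the serial-path decision. It is not the actual
vacuumlazy.c code, only a model of the SKIP_PAGES_THRESHOLD behavior;
block_is_all_visible() is a stand-in for the visibility map check, with
dead tuples every ~25 blocks as in your test.

/*
 * Simplified model of the serial skipping rule, not the actual
 * vacuumlazy.c code: a run of all-visible blocks is skipped only when
 * it is at least SKIP_PAGES_THRESHOLD blocks long; shorter runs are
 * read anyway to keep the I/O sequential.
 */
#include <stdbool.h>
#include <stdio.h>

#define SKIP_PAGES_THRESHOLD 32

typedef unsigned int BlockNumber;

/* stand-in for the visibility map: one block in every ~25 has dead tuples */
static bool
block_is_all_visible(BlockNumber blkno)
{
    return (blkno % 25) != 0;
}

int
main(void)
{
    BlockNumber nblocks = 200;
    BlockNumber scanned = 0;
    BlockNumber blkno = 0;

    while (blkno < nblocks)
    {
        /* measure the run of consecutive skippable (all-visible) blocks */
        BlockNumber run_end = blkno;

        while (run_end < nblocks && block_is_all_visible(run_end))
            run_end++;

        if (run_end - blkno >= SKIP_PAGES_THRESHOLD)
            blkno = run_end;            /* long run: skip it */
        else
            for (; blkno <= run_end && blkno < nblocks; blkno++)
                scanned++;              /* short run: read the blocks anyway */
    }

    printf("scanned %u of %u blocks\n", scanned, nblocks);
    return 0;
}

With that pattern no run of skippable blocks ever reaches 32 blocks, so the
sketch ends up scanning all 200 blocks - the same effect as the 100%
scanned in your serial run.
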
Each parallel heap scan worker allocates a chunk of blocks, which is 8192
blocks at maximum, so we would need to use the SKIP_PAGES_THRESHOLD
optimization within each chunk (a rough sketch of what I mean is at the
bottom of this message). I agree that we need to evaluate the difference
anyway; I'll do the benchmark test and share the results.

> 2) It seems the current SKIP_PAGES_THRESHOLD is awfully high for good
> storage. If I can get an order of magnitude improvement (or more than
> that) by disabling the threshold, and just doing random I/O, maybe
> it's time to adjust it a bit.

Yeah, you've started a thread for this, so let's discuss it there.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
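
Here is the rough sketch of applying the rule within a worker's chunk. This
is not actual PostgreSQL code - scan_chunk(), block_is_skippable(),
PARALLEL_CHUNK_MAX, and the 4-worker example below are made up for
illustration, and the scaled threshold simply follows the
SKIP_PAGES_THRESHOLD / nworkers idea above.

/*
 * Rough sketch, not actual PostgreSQL code: apply the skipping rule
 * within a single worker's chunk, with the threshold scaled down by
 * the number of workers as suggested above.
 */
#include <stdbool.h>
#include <stdio.h>

#define SKIP_PAGES_THRESHOLD 32
#define PARALLEL_CHUNK_MAX   8192       /* max blocks assigned to one worker */

typedef unsigned int BlockNumber;

/* stand-in for the visibility map: one block in every ~25 has dead tuples */
static bool
block_is_skippable(BlockNumber blkno)
{
    return (blkno % 25) != 0;
}

/*
 * Scan one chunk.  Runs of skippable blocks are skipped only when they
 * are at least "threshold" blocks long, and a run never extends past
 * the chunk boundary, so the decision is made per chunk.
 */
static BlockNumber
scan_chunk(BlockNumber chunk_start, BlockNumber chunk_nblocks, int nworkers)
{
    BlockNumber threshold = SKIP_PAGES_THRESHOLD / (nworkers > 0 ? nworkers : 1);
    BlockNumber end = chunk_start + chunk_nblocks;
    BlockNumber scanned = 0;
    BlockNumber blkno = chunk_start;

    if (threshold == 0)
        threshold = 1;

    while (blkno < end)
    {
        BlockNumber run_end = blkno;

        while (run_end < end && block_is_skippable(run_end))
            run_end++;

        if (run_end - blkno >= threshold)
            blkno = run_end;            /* long run: skip it */
        else
            for (; blkno <= run_end && blkno < end; blkno++)
                scanned++;              /* short run: read the blocks anyway */
    }

    return scanned;
}

int
main(void)
{
    /* one 8192-block chunk, pretending 4 workers share the vacuum */
    printf("scanned %u of %u blocks\n",
           scan_chunk(0, PARALLEL_CHUNK_MAX, 4), (BlockNumber) PARALLEL_CHUNK_MAX);
    return 0;
}

With a threshold of 32 / 4 = 8 and dead tuples every ~25 blocks, every run
of all-visible blocks is long enough to skip, so only the ~330 blocks that
actually need vacuuming are read from the 8192-block chunk.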