Re: optimizing pg_upgrade's once-in-each-database steps

Nathan Bossart Thu, 01 Aug 2024 10:44:57 -0700

On Wed, Jul 31, 2024 at 10:55:33PM +0100, Ilya Gladyshev wrote:
> I like your idea of parallelizing these checks with async libpq API, thanks
> for working on it. The patch doesn't apply cleanly on master anymore, but
> I've rebased locally and taken it for a quick spin with a pg16 instance of
> 1000 empty databases. Didn't see any regressions with -j 1, there's some
> speedup with -j 8 (33 sec vs 8 sec for these checks).


Thanks for taking a look.  I'm hoping to do a round of polishing before
posting a rebased patch set soon.

> One thing that I noticed that could be improved is we could start a new
> connection right away after having run all query callbacks for the current
> connection in process_slot, instead of just returning and establishing the
> new connection only on the next iteration of the loop in async_task_run
> after potentially sleeping on select.

Yeah, we could just recursively call process_slot() right after freeing the
slot.  That'd at least allow us to avoid the spinning behavior as we run
out of databases to process, if nothing else.
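
To make that concrete, here is a toy sketch of the idea.  It is not code
from the patch: Slot, Task, run_task() and the "eager" flag are hypothetical
stand-ins, there is no libpq in it, and each query callback is modeled as a
simple countdown.  It only shows how recursing into process_slot() after
freeing the slot saves trips through the outer loop, each of which could
sleep on select() in the real thing.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 64

/* Hypothetical stand-ins; none of these definitions come from the patch. */
typedef enum { SLOT_FREE = 0, SLOT_RUNNING } SlotState;

typedef struct {
    SlotState state;
    int db;          /* index of the database this slot is working on */
    int steps_left;  /* query callbacks still to run for that database */
} Slot;

typedef struct {
    int next_db;       /* next database to hand out */
    int ndbs;          /* total number of databases */
    int steps_per_db;  /* callbacks per database, assumed >= 1 */
    bool eager;        /* start the next connection right after freeing? */
} Task;

/* Advance one slot by one step.  Returns true if it did any work. */
bool process_slot(Task *task, Slot *slot)
{
    if (slot->state == SLOT_FREE) {
        if (task->next_db >= task->ndbs)
            return false;               /* no databases left to start */
        slot->db = task->next_db++;     /* models starting a connection */
        slot->steps_left = task->steps_per_db;
        slot->state = SLOT_RUNNING;
        return true;
    }

    if (--slot->steps_left > 0)
        return true;                    /* more callbacks to run */

    slot->state = SLOT_FREE;            /* models closing the connection */
    if (task->eager)
        process_slot(task, slot);       /* start the next one immediately */
    return true;
}

/* Drive all slots until nothing is left; returns the number of loop passes. */
int run_task(int ndbs, int steps_per_db, int nslots, bool eager)
{
    Task task = {0, ndbs, steps_per_db, eager};
    Slot slots[MAX_SLOTS] = {0};        /* zero-initialized: all SLOT_FREE */
    int passes = 0;

    assert(nslots > 0 && nslots <= MAX_SLOTS);
    for (;;) {
        bool busy = false;

        for (int i = 0; i < nslots; i++)
            busy |= process_slot(&task, &slots[i]);
        if (!busy)
            break;
        passes++;                       /* real code might sleep here */
    }
    return passes;
}
```

In this model, with 4 databases, one callback each, and a single slot,
run_task(4, 1, 1, false) takes 8 passes and run_task(4, 1, 1, true) takes 5.
The recursion is at most one level deep, since the nested call always sees a
free slot.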

> +1 to Jeff's suggestion that perhaps we could reuse connections, but perhaps
> that's a separate story.

When I skimmed through the various tasks, I didn't see a ton of
opportunities for further consolidation, or at least opportunities that
would help for upgrades from supported versions.  The data type checks are
already consolidated, for example.

-- 
nathan
