On Wed, Jul 31, 2024 at 10:55:33PM +0100, Ilya Gladyshev wrote:
> I like your idea of parallelizing these checks with async libpq API, thanks
> for working on it. The patch doesn't apply cleanly on master anymore, but
> I've rebased locally and taken it for a quick spin with a pg16 instance of
> 1000 empty databases. Didn't see any regressions with -j 1, there's some
> speedup with -j 8 (33 sec vs 8 sec for these checks).

Thanks for taking a look. I'm hoping to do a round of polishing before
posting a rebased patch set soon.

> One thing that I noticed that could be improved is we could start a new
> connection right away after having run all query callbacks for the current
> connection in process_slot, instead of just returning and establishing the
> new connection only on the next iteration of the loop in async_task_run
> after potentially sleeping on select.

Yeah, we could just recursively call process_slot() right after freeing the
slot. That'd at least allow us to avoid the spinning behavior as we run out
of databases to process, if nothing else.

> +1 to Jeff's suggestion that perhaps we could reuse connections, but perhaps
> that's a separate story.

When I skimmed through the various tasks, I didn't see a ton of
opportunities for further consolidation, or at least opportunities that
would help for upgrades from supported versions. The data type checks are
already consolidated, for example.

-- 
nathan