On Thu, Mar 06, 2025 at 03:20:16PM -0500, Andres Freund wrote:
> There are many systems with hundreds of databases, removing all parallelism
> for those from pg_upgrade would likely hurt way more than what we can gain
> here.
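
For reference, a test schema like the one described below can be generated
with a few psql one-liners along these lines (a sketch only; object names
are illustrative):

-- sketch; run in psql against the target database
SELECT format('CREATE SEQUENCE s%s', g) FROM generate_series(1, 1000) g \gexec
SELECT format('CREATE TABLE t%s (a int UNIQUE, b int UNIQUE)', g)
  FROM generate_series(1, 10000) g \gexec
SELECT format('INSERT INTO t%s SELECT i, i FROM generate_series(1, 1000) i', g)
  FROM generate_series(1, 10000) g \gexec
ANALYZE;
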
I just did a quick test on a freshly analyzed database with 1,000 sequences
and 10,000 tables with 1,000 rows and 2 unique constraints apiece.

~/pgdata$ time pg_dump postgres --no-data --binary-upgrade > /dev/null
0.29s user 0.09s system 21% cpu 1.777 total
~/pgdata$ time pg_dump postgres --no-data --no-statistics --binary-upgrade > /dev/null
0.14s user 0.02s system 25% cpu 0.603 total

So about 1.174 seconds goes to statistics.  Even if we do all sorts of work
to make dumping statistics really fast, dumping 8 databases in succession
would still take upwards of 4.8 seconds (8 x ~0.6 seconds).  Even with the
current code, dumping 8 in parallel would probably take closer to 2 seconds,
and I bet reducing the number of statistics queries could drive it below 1
second.  Granted, I'm waving my hands vigorously with those last two
estimates.

That being said, I do think in-database parallelism would be useful in some
cases.  I frequently hear about problems with huge numbers of large objects
on a cluster with one big database.  But that's probably less likely than
the many-database case.

--
nathan