Hi team,
Hope you are all doing well.

Recently I encountered a scenario where a user needed to dump/restore a database with roughly 1k ~ 2k large objects. Parallel mode with 10 processes was used for both the dump and the restore, yet in both cases it seemed that only 1 process was handling the LOs.

Looking into this further, it appears that starting with PostgreSQL 17 the behavior of the getLOs function changed: blob metadata is ordered by owner/ACL, and for each distinct owner/ACL pair the LOs are grouped into one ArchiveEntry per MAX_BLOBS_PER_ARCHIVE_ENTRY entries. This gives good parallelism in dump/restore when a user has millions of LOs. However, when a user has only a limited number of LOs, say 1k (which seems plausible, since LOs are supposed to be large), everything for a pair lands in a single ArchiveEntry and only 1 process does the work.

Therefore, I'm proposing a change. Instead of grouping LOs of the same owner/ACL pair by a fixed count, we can use a SQL query to distribute each pair's LOs across a fixed number of batches, and assign each batch its own ArchiveEntry. That way the workload for each pair can be spread across processes even when there are only a few LOs. For the fixed batch number, 50 seems like a sensible value, since a normal user would not use more than 50 parallel jobs in a dump/restore.

I have tested this on a cloud VM with a database containing 600 LOs (24GB in total). Before the patch, the dump took:

kaifan@kaifanvm:~/test$ time pg_dump -Fd -d test -f dump2 -j 10

real    23m38.274s
user    12m24.477s
sys     0m50.456s

After the patch, it took:

kaifan@kaifanvm:~/test$ time pg_dump -Fd -d test -f dump3 -j 10

real    7m50.295s
user    16m55.940s
sys     1m12.640s

as multiple processes are now used.

May I know if you have any thoughts on this?

Kai Fan
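P.S. To make the batching idea more concrete, here is a rough sketch of the kind of query I have in mind. This is only an illustration using ntile() over pg_largeobject_metadata, not necessarily the exact query used in the attached patch; the batch count of 50 is the value discussed above:

-- Assign each LO of a given owner/ACL pair to one of up to 50 batches;
-- each (owner, ACL, batch) group would then become its own ArchiveEntry.
SELECT oid, lomowner, lomacl,
       ntile(50) OVER (PARTITION BY lomowner, lomacl ORDER BY oid) AS batch
FROM pg_largeobject_metadata
ORDER BY lomowner, lomacl, batch, oid;

With this kind of grouping, even an owner/ACL pair that has only a few hundred LOs is split across up to 50 ArchiveEntries that parallel workers can pick up independently.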
0001-add-parallel-lo.patch