Hi team,

Hope you are all doing well.


Recently I encountered a scenario where a user needs to dump/restore a 
database with 1k ~ 2k large objects. In both the dump and the restore, 
parallel mode was used with 10 processes, but it seems only 1 process was 
actually used for the large objects.


Looking into this further, it seems that starting with PostgreSQL 17 the 
behavior of the getLOs function changed. In getLOs, blob metadata is 
ordered by owner/ACL, and for each distinct owner/ACL pair the blobs are 
grouped into one ArchiveEntry per MAX_BLOBS_PER_ARCHIVE_ENTRY entries. 
This brings parallelism to dump/restore when a user has millions of LOs.
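

To illustrate why this does not help much with a small number of LOs, a 
query along these lines (only an illustration, not the query getLOs 
actually runs, and assuming MAX_BLOBS_PER_ARCHIVE_ENTRY is 1000) shows 
how many archive entries the current scheme produces per owner/ACL pair:

    SELECT lomowner, lomacl,
           count(*) AS lo_count,
           ceil(count(*) / 1000.0) AS archive_entries
    FROM pg_largeobject_metadata
    GROUP BY lomowner, lomacl;

With 1k ~ 2k LOs under a single owner/ACL pair this yields only 1 or 2 
archive entries, so at most 2 workers can be busy regardless of -j.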


However, a user may only have a limited number of LOs, like 1k, which 
seems sensible given that LOs are supposed to be large. In that scenario 
only 1 process does the work. Therefore, I'm proposing a change: instead 
of grouping LOs of the same owner/ACL pair by a fixed count, we can use a 
SQL query to distribute each pair's LOs into a fixed number of batches 
and assign each batch its own ArchiveEntry (a sketch of such a query is 
below). That way the workload for each pair can be spread across 
processes even when there are only a few LOs.
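

As an illustration of the idea (this is just a sketch, not the exact 
query in the patch; column names are from pg_largeobject_metadata and the 
batch count of 50 is the proposed constant), the batching could look like:

    SELECT oid, lomowner, lomacl,
           ntile(50) OVER (PARTITION BY lomowner, lomacl
                           ORDER BY oid) AS batch_no
    FROM pg_largeobject_metadata
    ORDER BY lomowner, lomacl, batch_no;

Each (owner, ACL, batch_no) group would then get its own ArchiveEntry, so 
even 1k LOs under one owner/ACL pair are split across up to 50 entries 
that parallel workers can pick up independently.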


For the fixed batch count, 50 seems like a sensible value, as a typical 
user would not use more than 50 parallel jobs in a dump/restore.


I have tested this on a cloud VM. For a database with 600 LOs (24GB), 
before the patch the dump takes
    kaifan@kaifanvm:~/test$ time pg_dump -Fd -d test -f dump2 -j 10
    real    23m38.274s
    user    12m24.477s
    sys     0m50.456s


After the patch, it takes
    kaifan@kaifanvm:~/test$ time pg_dump -Fd -d test -f dump3 -j 10
    real    7m50.295s
    user    16m55.940s
    sys     1m12.640s


This is because multiple processes are now being used. May I know if you 
have any thoughts on this?


Kai Fan

Attachment: 0001-add-parallel-lo.patch