[HACKERS] parallel pg_restore

Andrew Dunstan Sun, 21 Sep 2008 12:29:41 -0700

I am working on getting parallel pg_restore working. I'm currently getting all the scaffolding working, and hope to have a naive prototype posted within about a week.

The major question is how to choose the restoration order so as to maximize efficiency both on the server and in reading the archive. My thoughts are currently running something like this:


   * when an item is completed, reduce the dependency count for each
     item that depends on it by 1.
   * when an item has a dependency count of 0 it is available for
     execution, and gets moved to the head of the queue.
   * when a new worker spot becomes available, if there is not currently a
     data load running then pick the first available data load,
     otherwise pick the first available item.
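
As a rough illustration only, the rules above could be sketched in Python along these lines. This is not the pg_restore code (which is C); the names Item, Scheduler, complete and next_item are hypothetical, and falling back to the first available item when no data load is ready is an assumption that the list above does not spell out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Item:
    """One archive entry (hypothetical model, not pg_restore's TOC entry)."""
    name: str
    is_data_load: bool = False
    deps: tuple = ()  # names of the items this one depends on


class Scheduler:
    def __init__(self, items):
        self.items = {it.name: it for it in items}
        # Number of unfinished dependencies per item.
        self.dep_count = {it.name: len(it.deps) for it in items}
        # Reverse map: item name -> names of the items that depend on it.
        self.dependents = {it.name: [] for it in items}
        for it in items:
            for dep in it.deps:
                self.dependents[dep].append(it.name)
        # Items with no dependencies are available from the start.
        self.ready = deque(it.name for it in items if not it.deps)
        self.running = set()

    def complete(self, name):
        """An item finished: decrement the count of everything depending on it."""
        self.running.discard(name)
        for dependent in self.dependents[name]:
            self.dep_count[dependent] -= 1
            if self.dep_count[dependent] == 0:
                # Newly available items go to the head of the queue.
                self.ready.appendleft(dependent)

    def next_item(self):
        """A worker slot is free: choose what to run, or None if nothing is ready."""
        data_running = any(self.items[n].is_data_load for n in self.running)
        choice = None
        if not data_running:
            # Prefer the first available data load.
            choice = next(
                (n for n in self.ready if self.items[n].is_data_load), None
            )
        if choice is None and self.ready:
            # Otherwise (or, by assumption, if no data load is ready)
            # take the first available item.
            choice = self.ready[0]
        if choice is not None:
            self.ready.remove(choice)
            self.running.add(choice)
        return choice
```

Because a completed item's dependents are pushed onto the head of the queue, a finished data load leaves its indexes at the front, so they are the next thing taken whenever the data-load preference does not apply.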

This would mean that loading a table would probably be immediately followed by creation of its indexes, including PK and UNIQUE constraints, thus taking possible advantage of synchronised scans, data in file system buffers, etc.

Another question is what we should do if the user supplies an explicit order with --use-list. I'm inclined to say we should stick strictly with the supplied order. Or maybe that should be an option.


Thoughts and comments welcome.

cheers

andrew






