Tom Lane wrote:
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
At some point, I think we have to bite the bullet and find a way to use
multiple CPUs for a single load. I don't have any good ideas or plans
for that, but hopefully someone does.
As already mentioned upthread, we could do that today, with zero backend
changes, by making pg_restore drive multiple sessions. Now there are
scenarios where this wouldn't help too much --- eg, a database with only
one enormous table. We couldn't parallelize the loading of that table,
although we could parallelize creation of its indexes. But for an
example such as JD is complaining about, we ought to have no trouble
thrashing his disks into the ground ;-)
Bring it on ! but I would note that with the current pg_restore I was
not able to trash my disks. I only used four backends (I could have used
8) but even with that, I was only doing ~ 45M a second. So if I double,
I still have bandwidth. It would still be an huge improvement though.
What you would need for this is an -Fc or -Ft dump, because a plain
script output from pg_dump doesn't carry any dependency information,
I would focus on -Fc. With the limitations of -Ft this would be a good
way to start phasing -Ft out.
much less any index of where in the file different bits of data are.
Just armwaving, I envision a multiprocess pg_restore like this:
* one controller process that tracks the not-yet-loaded
TOC items in the dump
* N child processes that just transmit a selected TOC item
to a connected backend, and then ask the controller
what to do next
Most likely, the bottleneck with this sort of thing would be multiple
parallel reads from the pg_dump archive file. Possibly the controller
process could be taught to schedule COPY and CREATE INDEX operations
so that not too many processes are trying to read lots of archive
data at the same time.
A less hacker and more DBA bottleneck will be to limit the number of
backends being created for restore. We don't really want to have more
than one backend per CPU, otherwise we just start switching.
Sincerely,
Joshua D. Drake
---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate