On Mon, Mar 29, 2010 at 4:11 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Josh Berkus <j...@agliodbs.com> writes:
>> On 3/29/10 7:46 AM, Joachim Wieland wrote:
>>> I actually assume that whenever people are interested
>>> in a very fast dump, it is because they are doing some maintenance
>>> task (like migrating to a different server) that involves pg_dump. In
>>> these cases, they would stop their system anyway.
>
>> Actually, I'd say that there's a broad set of cases of people who want
>> to do a parallel pg_dump while their system is active. Parallel pg_dump
>> on a stopped system will help some people (for migration, particularly)
>> but parallel pg_dump with snapshot cloning will help a lot more people.
>
> I doubt that. My thought about it is that parallel dump will suck
> enough resources from the source server, both disk and CPU, that you
> would never want to use it on a live production machine. Not even at
> 2am. And your proposed use case is hardly a "broad set" in any case.
> Thus, Joachim's approach seems perfectly sane from here. I certainly
> don't see that there's an argument for spending 10x more development
> effort to pick up such use cases.
>
> Another question that's worth asking is exactly what the use case would
> be for parallel pg_dump against a live server, whether the snapshots are
> synchronized or not. You will not be able to use that dump as a basis
> for PITR, so there is no practical way of incorporating any changes that
> occur after the dump begins. So what are you making it for? If it's a
> routine backup for disaster recovery, fine, but it's not apparent why
> you want max speed and to heck with live performance for that purpose.
> I think migration to a new server version (that's too incompatible for
> PITR or pg_migrate migration) is really the only likely use case.

It's completely possible that you could want to clone a server for dev
and have more CPU and I/O bandwidth available than can be efficiently
used by a non-parallel pg_dump. But certainly what Joachim is talking
about will be a good start. I think there is merit to the
synchronized snapshot stuff for pg_dump and perhaps other applications
as well, but I think Joachim's (well-taken) point is that we don't
have to treat it as a hard prerequisite.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)