On 12/02/2010 11:44 PM, Joachim Wieland wrote:
On Thu, Dec 2, 2010 at 9:33 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
In particular, this issue *has* been discussed before, and there was a
consensus that preserving dump consistency was a requirement.  I don't
think that Joachim gets to bypass that decision just by submitting a
patch that ignores it.
I am not trying to bypass anything here :)  Regarding the locking
issue, I probably haven't done sufficient research; at least, I managed
to miss the emails that mentioned it. Anyway, that seems to be solved
now, fortunately; I'm going to implement your idea over the weekend.

Regarding snapshot cloning and dump consistency, I brought this up
several months ago already and asked whether the feature would be
considered useful even without snapshot cloning. And actually it was
you who motivated me to work on it even without snapshot consistency...

http://archives.postgresql.org/pgsql-hackers/2010-03/msg01181.php

In my patch, pg_dump emits a warning when called with -j; if you feel
better with an extra option
--i-know-that-i-have-no-synchronized-snapshots, that's fine with me :-)

In the end we provide a tool with limitations; it might not serve all
use cases, but there are use cases that would benefit a lot. I
personally think this is better than providing no tool at all...




I think Tom's statement there:

I think migration to a new server version (that's too incompatible for
PITR or pg_migrate migration) is really the only likely use case.

is just wrong. Say you have a site that's open 24/7, but there is a window of, say, six hours each day when it's almost, but not quite, quiet. You want to be able to take your disaster-recovery dump within that window, and the low level of traffic means you can afford the degraded performance that a parallel dump might cause. Or say you have a hot standby machine from which you want to take the dump while keeping max_standby_*_delay as low as possible. These are both cases where you want parallel dump and yet you also want dump consistency.

I have a client currently considering the latter setup, and the timing tolerances are a little tricky. The times at which the system is in a state we want dumped are fixed, and we need the dump to finish before the next such window rolls around. (This is a system that in effect makes one giant state change at a time.) If we can't complete the dump in that time, a delay is introduced into the system's critical path. Parallel dump will be very useful in helping us avoid that situation, but only if it's properly consistent.

I think Josh Berkus' comments in the thread you mentioned are correct:

Actually, I'd say that there's a broad set of cases of people who want
to do a parallel pg_dump while their system is active.  Parallel pg_dump
on a stopped system will help some people (for migration, particularly)
but parallel pg_dump with snapshot cloning will help a lot more people.
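
Just to make "snapshot cloning" concrete, a rough sketch of the idea at
the SQL level might look like the following. The names pg_export_snapshot()
and SET TRANSACTION SNAPSHOT are illustrative of the mechanism only, not
something the patch as submitted provides:

    -- first dump connection: open a transaction and publish its snapshot
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();   -- returns a token, e.g. '00000003-0000001B-1'

    -- every other dump worker: adopt that snapshot before reading anything
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1';
    -- all workers now see exactly the same committed state, so the
    -- parallel dump is as consistent as a single-process pg_dump

That's the piece that turns "parallel dump on a quiet system" into
"parallel dump, full stop".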



cheers

andrew


