On Tue, May 7, 2013 at 1:54 PM, Stephen Frost <sfr...@snowman.net> wrote:
> I believe the point that Tom is making is that we shouldn't paint
> ourselves into a corner by letting users provide old snapshots to
> pg_dump which haven't acquired any of the necessary locks. The goal, at
> least as I read it, is to come up with a workable design (and I don't
> know that we have, but still) which provides a way for the locks to be
> taken at least as quickly as what pg_dump does today and which we could
> modify down the road to take the locks pre-snapshot (presuming we can
> figure out a way to make that work).
One natural way to do it would be to add an option to pg_dump that does all the
normal pre-dump work, then exports a snapshot and waits for the user.
(Alternately it could even create a prepared transaction, which iirc keeps the
locks until it's committed.) That gives users a way to get a snapshot that is
guaranteed to keep working until that transaction is ended.

But I don't think that would really make users happy. I think the usual use
case for this feature would be to dump a single table, or a small number of
tables, as of some time in the past that they didn't plan in advance they would
need. They might have a cron job periodically export a snapshot "just in case"
and then want to use it later. They wouldn't be happy if they had to create a
prepared transaction for each such snapshot which locked every table in their
database until they decided they didn't actually need it. That would mean they
could never do any DDL.

The use case of wanting to dump a single table as of a few hours ago (e.g.
before some application data-loss bug) is pretty compelling. If we could do it
I think it would be worth quite a bit.

What's the worst case for using an old snapshot? If I try to access a table
that doesn't exist any longer I'll get an error. That doesn't really seem that
bad for the use case I described. It's worse for a full dump of all tables, but
for an explicit list of tables, eh, seems ok to me.

If I try to access a table whose schema has changed then I might use the wrong
tupledesc and see rows that don't decode properly. That would be a disaster.
Can we protect against that by noticing that the pg_class row isn't visible to
our snapshot and throwing an error? Would that be sufficient to protect against
all schema changes? Would it cause massive false positives depending on whether
vacuum happened to have run recently?

--
greg
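For concreteness, here is a rough SQL-level sketch of the pieces above:
exporting a snapshot from a long-lived session (the "just in case" cron job),
importing it later, and the proposed pg_class visibility check. The table name,
the snapshot values, and the xmin::text::bigint cast (which ignores xid epoch
and freezing, roughly the vacuum worry above) are illustrative only; a real
implementation would presumably do the check inside pg_dump or the backend
rather than in SQL, and whether checking pg_class alone covers every schema
change is exactly the open question.

    -- Exporting session (e.g. the cron job), which has to stay open for the
    -- exported snapshot to remain importable:
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();       -- e.g. returns '000003A1-1'
    SELECT txid_current_snapshot();    -- e.g. '1234:1234:', saved alongside it

    -- Dumping session, some time later, reusing that snapshot:
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '000003A1-1';

    -- Safety check, run under a *current* snapshot (not the imported one):
    -- is the table's current pg_class row visible to the saved snapshot?
    -- false, or no row at all, would mean the catalog row was changed or the
    -- table dropped after the snapshot, so the dump should throw an error
    -- rather than decode rows with the wrong tupledesc.
    SELECT txid_visible_in_snapshot(c.xmin::text::bigint,
                                    '1234:1234:'::txid_snapshot)
      FROM pg_class c
     WHERE c.relname = 'my_table';     -- ignoring schemas for brevity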