Haribabu Kommi <kommi.harib...@gmail.com> writes: > [ pg_dump-and-pg_dumpall-database-handling-refactoring_v12.patch ]
I've gone through this in a once-over-lightly fashion. Since there was quite a bit of debate upthread about how things should work, I'd like to start by summarizing the decisions this patch has made, in case anyone still wants to object to them: * pg_dump will now be responsible for dumping all per-database properties, including "ALTER ROLE IN DATABASE SET" settings. To get that behavior, pg_dumpall will invoke pg_dump using -C, or using the new switch --set-db-properties when working on "template1" or "postgres" databases, since those should already exist in the target installation. * pg_dumpall still won't dump template0 (or any other not-datallowconn database). This means that it will no longer propagate any non-default per-database properties for such databases. I think this is fine for template0: if you've messed with that at all, you are fooling with non user-serviceable parts. It's a bit less good for other DBs maybe, but on the whole it seems more consistent to simply treat nonconnectable DBs as not existing. (That could stand to be documented though.) * "pg_dumpall -g" will now produce only role- and tablespace-related output. * There isn't any direct way to ask pg_dump for "just the DB properties and nothing else" (although I suppose you can mostly fake it by adding "--schema nosuchschema" or some such). This makes it inconvenient to exactly duplicate the old behavior of "pg_dumpall -g", if someone wanted to do that to fit into their existing backup procedures. I'm not sure how important that is. Personally I'm content to lose it, but if there's enough pushback maybe we could invent a "--db-properties-only" switch? There are some other points that haven't been debated: * As the patch stands, --set-db-properties is implicit when you specify -C, and in fact the patch goes to the trouble of throwing an error if you try to specify both switches. I'm inclined to think this might be a bad idea. What about saying that -C enables emitting CREATE DATABASE and reconnecting to that DB, and independently of that, --set-db-properties enables emitting the additional per-database properties? This seems simpler to understand, more flexible, and less of a change from the previous behavior of -C. On the other hand you could argue that people would always want --set-db-properties with -C and so we're merely promoting carpal tunnel (and errors of omission) if we do it like that. If so, maybe we could say "-C implies --set-db-properties by default, but if you really don't want that, you can say -C --no-set-db-properties". Perhaps the only application of this is to reproduce pg_dump's historical behavior, but that's probably of some value. * The patch fails to make any provision for deciding at pg_restore time whether to emit DB properties. Considering that you can use -C at restore time, I think it's pretty awful that you can't use --set-db-properties then. Moreover, not only has the patch not added a separate "DATABASE PROPERTIES" TOC entry type as was originally proposed, what it's actually doing is emitting two identically labeled "DATABASE" TOC entries, one with the CREATE and the other with the rest. That's simply horrid. I think we need to do this properly and emit two distinguishable TOC entries, and control which of them get printed on the restore side of the logic, not the TOC entry construction side. (I'm not sure at this point if we need an archive version bump to do that, but perhaps not --- in the past we've added new TOC types without a version bump.) * Some years ago, commit 4bd371f6f caused pg_dumpall to emit "SET default_transaction_read_only = off;" in its scripts, so that it could successfully handle DBs that have "SET default_transaction_read_only = on" as a database property. This was intentionally NOT done to pg_dump, arguing that doing so would greatly weaken such a setting as a defense against stupid errors (i.e. restoring into your production database). The patch currently ignores that reasoning and moves the setting into pg_dump anyway. Haribabu mentioned this point upthread and got basically no response, but I'm concerned about it. I think however that we might be able to dodge that issue as a byproduct of fixing the previous problem. If the CREATE DATABASE is done as one TOC entry, and we reconnect to the target DB after processing that entry, and then we process the DATABASE PROPERTIES entry, then any DB property settings we've installed will not apply in our existing session. So I think we can just leave out the "SET default_transaction_read_only = on" altogether. Obviously this puts a premium on not reconnecting, but reconnecting would break other more important features like --single-transaction, so I am not worried about that. * An issue in connection with that is that because you can't issue ALTER DATABASE SET TABLESPACE against the current DB, pg_dumpall currently takes the trouble to explicitly reconnect to a different DB and then back to the target DB when issuing that command. That doesn't play nice with my suggestion above, nor with --single-transaction. I'm still thinking about how to fix that, but probably the best fix involves handling non-default tablespaces by including them in the initial CREATE DATABASE command, plus emitting an ALTER DATABASE SET TABLESPACE in a separate new "DATABASE TABLESPACE" TOC object, which we would have to teach the pg_restore logic to handle correctly. At minimum it'd need to be aware about the interaction with --single-transaction; since we have to have that anyway, we could also make it do the extra-reconnections dance instead of hard-wiring \connect commands into the text of the entry, and/or skip the whole thing if it knows it emitted the CREATE DATABASE command. Comments? If there's not objections I plan to push forward on this. regards, tom lane