As I work through the extensions patch, the aspect of it that I like the least is the NO USER DATA clause and related functions. I think it's badly designed, badly implemented, and doesn't solve the problem. If we accept it as-is, we'll soon have to rework it, and at that point we'll be left with a large backwards-compatibility issue.
To recap: the problem we need to solve is the possibility that an extension has configuration tables whose contents might get altered within a particular database; if so, we need to have pg_dump dump out those alterations so that the configuration will be preserved. This isn't just an academic issue; contrib/tsearch2 for example had tables exactly like that, before it got absorbed into core, and I believe PostGIS has such today. The solution offered by the submitted patch looks like this: 1. The CREATE EXTENSION command has an optional clause NO USER DATA, which users are not supposed to include when initially loading an extension, but which pg_dump will always include when emitting CREATE EXTENSION in dump scripts. 2. There is a function pg_extension_with_user_data() that an extension's SQL script can call to detect whether NO USER DATA was specified. 3. There is a function pg_extension_flag_dump(oid) that an extension's SQL script can call to mark a created object as (sort of) not part of the extension after all, so that pg_dump will dump that object anyway. The way that you'd use these features to solve the problem is to make the extension's SQL script flag the configuration table for dumping, and also avoid creating the table or loading any predefined entries into it if NO USER DATA is active. (Which seems to mean that that option is named backwards, but I digress.) IMO this approach has got numerous serious problems: 1. Because the flag operates at the whole-SQL-object level, the only way to have a configuration table is to exclude the whole table definition from being treated as part of the extension. This makes it impossible to upgrade the table definition (eg, add columns or indexes) when installing a new version of the extension. This also makes it really easy to break the extension during selective restores ... there's no assurance the table will get created at all. 2. Also, because we're working at the whole-SQL-object level, there's no good way to deal with the possibility that the configuration table contains some rows created by the extension itself and others added by the user. All such rows will get dumped and reloaded separately from the extension, meaning there's no way to upgrade the extension-provided initial data either. 3. Use of pg_extension_with_user_data() requires the extension SQL script to contain conditional logic, which is not easy unless you want to assume plpgsql is available. Which extensions really should not do. (Once again: plpgsql may be installed by default, but that doesn't mean it's guaranteed to be present.) 4. If the SQL script uses pg_extension_with_user_data() to decide not to create the table, then it has no way to apply pg_extension_flag_dump, with the result that after a dump and reload cycle, the configuration table doesn't appear to be related to the extension at all. Thus the user has another way to break the extension, which is to accidentally drop the configuration table. So basically I think this approach doesn't work, can't be made to work, and imposes unreasonable burdens on extension authors anyway. If we're going to support user-modifiable configuration tables, we need to do it a different way. After a bit of thought I believe that we can fix this if we are willing to teach pg_dump explicitly about extension configuration tables. The behavior we want for those is for the table schema definition to never be dumped (the table should always be created by CREATE EXTENSION), but for some subset of the table data to get dumped, excluding any system-provided rows. An extension that wants to make use of that ability would probably need to add a boolean column "system_data" or something similar to its configuration tables. Having done so, the extension's SQL script could call a function that identifies the configuration table and tells how to decide which rows to dump, along the lines of pg_extension_partial_dump (table_name regclass, where_condition text) for example, SELECT pg_extension_partial_dump('my_table', 'WHERE NOT system_data'); I'm envisioning that this function would add that information to the pg_extension entry, and then pg_dump would use it to select out a subset of rows that it would dump as user data, even though it's not dumping the table definition. One interesting point is that for this to work during pg_upgrade, we'd need to include those user-data rows in the upgrade dump, otherwise they'd get lost. So at least for pg_upgrade, those rows need to be considered schema not data. I'm not sure if we should handle them that way all the time or whether pg_dump should special-case this for binary upgrades. I've not attempted to implement this idea, but offhand it looks like it would take a day or two's work, which seems well worthwhile to expend now in the hope of getting this feature right the first time. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers