Re: proposal: possibility to read dumped table's name from file

Tomas Vondra Tue, 13 Jul 2021 07:13:44 -0700



On 7/13/21 3:40 PM, Stephen Frost wrote:

Greetings,

* Tom Lane ([email protected]) wrote:

Alvaro Herrera <[email protected]> writes:

[1] your proposal of "[+-] OBJTYPE OBJIDENT" plus empty lines allowed
     plus lines starting with # are comments, seems plenty.  Any line not
     following that format would cause an error to be thrown.


I'd like to see some kind of keyword on each line, so that we could extend
the command set by adding new keywords.  As this stands, I fear we'd end
up using random punctuation characters in place of [+-], which seems
pretty horrid from a readability standpoint.


I agree that it'd end up being bad with single characters.

The [+-] format is based on what rsync does, so there's at least some
precedent for that, and IMHO it's fairly readable. I agree the rest of
the rule (object type, ...) may be a bit more verbose.

I think that this file format should be designed with an eye to allowing
every, or at least most, pg_dump options to be written in the file rather
than on the command line.  I don't say we have to *implement* that right
now; but if the format spec is incapable of being extended to meet
requests like that one, I think we'll regret it.  This line of thought
suggests that the initial commands ought to match the existing
include/exclude switches, at least approximately.


I agree that we want to have an actual config file that allows just
about every pg_dump option.  I'm also fine with saying that we don't
have to implement that initially but the format should be one which can
be extended to allow that.

I understand the desire to have a config file that may contain all
pg_dump options, but I really don't see why we'd want to mix that with
the file containing filter rules.

I think those should be separate, one of the reasons being that I find
it desirable to be able to "include" the filter rules into different
pg_dump configs. That also means the format for the filter rules can be
much simpler.

It's also not clear to me whether the single-file approach would allow
filtering not supported by an actual pg_dump option, for example.

Hence I suggest

        include table PATTERN
        exclude table PATTERN

which ends up being the above but with words not [+-].

Works for me.
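
To make the keyword-based rules above concrete, here is a minimal sketch of a reader for such a file. It is not the patch under discussion and not how pg_dump parses anything; the function name, the `FilterRule` type and the set of accepted object types are assumptions made up for illustration. It only follows the rules stated in this thread: empty lines are allowed, lines starting with # are comments, and any other line that does not match the format raises an error.

```python
from typing import List, NamedTuple


class FilterRule(NamedTuple):
    action: str   # "include" or "exclude"
    objtype: str  # e.g. "table"
    pattern: str  # object name pattern, kept verbatim


ACTIONS = {"include", "exclude"}
# Assumption: the thread only shows "table"; "schema" is added purely
# to illustrate that the keyword set can be extended.
OBJTYPES = {"table", "schema"}


def parse_filter(text: str) -> List[FilterRule]:
    """Parse 'ACTION OBJTYPE PATTERN' lines into a list of rules."""
    rules = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Empty lines and comment lines are skipped.
        if not line or line.startswith("#"):
            continue
        # Split into at most three fields; the pattern keeps inner spaces.
        parts = line.split(None, 2)
        if (len(parts) != 3
                or parts[0] not in ACTIONS
                or parts[1] not in OBJTYPES):
            raise ValueError(f"line {lineno}: invalid filter rule: {raw!r}")
        rules.append(FilterRule(*parts))
    return rules
```

For example, `parse_filter("include table public.*\nexclude table tmp_*\n")` returns two `FilterRule` tuples, while a line such as `+ table foo` is rejected with a `ValueError`.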

Which ends up inventing yet-another-file-format which people will end up
writing generators and parsers for.  Which is exactly what I was arguing
we really should be trying to avoid doing.

People will have to write generators *in any case* because how else
would you use this? Unless we also provide tools to manipulate that file
(which seems rather futile), they'll have to do that. Even if we used
JSON/YAML/TOML/... they'd still need to deal with the semantics of the
file format.

FWIW I don't understand why they would need to write parsers. That's
something we'd need to do to process the file. I think the case when the
filter file needs to be modified is rather rare - it certainly is not
what the original use case Pavel tried to address needs. (I know that
customer and the filter would be generated and used for a single dump.)

My opinion is that the best solution (to make both generators and
parsers simple) is to keep the format itself as simple as possible.
Which is exactly why I'm arguing for only addressing the filtering, not
trying to invent a "universal" pg_dump config file format.

I definitely feel that we should have a way to allow anything that can
be created as an object in the database to be explicitly included in the
file and that means whatever we do need to be able to handle objects
that have names that span multiple lines, etc.  It's not clear how the
above would.  As I recall, the proposed patch didn't have anything for
handling that, which was one of the issues I had with it and is why I
bring it up again.

I really don't understand why you think the current format can't do
escaping/quoting or handle names spanning multiple lines. The fact that
the original patch did not handle that correctly is a bug, but it does
not mean the format can't handle that.
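
As a rough illustration of that point, the sketch below reads the same kind of rules but lets the pattern contain double-quoted parts, in which a newline is ordinary data and "" stands for a literal quote. This is only one possible quoting convention (borrowed from SQL identifiers), assumed here for the example; it is not taken from the patch, and `parse_rules` is a hypothetical name. For simplicity the quotes are removed from the returned pattern, which a real implementation would probably not do.

```python
from typing import List, Tuple


def parse_rules(text: str) -> List[Tuple[str, str, str]]:
    """Parse rules whose pattern may contain quoted, multi-line names."""
    rules = []
    i, n = 0, len(text)
    while i < n:
        if text[i].isspace():          # blank space between rules
            i += 1
            continue
        if text[i] == "#":             # comment runs to end of line
            while i < n and text[i] != "\n":
                i += 1
            continue
        words = []
        for _ in range(2):             # action keyword, then object type
            while i < n and text[i] in " \t":
                i += 1
            start = i
            while i < n and not text[i].isspace():
                i += 1
            words.append(text[start:i])
        while i < n and text[i] in " \t":
            i += 1
        buf, in_quotes, keep = [], False, 0
        while i < n:
            ch = text[i]
            if in_quotes:
                if ch == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        buf.append('"')    # "" is an escaped quote
                        i += 2
                        continue
                    in_quotes = False
                    keep = len(buf)        # quoted text is never trimmed
                else:
                    buf.append(ch)         # newlines are data inside quotes
            elif ch == '"':
                in_quotes = True
            elif ch == "\n":               # unquoted newline ends the rule
                break
            else:
                buf.append(ch)
            i += 1
        if in_quotes:
            raise ValueError("unterminated quoted name")
        pattern = "".join(buf)
        pattern = pattern[:keep] + pattern[keep:].rstrip()
        if words[0] not in ("include", "exclude") or not words[1] or not pattern:
            raise ValueError(f"invalid filter rule near offset {i}")
        rules.append((words[0], words[1], pattern))
    return rules
```

With this convention, a table whose name contains a line break can be written as a quoted name that simply continues on the next line of the filter file, and the rule ends at the first newline outside quotes.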



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
