Re: proposal: possibility to read dumped table's name from file

Pavel Stehule Sat, 12 Nov 2022 12:36:56 -0800

pá 11. 11. 2022 v 9:11 odesílatel Julien Rouhaud <rjuju...@gmail.com>
napsal:


> Hi,
>
> On Sat, Nov 05, 2022 at 08:54:57PM +0100, Pavel Stehule wrote:
> >
> > pá 4. 11. 2022 v 14:28 odesílatel Julien Rouhaud <rjuju...@gmail.com>
> > napsal:
> >
> > > > But one thing I noticed is that "optarg" looks wrong here:
> > > >
> > > > simple_string_list_append(&opts->triggerNames, optarg);
> > >
> > > Ah indeed, good catch!  Maybe there should be an explicit test for
> every
> > > (include|exclude) / objtype combination?  It would be a bit verbose
> (and
> > > possibly hard to maintain).
> > >
> >
> > yes - pg_restore is not well covered by  tests, fixed
> >
> > I found another issue. The pg_restore requires a full signature of the
> > function and it is pretty sensitive on white spaces (pg_restore).
>
> Argh, indeed.  It's a good thing to have expanded the regression tests :)
>
> > I made a
> > mistake when I partially parsed patterns like SQL identifiers. It can
> work
> > for simple cases, but when I parse the function's signature it stops
> > working. So I rewrote the parsing pattern part. Now, I just read an input
> > string and I try to reduce spaces. Still multiline identifiers are
> > supported. Against the previous method of pattern parsing, I needed to
> > change just one regress test - now I am not able to detect garbage after
> > pattern :-/.
>
> I'm not sure it's really problematic.  It looks POLA-violation compatible
> with
> regular pg_dump options, for instance:
>
> $ echo "include table t1()" | pg_dump --filter - | ag CREATE
> CREATE TABLE public.t1 (
>
> $ pg_dump -t "t1()" | ag CREATE
> CREATE TABLE public.t1 (
>
> $ echo "include table t1()blabla" | pg_dump --filter - | ag CREATE
> pg_dump: error: no matching tables were found
>
> $ pg_dump -t "t1()blabla" | ag CREATE
> pg_dump: error: no matching tables were found
>
> I don't think the file parsing code should try to be smart about checking
> the
> found patterns.
>
> > * function's filtering doesn't support schema - when the name of function
> > is specified with schema, then the function is not found
>
> Ah I didn't know that.  Indeed it only expect a non-qualified identifier,
> and
> would restore any function that matches the name (and arguments), possibly
> multiple ones if there are variants in different schema.  That's unrelated
> to
> this patch though.
>
> > * the function has to be specified with an argument type list - the
> > separator has to be exactly ", " string. Without space or with one space
> > more, the filtering doesn't work (new implementation of pattern parsing
> > reduces white spaces sensitivity). This is not a bug, but it is not well
> > documented.
>
> Agreed.
>
> > attached updated patch
>
> It looks overall good to me!  I just have a few minor nitpicking
> complaints:
>
> - you removed the pg_strip_clrf() calls and declared everything as "const
> char
>   *", so there's no need to explicitly cast the filter_get_keyword()
> arguments
>   anymore
>

removed


>
> Note also that the code now relies on the fact that there are some non-zero
> bytes after a pattern to know that no errors happened.  It's not a problem
> as
> you should find an EOF marker anyway if CLRF were stripped.
>

I am not sure if I understand this note well?


>
> + * Following routines are called from pg_dump, pg_dumpall and pg_restore.
> + * Unfortunately, implementation of exit_nicely in pg_dump and pg_restore
> + * is different from implementation of this routine in pg_dumpall. So
> instead
> + * of directly calling exit_nicely we have to return some error flag (in
> this
> + * case NULL), and exit_nicelly will be executed from caller's routine.
>
> Slight improvement:
> [...]
> Unfortunately, the implementation of exit_nicely in pg_dump and pg_restore
> is
> different from the one in pg_dumpall, so instead of...
>
> + * read_pattern - reads an pattern from input. The pattern can be mix of
> + * single line or multi line subpatterns. Single line subpattern starts
> first
> + * non white space char, and ending last non space char on line or by char
> + * '#'. The white spaces inside are removed (around char ".()"), or
> reformated
> + * around char ',' or reduced (the multiple spaces are replaced by one).
> + * Multiline subpattern starts by double quote and ending by this char
> too.
> + * The escape rules are same like for SQL quoted literal.
> + *
> + * Routine signalizes error by returning NULL. Otherwise returns pointer
> + * to next char after last processed char in input string.
>
>
> typo: reads "a" pattern from input...
>

fixed


>
> I don't fully understand the part about subpatterns, but is that necessary
> to
> describe it?  Simply saying that any valid and possibly-quoted identifier
> can
> be parsed should make it clear that identifiers containing \n characters
> should
> work too.  Maybe also just mention that whitespaces are removed and special
> care is taken to output routines in exactly the same way calling code will
> expect it (that is comma-and-single-space type delimiter).
>

In this case I hit the limits of my English language skills.

I rewrote this comment, but it needs more care. Please, can you look at it?

Re: proposal: possibility to read dumped table's name from file

Reply via email to