On Tue, Jun 4, 2024 at 4:28 AM Michael Paquier <mich...@paquier.xyz> wrote:
> On Fri, Apr 26, 2024 at 06:38:22PM +0530, Ashutosh Bapat wrote:
> > Some points for discussion:
> > 1. The test still hardcodes the diffs between two dumps. I haven't
> > found a better way to do it. I did consider removing the problematic
> > objects from the regression database but decided against it since we
> > would lose some coverage.
> >
> > 2. The new code tests dump and restore of just the regression
> > database and does not use pg_dumpall like pg_upgrade. Should it
> > instead perform pg_dumpall? I decided against it since a. we are
> > interested in dumping and restoring objects left behind by
> > regression, and b. I didn't find a way to provide the format option
> > to pg_dumpall. The test could be enhanced to use different dump
> > formats.
> >
> > I have added it to the next commitfest.
> > https://commitfest.postgresql.org/48/4956/
>
> Ashutosh and I have discussed this patch a bit last week. Here is a
> short summary of my input, after I understood what is going on.
>
> +    # We could avoid this by dumping the database loaded from the
> +    # original dump. But that would change the state of the objects
> +    # as left behind by the regression.
> +    my $expected_diff = " --
> + CREATE TABLE public.gtestxx_4 (
> +-    b integer,
> +-    a integer NOT NULL
> ++    a integer NOT NULL,
> ++    b integer
> + )
> [...]
> +    my ($stdout, $stderr) =
> +      run_command([ 'diff', '-u', $dump4_file, $dump5_file ]);
> +    # Clear file names and line numbers from the diffs; those are not
> +    # going to remain the same always. Also clear empty lines and
> +    # normalize newline characters across platforms.
> +    $stdout =~ s/^\@\@.*$//mg;
> +    $stdout =~ s/^.*$dump4_file.*$//mg;
> +    $stdout =~ s/^.*$dump5_file.*$//mg;
> +    $stdout =~ s/^\s*\n//mg;
> +    $stdout =~ s/\r\n/\n/g;
> +    $expected_diff =~ s/\r\n/\n/g;
> +    is($stdout, $expected_diff,
> +       'old and new dumps match after dump and restore');
> +}
>
> I am not a fan of what this patch does, adding the knowledge related
> to the dump filtering within 002_pg_upgrade.pl. Please do not take me
> wrong, I am not against the idea of adding that within this pg_upgrade
> test to save one full cycle of `make check` when checking the
> consistency of the dump. My issue is that this logic should be
> externalized, and it should be in fewer lines of code.
>
> For the externalization part, Ashutosh and I considered a few ideas,
> but one that we found tempting is to create a small .pm, say named
> AdjustDump.pm. This would share some rules with the existing
> AdjustUpgrade.pm, which would be fine IMO even if there is a small
> overlap, documenting the dependency between the two modules. That
> makes the integration with the buildfarm much simpler by not creating
> more dependencies with the modules shared between core and the
> buildfarm code. For the "shorter" part, one idea that I had is to
> apply to the dump a regexp that wipes out the column definitions
> within the parentheses, keeping the CREATE TABLE and any other
> attributes not impacted by the reordering. All that should be
> documented in the module, of course.

Thanks for the suggestion. I didn't understand the dependency on the
buildfarm module. Will the new module be used by the buildfarm
separately? I will work on this soon. Thanks for changing the CF entry
to WoA.

> Another thing would be to improve the backend so that we have better
> support for physical column ordering, which would, I assume (and
> correct me if I'm wrong!), prevent the reordering of the attributes
> like in this inheritance case.
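[As an editorial aside: the attribute reordering discussed here can be reproduced with a short, purely illustrative example; this is not the exact DDL from the regression suite.]

```sql
-- Illustrative sketch of how such a diff can arise. A child table
-- created on its own and attached with ALTER TABLE ... INHERIT keeps
-- its declared column order (b, a):
CREATE TABLE parent (a integer NOT NULL);
CREATE TABLE child (b integer, a integer NOT NULL);
ALTER TABLE child INHERIT parent;

-- A plain (non-binary-upgrade) dump recreates the child roughly as
--   CREATE TABLE child (b integer, a integer NOT NULL) INHERITS (parent);
-- which goes through MergeAttributes() on restore. MergeAttributes()
-- places the inherited column a first, so a second dump of the
-- restored database prints (a, b), producing the hardcoded diff shown
-- in the quoted patch.
```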
> But that would not address the case of dumps taken from older
> versions with a new version of pg_dump, which is something that may
> be interesting to have more tests for in the long term. Overall a
> module sounds like a better solution.

Changing the physical order of the columns of a child table based on
the inherited table seems intentional, per MergeAttributes(). That
logic looks sane by itself. In binary-upgrade mode pg_dump works very
hard to retain the column order by issuing UPDATE commands against the
catalog tables. I don't think mimicking that behaviour is the right
choice for a non-binary dump. I agree with your conclusion that we
should instead fix this by adjusting the diffs. The code to do that
will be part of a separate module.

--
Best Wishes,
Ashutosh Bapat
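[Editorial note: the "wipe out the column definitions within the parentheses" normalization discussed in this thread could be sketched roughly as below. The function name is hypothetical and this is not the committed code; it only shows the shape of the regexp idea.]

```perl
use strict;
use warnings;

# Sketch only: empty the column list of each CREATE TABLE statement so
# that attribute reordering no longer shows up when two dumps are
# diffed. The CREATE TABLE line itself and anything after the closing
# parenthesis (e.g. INHERITS or PARTITION BY clauses) are preserved.
sub adjust_dump_for_compare
{
	my ($dump) = @_;

	# pg_dump puts the closing ")" at the start of a line, so blank
	# out everything between "CREATE TABLE foo (" and that ")".
	$dump =~ s/(^CREATE\s+TABLE\s+[^(]+\()(?:.*?)(^\))/$1\n$2/msg;

	return $dump;
}
```

With this, both dumps of the example child table reduce to the same `CREATE TABLE public.child (\n)` stanza regardless of column order, at the cost of no longer comparing the column definitions themselves.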