On Fri, Apr 26, 2024 at 06:38:22PM +0530, Ashutosh Bapat wrote:
> Some points for discussion:
> 1. The test still hardcodes the diffs between the two dumps.  I haven't
> found a better way to do it.  I did consider removing the problematic
> objects from the regression database, but decided against it since we
> would lose some coverage.
>
> 2. The new code tests dump and restore of just the regression database
> and does not use pg_dumpall like pg_upgrade.  Should it instead perform
> pg_dumpall?  I decided against it since a. we are interested in dumping
> and restoring the objects left behind by regression, and b. I didn't
> find a way to provide the format option to pg_dumpall.  The test could
> be enhanced to use different dump formats.
>
> I have added it to the next commitfest.
> https://commitfest.postgresql.org/48/4956/
Ashutosh and I discussed this patch a bit last week.  Here is a short
summary of my input, after I understood what is going on.

+	# We could avoid this by dumping the database loaded from the original
+	# dump.  But that would change the state of the objects as left behind
+	# by the regression.
+	my $expected_diff = " --
+ CREATE TABLE public.gtestxx_4 (
+-    b integer,
+-    a integer NOT NULL
++    a integer NOT NULL,
++    b integer
+ )
[...]
+	my ($stdout, $stderr) =
+		run_command([ 'diff', '-u', $dump4_file, $dump5_file ]);
+	# Clear file names and line numbers from the diffs; those are not
+	# going to remain the same always.  Also clear empty lines and
+	# normalize new line characters across platforms.
+	$stdout =~ s/^\@\@.*$//mg;
+	$stdout =~ s/^.*$dump4_file.*$//mg;
+	$stdout =~ s/^.*$dump5_file.*$//mg;
+	$stdout =~ s/^\s*\n//mg;
+	$stdout =~ s/\r\n/\n/g;
+	$expected_diff =~ s/\r\n/\n/g;
+	is($stdout, $expected_diff, 'old and new dumps match after dump and restore');
+}

I am not a fan of what this patch does, embedding the knowledge related
to dump filtering within 002_pg_upgrade.pl.  Please do not get me
wrong: I am not against the idea of adding that within this pg_upgrade
test to save one full cycle of `make check` when checking the
consistency of the dump.  My issue is that this logic should be
externalized, and it should take fewer lines of code.

For the externalization part, Ashutosh and I considered a few ideas,
and one that we found tempting is to create a small .pm, say named
AdjustDump.pm.  This would share some rules with the existing
AdjustUpgrade.pm, which would be fine IMO even if there is a small
overlap, as long as the dependency between the two modules is
documented.  That also makes the integration with the buildfarm much
simpler by not creating more dependencies with the modules shared
between core and the buildfarm code.
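To make the normalization rules in the hunk above concrete, here is a
small sketch of what an externalized "adjust the diff" helper could
centralize.  It is written in Python purely for illustration (the real
helper would be Perl, e.g. in the proposed AdjustDump.pm), and the
function name is mine, not from the patch:

```python
import re

def adjust_dump_diff(diff_text, file_a, file_b):
    """Normalize a `diff -u` of two pg_dump outputs before comparing it
    against an expected diff: drop hunk headers and file-name lines
    (their line numbers and paths vary between runs), drop empty lines,
    and normalize newlines across platforms."""
    # Drop "@@ -N,M +N,M @@" hunk headers.
    diff_text = re.sub(r'^@@.*$', '', diff_text, flags=re.M)
    # Drop the "---"/"+++" lines naming the dump files.
    diff_text = re.sub(r'^.*' + re.escape(file_a) + r'.*$', '',
                       diff_text, flags=re.M)
    diff_text = re.sub(r'^.*' + re.escape(file_b) + r'.*$', '',
                       diff_text, flags=re.M)
    # Drop empty (or whitespace-only) lines left behind.
    diff_text = re.sub(r'^\s*\n', '', diff_text, flags=re.M)
    # Normalize Windows newlines.
    return diff_text.replace('\r\n', '\n')
```

Each substitution mirrors one of the Perl `s///mg` rules quoted above,
so the same behavior could live in one documented place instead of the
test script.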
For the "shorter" part, one idea that I had is to apply to the dump a
regexp that wipes out the column definitions within the parenthesis,
keeping around the CREATE TABLE line and any other attributes not
impacted by the reordering.  All that should be documented in the
module, of course.

Another thing would be to improve the backend so that we have better
support for physical column ordering, which would, I assume (and
correct me if I'm wrong!), prevent the reordering of the attributes in
this inheritance case.  But that would not address the case of dumps
taken from older versions with a new version of pg_dump, which is
something that may be interesting to have more tests for in the long
term.  Overall, a module sounds like a better solution.
--
Michael
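[Editor's note: as an illustration of the column-list wipe idea
mentioned above, here is one possible shape for such a regexp.  Python
is used only for sketching (the real rule would live in the Perl
AdjustDump.pm); the function name and placeholder are hypothetical,
and the pattern assumes pg_dump's layout where the column list opens
with "(" at the end of the CREATE TABLE line and closes with ")" at
the start of a line:]

```python
import re

def strip_column_lists(dump_text):
    """Replace the body of each CREATE TABLE's parenthesized column
    list with a '...' placeholder, so that attribute reordering (as in
    the inheritance case above) no longer shows up in dump-vs-dump
    diffs, while the CREATE TABLE line and trailing attributes such as
    INHERITS (...) are kept for comparison."""
    return re.sub(
        # Group 1: the "CREATE TABLE name (" line; group 2: the ")"
        # that closes the column list at the start of a line.
        r'(^CREATE TABLE [^\n]*\(\n).*?(^\))',
        r'\1...\n\2',
        dump_text,
        flags=re.M | re.S)
```

With a rule like this, the hardcoded $expected_diff in the test could
shrink or disappear, since both dumps would normalize to the same text.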