On Thu, Nov 14, 2024 at 4:16 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
>
> > But see next
> >
> > >
> > > What's the advantage of testing all the formats? Would that stuff
> > > have been able to catch up more issues related to specific format(s)
> > > when it came to the compression improvements with inheritance?
> >
> > I haven't caught any more issues with formats other than "plain". It
> > is more for future-proof testing. I am fine if we want to test just
> > plain dump format for now. Adding more formats would be easier if
> > required.
>
> Not done for now. Given that the 'directory' formats dumps the tables
> in separate directories, and thus has some impact on how child tables
> would be dumped and restored, I think we should at least have plain
> and directory tested in this test. But I will wait for other opinion
> before removing formats other than plain.

I gave this another thought. Looking at the documentation [1], each
format does something different that affects the way objects are dumped
and restored. Eliminating any one of them means losing the corresponding
coverage in dump or restore. So I have left this untouched again.

>
> Interestingly, I have caught a new difference in dump from original
> and restored database. See the difference between attached plain dump
> files. I will start a new thread to see if this difference is
> legitimate. Had this test been part of core, we would have caught it
> earlier.
>
> Because of this difference, the test is failing. I will wait for the
> conclusion on the other thread before adding more adjustments.
>

The new test uncovered an issue related to NOT NULL constraints [2]. We
have committed a fix for that bug. So far this test has unearthed two
bugs in committed changes in just one year. That proves the worth of
this test. There are many projects in flight that implement new objects
or new states of existing objects. I think this test will help in all
those projects.

I have rebased my patches on the current HEAD. The test now passes and
does not show any new diff or bug. I have squashed all the patches into
one.

While rebasing I found that 002_compare_backups has changed the way it
compares dumps slightly. I have left it outside of this patch for now.

>
> I am not against the other suggestions to make the functions, code
> added by this patch more general and extensible. But without an
> example or case for such generalization and/or extensibility, it's
> hard to get it right. And the functions and code are isolated enough
> that we could generalize and extend them if the need arises.

We can work on extending this further after the basic test is
committed. But if we delay committing the test for the sake of
extensibility, we might miss another bug.

[1] https://www.postgresql.org/docs/current/app-pgdump.html
[2] https://www.postgresql.org/message-id/caexhw5tbdgakdfqjdj-7fk6pjthg8d4zuf6fq4h2pq8zk38...@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
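
The point above, that each dump format takes a different path through dump and restore, can be sketched in a few lines of Perl. This is only an illustration of the mapping the test relies on (first letter of the format name for pg_dump's -F option, psql for plain dumps, pg_restore for the archive formats). The helper name dump_and_restore_commands, the connection strings and the /tmp paths are hypothetical, and nothing is executed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): build the pg_dump and restore
# command lines for one dump format. The commands are only built, never run.
sub dump_and_restore_commands
{
	my ($format, $src_connstr, $dst_connstr, $dump_file) = @_;

	# pg_dump's -F option accepts the first letter of the format name:
	# p(lain), t(ar), d(irectory), c(ustom).
	my $format_spec = substr($format, 0, 1);
	my @dump_cmd = (
		'pg_dump', "-F$format_spec", '--no-sync',
		'-d', $src_connstr, '-f', $dump_file);

	# A plain dump is an SQL script and is replayed with psql; the archive
	# formats are restored with pg_restore.
	my @restore_cmd =
	  $format eq 'plain'
	  ? ('psql', '-d', $dst_connstr, '-f', $dump_file)
	  : ('pg_restore', '-d', $dst_connstr, $dump_file);

	return (\@dump_cmd, \@restore_cmd);
}

# Example values below are assumptions chosen for illustration.
for my $format ('plain', 'tar', 'directory', 'custom')
{
	my ($dump_cmd, $restore_cmd) = dump_and_restore_commands(
		$format, 'dbname=regression',
		"dbname=regression_$format",
		"/tmp/regression_dump.$format");
	print join(' ', @$dump_cmd), "\n";
	print join(' ', @$restore_cmd), "\n";
}
```

Dropping a format from the loop removes both its pg_dump writer and its restore path from coverage, which is the reason given above for keeping all four.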

From a70b42bb88e7c885a67913b67f630c2e2ea6faa5 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Thu, 27 Jun 2024 10:03:53 +0530
Subject: [PATCH] Test pg_dump/restore of regression objects

002_pg_upgrade.pl tests pg_upgrade of the regression database left
behind by regression run. Modify it to test dump and restore of the
regression database as well.

Regression database created by regression run contains almost all the
database objects supported by PostgreSQL in various states. Hence the
new testcase covers dump and restore scenarios not covered by individual
dump/restore cases. Many regression tests mention that they leave
objects behind for dump/restore testing. But till now 002_pg_upgrade
only tested dump/restore through pg_upgrade which is different from
dump/restore through pg_dump. Adding the new testcase closes that gap.

Testing dump and restore of regression database makes this test run
longer for a relatively smaller benefit. Hence run it only when
explicitly requested by user by specifying "regress_dump_test" in
PG_TEST_EXTRA.

Note for the reviewer:
The new test has uncovered two bugs so far in one year.
1. Introduced by 14e87ffa5c54. Fixed in fd41ba93e4630921a72ed5127cd0d552a8f3f8fc.
2. Introduced by 0413a556990ba628a3de8a0b58be020fd9a14ed0. Reverted in 74563f6b90216180fc13649725179fc119dddeb5.

Multiple tests compare pg_dump outputs taken from two clusters in plain
format as a way to compare the contents of those clusters. Add
PostgreSQL::Test::Utils::compare_dumps() to standardize and modularize
the comparison.

Author: Ashutosh Bapat
Reviewed by: Michael Paquier, Tom Lane
Discussion: https://www.postgresql.org/message-id/CAExHW5uF5V=Cjecx3_Z=7xfh4rg2wf61pt+hfquzjbqourz...@mail.gmail.com
---
 doc/src/sgml/regress.sgml                   |  11 ++
 src/bin/pg_upgrade/t/002_pg_upgrade.pl      | 145 +++++++++++++++++---
 src/test/perl/Makefile                      |   2 +
 src/test/perl/PostgreSQL/Test/AdjustDump.pm | 122 ++++++++++++++++
 src/test/perl/PostgreSQL/Test/Utils.pm      |  48 +++++++
 src/test/perl/meson.build                   |   1 +
 src/test/recovery/t/027_stream_regress.pl   |  14 +-
 7 files changed, 315 insertions(+), 28 deletions(-)
 create mode 100644 src/test/perl/PostgreSQL/Test/AdjustDump.pm

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index f4cef9e80f7..4be5d2d7d52 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -336,6 +336,17 @@ make check-world PG_TEST_EXTRA='kerberos ldap ssl load_balance libpq_encryption'
       </para>
      </listitem>
     </varlistentry>
+
+    <varlistentry>
+     <term><literal>regress_dump_test</literal></term>
+     <listitem>
+      <para>
+       When enabled, <filename>src/bin/pg_upgrade/t/002_pg_upgrade.pl</filename>
+       tests dump and restore of regression database left behind by the
+       regression run. Not enabled by default because it is time consuming.
+      </para>
+     </listitem>
+    </varlistentry>
    </variablelist>
 
    Tests for features that are not supported by the current build
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 82a82a1841a..42b68527146 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -6,13 +6,13 @@ use warnings FATAL => 'all';
 
 use Cwd            qw(abs_path);
 use File::Basename qw(dirname);
-use File::Compare;
-use File::Find     qw(find);
-use File::Path     qw(rmtree);
+use File::Find qw(find);
+use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::AdjustUpgrade;
+use PostgreSQL::Test::AdjustDump;
 use Test::More;
 
 # Can be changed to test the other modes.
@@ -36,9 +36,9 @@ sub generate_db
 		"created database with ASCII characters from $from_char to $to_char");
 }
 
-# Filter the contents of a dump before its use in a content comparison.
-# This returns the path to the filtered dump.
-sub filter_dump
+# Filter the contents of a dump before its use in a content comparison for
+# upgrade testing. This returns the path to the filtered dump.
+sub filter_dump_for_upgrade
 {
 	my ($is_old, $old_version, $dump_file) = @_;
 	my $dump_contents = slurp_file($dump_file);
@@ -262,6 +262,20 @@ else
 		}
 	}
 	is($rc, 0, 'regression tests pass');
+
+	# Test dump/restore of the objects left behind by regression. Ideally it
+	# should be done in a separate TAP test, but doing it here saves us one full
+	# regression run.
+	#
+	# This step takes several extra seconds. Do it only when requested so as to
+	# avoid spending those extra seconds in every check-world run.
+	#
+	# Do this while the old cluster is running before the upgrade.
+	if (   $ENV{PG_TEST_EXTRA}
+		&& $ENV{PG_TEST_EXTRA} =~ /\bregress_dump_test\b/)
+	{
+		test_regression_dump_restore($oldnode, %node_params);
+	}
 }
 
 # Initialize a new node for the upgrade.
@@ -511,24 +525,115 @@ push(@dump_command, '--extra-float-digits', '0')
 $newnode->command_ok(\@dump_command, 'dump after running pg_upgrade');
 
 # Filter the contents of the dumps.
-my $dump1_filtered = filter_dump(1, $oldnode->pg_version, $dump1_file);
-my $dump2_filtered = filter_dump(0, $oldnode->pg_version, $dump2_file);
+my $dump1_filtered =
+  filter_dump_for_upgrade(1, $oldnode->pg_version, $dump1_file);
+my $dump2_filtered =
+  filter_dump_for_upgrade(0, $oldnode->pg_version, $dump2_file);
 
 # Compare the two dumps, there should be no differences.
-my $compare_res = compare($dump1_filtered, $dump2_filtered);
-is($compare_res, 0, 'old and new dumps match after pg_upgrade');
+compare_dumps($dump1_filtered, $dump2_filtered,
+	'old and new dumps match after pg_upgrade');
+
+# Test dump and restore of objects left behind by the regression run.
+#
+# It is expected that regression tests, which create `regression` database, are
+# run on `src_node`, which in turn is left in running state. The dump is
+# restored on a fresh node created using given `node_params`. Plain dumps from
+# both the nodes are compared to make sure that all the dumped objects are
+# restored faithfully.
+sub test_regression_dump_restore
+{
+	my ($src_node, %node_params) = @_;
+	my $dst_node = PostgreSQL::Test::Cluster->new('dst_node');
+
+	# Dump the original database for comparison later.
+	my $src_dump = get_dump_for_comparison($src_node->connstr('regression'),
+		'src_dump', 1);
+
+	# Setup destination database
+	$dst_node->init(%node_params);
+	$dst_node->start;
+
+	for my $format ('plain', 'tar', 'directory', 'custom')
+	{
+		my $dump_file = "$tempdir/regression_dump.$format";
+		my $format_spec = substr($format, 0, 1);
+		my $restored_db = 'regression_' . $format;
+
+		# Even though we compare only schema from the original and the restored
+		# database (See get_dump_for_comparison() for details.), we dump and
+		# restore data as well to catch any errors while doing so.
+		command_ok(
+			[
+				'pg_dump', "-F$format_spec", '--no-sync',
+				'-d', $src_node->connstr('regression'),
+				'-f', $dump_file
+			],
+			"pg_dump on source instance in $format format");
+
+		$dst_node->command_ok([ 'createdb', $restored_db ],
+			"created destination database '$restored_db'");
+
+		# Restore into destination database.
+		my @restore_command;
+		if ($format eq 'plain')
+		{
+			# Restore dump in "plain" format using `psql`.
+			@restore_command = [
+				'psql', '-d', $dst_node->connstr($restored_db),
+				'-f', $dump_file
+			];
+		}
+		else
+		{
+			@restore_command = [
+				'pg_restore', '-d',
+				$dst_node->connstr($restored_db), $dump_file
+			];
+		}
+		command_ok(@restore_command,
+			"restore dump taken in $format format on destination instance");
+
+		# Dump restored database for comparison
+		my $dst_dump =
+		  get_dump_for_comparison($dst_node->connstr($restored_db),
+			'dest_dump.' . $format, 0);
+
+		compare_dumps($src_dump, $dst_dump,
+			"dump outputs of original and restored regression database, using $format format match"
+		);
+	}
+}
 
-# Provide more context if the dumps do not match.
-if ($compare_res != 0)
+# Dump database pointed by given connection string in plain format and adjust it
+# to compare dumps from original and restored database.
+#
+# file_prefix is used to create unique names for all dump files, so that they
+# remain available for debugging in case the test fails.
+#
+# The name of the file containing adjusted dump is returned.
+sub get_dump_for_comparison
 {
-	my ($stdout, $stderr) =
-	  run_command([ 'diff', '-u', $dump1_filtered, $dump2_filtered ]);
-	print "=== diff of $dump1_filtered and $dump2_filtered\n";
-	print "=== stdout ===\n";
-	print $stdout;
-	print "=== stderr ===\n";
-	print $stderr;
-	print "=== EOF ===\n";
+	my ($connstr, $file_prefix, $adjust_child_columns) = @_;
+
+	my $dumpfile = $tempdir . '/' . $file_prefix . '.sql';
+	my $dump_adjusted = "${dumpfile}_adjusted";
+
+
+	# The order of columns in COPY statements dumped from the original database
+	# and that from the restored database differs. These differences are hard to
+	# adjust. Hence we compare only schema dumps for now.
+	command_ok(
+		[ 'pg_dump', '-s', '--no-sync', '-d', $connstr, '-f', $dumpfile ],
+		'dump for comparison succeeded');
+
+	open(my $dh, '>', $dump_adjusted)
+	  || die "opening $dump_adjusted ";
+	print $dh adjust_regress_dumpfile(slurp_file($dumpfile),
+		$adjust_child_columns);
+	close($dh);
+
+	return $dump_adjusted;
 }
 
 done_testing();
diff --git a/src/test/perl/Makefile b/src/test/perl/Makefile
index c02f18454e3..91235204c7a 100644
--- a/src/test/perl/Makefile
+++ b/src/test/perl/Makefile
@@ -26,6 +26,7 @@ install: all installdirs
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/Cluster.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/BackgroundPsql.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustUpgrade.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Test/AdjustDump.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	$(INSTALL_DATA) $(srcdir)/PostgreSQL/Version.pm '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 uninstall:
@@ -36,6 +37,7 @@ uninstall:
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/Cluster.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/BackgroundPsql.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustUpgrade.pm'
+	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Test/AdjustDump.pm'
 	rm -f '$(DESTDIR)$(pgxsdir)/$(subdir)/PostgreSQL/Version.pm'
 
 endif
diff --git a/src/test/perl/PostgreSQL/Test/AdjustDump.pm b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
new file mode 100644
index 00000000000..0b0abb0cefc
--- /dev/null
+++ b/src/test/perl/PostgreSQL/Test/AdjustDump.pm
@@ -0,0 +1,122 @@
+
+# Copyright (c) 2024-2025, PostgreSQL Global Development Group
+
+=pod
+
+=head1 NAME
+
+PostgreSQL::Test::AdjustDump - helper module for dump and restore tests
+
+=head1 SYNOPSIS
+
+  use PostgreSQL::Test::AdjustDump;
+
+  # Adjust contents of dump output file so that dump output from original
+  # regression database and that from the restored regression database match
+  $dump = adjust_regress_dumpfile($dump, $original);
+
+=head1 DESCRIPTION
+
+C<PostgreSQL::Test::AdjustDump> encapsulates various hacks needed to
+compare the results of dump and restore tests
+
+=cut
+
+package PostgreSQL::Test::AdjustDump;
+
+use strict;
+use warnings FATAL => 'all';
+
+use Exporter 'import';
+use Test::More;
+
+our @EXPORT = qw(
+  adjust_regress_dumpfile
+);
+
+=pod
+
+=head1 ROUTINES
+
+=over
+
+=item $dump = adjust_regress_dumpfile($dump, $original)
+
+If we take dump of the regression database left behind after running regression
+tests, restore the dump, and take dump of the restored regression database, the
+outputs of both the dumps differ. Some regression tests purposefully create
+some child tables in such a way that their column orders differ from column
+orders of their respective parents. In the restored database, however, their
+column orders are same as that of their respective parents. Thus the column
+orders of these child tables in the original database and those in the restored
+database differ, causing difference in the dump outputs. See MergeAttributes()
+and dumpTableSchema() for details.
+
+This routine rearranges the column declarations in these C<CREATE TABLE ... INHERITS>
+statements in the dump file from original database to match that from the
+restored database.
+
+Additionally it adjusts blank and new lines to avoid noise.
+
+Arguments:
+
+=over
+
+=item C<dump>: Contents of dump file
+
+=item C<adjust_child_columns>: 1 indicates that the given dump file requires
+adjusting columns in the child tables; usually when the dump is from original
+database. 0 indicates no such adjustment is needed; usually when the dump is
+from restored database.
+
+=back
+
+Returns the adjusted dump text.
+
+=cut
+
+sub adjust_regress_dumpfile
+{
+	my ($dump, $adjust_child_columns) = @_;
+
+	# use Unix newlines
+	$dump =~ s/\r\n/\n/g;
+	# Suppress blank lines, as some places in pg_dump emit more or fewer.
+	$dump =~ s/\n\n+/\n/g;
+
+	# Adjust the CREATE TABLE ... INHERITS statements.
+	if ($adjust_child_columns)
+	{
+		my $saved_dump = $dump;
+
+		$dump =~ s/(^CREATE\sTABLE\sgenerated_stored_tests\.gtestxx_4\s\()
+		(\n\s+b\sinteger),
+		(\n\s+a\sinteger\sNOT\sNULL)/$1$3,$2/mgx;
+
+		ok($saved_dump ne $dump, 'applied gtestxx_4 adjustments');
+
+		$dump =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c1\s\()
+		(\n\s+int_four\sbigint),
+		(\n\s+int_eight\sbigint),
+		(\n\s+int_two\ssmallint)/$1$4,$2,$3/mgx;
+
+		ok($saved_dump ne $dump, 'applied test_type_diff2_c1 adjustments');
+
+		$dump =~ s/(CREATE\sTABLE\spublic\.test_type_diff2_c2\s\()
+		(\n\s+int_eight\sbigint),
+		(\n\s+int_two\ssmallint),
+		(\n\s+int_four\sbigint)/$1$3,$4,$2/mgx;
+
+		ok($saved_dump ne $dump, 'applied test_type_diff2_c2 adjustments');
+	}
+
+	return $dump;
+}
+
+=pod
+
+=back
+
+=cut
+
+1;
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 022b44ba22b..6efe5faf77d 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -50,6 +50,7 @@ use Cwd;
 use Exporter 'import';
 use Fcntl qw(:mode :seek);
 use File::Basename;
+use File::Compare;
 use File::Find;
 use File::Spec;
 use File::stat qw(stat);
@@ -89,6 +90,8 @@ our @EXPORT = qw(
   command_fails_like
   command_checks_all
 
+  compare_dumps
+
   $windows_os
   $is_msys2
   $use_unix_sockets
@@ -1081,6 +1084,51 @@ sub command_checks_all
 
 =pod
 
+=item compare_dumps(dump1, dump2, testname)
+
+Test that the given two files match. The files usually contain pg_dump output in
+"plain" format. Output the difference if any.
+
+=over
+
+=item C<dump1> and C<dump2>: Dump files to compare
+
+=item C<testname>: test name
+
+=back
+
+=cut
+
+sub compare_dumps
+{
+	my ($dump1, $dump2, $testname) = @_;
+
+	my $compare_res = compare($dump1, $dump2);
+	is($compare_res, 0, $testname);
+
+	# Provide more context
+	if ($compare_res != 0)
+	{
+		my ($stdout, $stderr) =
+		  run_command([ 'diff', '-u', $dump1, $dump2 ]);
+		print "=== diff of $dump1 and $dump2\n";
+		print "=== stdout ===\n";
+		print $stdout;
+		print "=== stderr ===\n";
+		print $stderr;
+		print "=== EOF ===\n";
+	}
+	else
+	{
+		note('first dump file: ' . $dump1);
+		note('second dump file: ' . $dump2);
+	}
+
+	return;
+}
+
+=pod
+
 =back
 
 =cut
diff --git a/src/test/perl/meson.build b/src/test/perl/meson.build
index fc9cf971ea3..3a98ac49daa 100644
--- a/src/test/perl/meson.build
+++ b/src/test/perl/meson.build
@@ -14,4 +14,5 @@ install_data(
   'PostgreSQL/Test/Cluster.pm',
   'PostgreSQL/Test/BackgroundPsql.pm',
   'PostgreSQL/Test/AdjustUpgrade.pm',
+  'PostgreSQL/Test/AdjustDump.pm',
   install_dir: dir_pgxs / 'src/test/perl/PostgreSQL/Test')
diff --git a/src/test/recovery/t/027_stream_regress.pl b/src/test/recovery/t/027_stream_regress.pl
index d1ae32d97d6..b5ea1356751 100644
--- a/src/test/recovery/t/027_stream_regress.pl
+++ b/src/test/recovery/t/027_stream_regress.pl
@@ -116,8 +116,9 @@ command_ok(
 		'--no-sync', '-p', $node_standby_1->port
 	],
 	'dump standby server');
-command_ok(
-	[ 'diff', $outputdir . '/primary.dump', $outputdir . '/standby.dump' ],
+compare_dumps(
+	$outputdir . '/primary.dump',
+	$outputdir . '/standby.dump',
 	'compare primary and standby dumps');
 
 # Likewise for the catalogs of the regression database, after disabling
@@ -146,12 +147,9 @@ command_ok(
 		'regression'
 	],
 	'dump catalogs of standby server');
-command_ok(
-	[
-		'diff',
-		$outputdir . '/catalogs_primary.dump',
-		$outputdir . '/catalogs_standby.dump'
-	],
+compare_dumps(
+	$outputdir . '/catalogs_primary.dump',
+	$outputdir . '/catalogs_standby.dump',
 	'compare primary and standby catalog dumps');
 
 # Check some data from pg_stat_statements.
-- 
2.34.1
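
For readers who want to see what the column adjustment in adjust_regress_dumpfile() does to the dump text, here is a minimal standalone sketch using one of the substitutions from the patch. The sample CREATE TABLE text is an assumption about what the pg_dump output looks like for that table, and reorder_c1_columns is a hypothetical helper, not something the patch adds.

```perl
use strict;
use warnings;

# Sample text shaped like a plain schema dump. The table name comes from the
# patch; the exact text and the INHERITS clause are assumed for illustration.
my $dump = <<'EOS';
CREATE TABLE public.test_type_diff2_c1 (
    int_four bigint,
    int_eight bigint,
    int_two smallint
)
INHERITS (public.test_type_diff2);
EOS

# Hypothetical helper: apply the test_type_diff2_c1 substitution and report
# whether it changed the text. The "before" copy is taken on every call, so
# the flag reflects this substitution only.
sub reorder_c1_columns
{
	my ($text) = @_;
	my $before = $text;

	# With /x the line breaks and indentation inside the pattern are
	# ignored; the commas are literal and separate the column lines.
	$text =~ s/(^CREATE\sTABLE\spublic\.test_type_diff2_c1\s\()
		(\n\s+int_four\sbigint),
		(\n\s+int_eight\sbigint),
		(\n\s+int_two\ssmallint)/$1$4,$2,$3/mgx;

	return ($text, $before ne $text);
}

my ($adjusted, $changed) = reorder_c1_columns($dump);
print $adjusted;
print "changed: ", ($changed ? "yes" : "no"), "\n";
```

One design detail may be worth a look in review. In the patch, $saved_dump is copied once before the three substitutions, so after the first substitution changes the text, the second and third ok() checks pass whether or not their own patterns matched. Re-taking the copy before each substitution, as the helper above does, is one way to make each check independent.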