Hi,

On Tue, Apr 03, 2018 at 08:48:08PM +0200, Magnus Hagander wrote:
> On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > I'd bet a good lunch that nondefault BLCKSZ would break it, as well,
> > since the way in which the corruption is induced is just guessing
> > as to where page boundaries are.
> 
> Yeah, that might be a problem. Those should be calculated from the block
> size.
> 
> > Also, scribbling on tables as sensitive as pg_class is just asking for
> > trouble IMO.  I don't see anything in this test, for example, that
> > prevents autovacuum from running and causing a PANIC before the test
> > can complete.  Even with AV off, there's a good chance that clobber-
> > cache-always animals will fall over because they do so many more
> > physical accesses to the system catalogs.  I'd suggest inducing the
> > corruption in some user table(s) that we can more tightly constrain
> > the source server's accesses to.
> 
> Yeah, that seems like a good idea. And probably also shut the server down
> while writing the corruption, just in case.
> 
> Will stick looking into that on my todo for when I'm back, unless beaten to
> it. Michael, you want a stab at it?

Attached is a patch which hopefully does that:

1. creates two user tables, one large enough for at least 6 blocks
(around 360kb), the other just one block.

2. stops the cluster before scribbling over its data and starts it
afterwards.

3. uses the block size (and the page header size) to determine offsets
for scribbling.
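
To illustrate point 3, here is a minimal standalone sketch of the offset
calculation. It is not part of the patch: the helper name
corruption_offsets is made up for illustration, and the 24-byte page
header is simply the value hard-coded in the patch.

```perl
use strict;
use warnings;

# Page header size in bytes; 24 is the value hard-coded in the patch.
my $pageheader_size = 24;

# Hypothetical helper: return the byte offset just past the page header
# of each of the first $count blocks, for a block size given in bytes.
sub corruption_offsets
{
	my ($block_size, $count) = @_;
	return map { $pageheader_size + $_ * $block_size } 0 .. $count - 1;
}

# With the default 8 kB block size this yields one offset per block.
my @offsets = corruption_offsets(8192, 6);
print join(', ', @offsets), "\n";
# prints: 24, 8216, 16408, 24600, 32792, 40984
```

Because every offset is derived from the block size, the same code hits
the start of the tuple area of each block for any BLCKSZ, as long as the
file really contains that many blocks.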

I've tested it with block sizes of 8 kB and 32 kB now; the latter should
make sure that the first table is indeed large enough, but maybe something
less arbitrary than "10000 integers" should be used?

Anyway, sorry for the hassle.


Michael

-- 
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 3162cdcd01..3fe49f68a7 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -406,19 +406,25 @@ like(
 my $checksum = $node->safe_psql('postgres', 'SHOW data_checksums;');
 is($checksum, 'on', 'checksums are enabled');
 
-# get relfilenodes of relations to corrupt
-my $pg_class = $node->safe_psql('postgres',
-	q{SELECT pg_relation_filepath('pg_class')}
+# create tables to corrupt and get their relfilenodes
+my $file_corrupt1 = $node->safe_psql('postgres',
+        q{SELECT a INTO corrupt1 FROM generate_series(1,10000) AS a; SELECT pg_relation_filepath('corrupt1')}
 );
-my $pg_index = $node->safe_psql('postgres',
-	q{SELECT pg_relation_filepath('pg_index')}
+my $file_corrupt2 = $node->safe_psql('postgres',
+        q{SELECT b INTO corrupt2 FROM generate_series(1,2) AS b; SELECT pg_relation_filepath('corrupt2')}
 );
 
+# set page header and block sizes
+my $pageheader_size = 24;
+my $block_size = $node->safe_psql('postgres', 'SHOW block_size;');
+
 # induce corruption
-open $file, '+<', "$pgdata/$pg_class";
-seek($file, 4000, 0);
+system_or_bail 'pg_ctl', '-D', $pgdata, 'stop';
+open $file, '+<', "$pgdata/$file_corrupt1";
+seek($file, $pageheader_size, 0);
 syswrite($file, '\0\0\0\0\0\0\0\0\0');
 close $file;
+system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 
 $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt"],
 	1,
@@ -428,13 +434,15 @@ $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt"],
 );
 
 # induce further corruption in 5 more blocks
-open $file, '+<', "$pgdata/$pg_class";
-my @offsets = (12192, 20384, 28576, 36768, 44960);
-foreach my $offset (@offsets) {
+system_or_bail 'pg_ctl', '-D', $pgdata, 'stop';
+open $file, '+<', "$pgdata/$file_corrupt1";
+for my $i ( 1..5 ) {
+  my $offset = $pageheader_size + $i * $block_size;
   seek($file, $offset, 0);
   syswrite($file, '\0\0\0\0\0\0\0\0\0');
 }
 close $file;
+system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 
 $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt2"],
         1,
@@ -444,10 +452,12 @@ $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt2"],
 );
 
 # induce corruption in a second file
-open $file, '+<', "$pgdata/$pg_index";
+system_or_bail 'pg_ctl', '-D', $pgdata, 'stop';
+open $file, '+<', "$pgdata/$file_corrupt2";
 seek($file, 4000, 0);
 syswrite($file, '\0\0\0\0\0\0\0\0\0');
 close $file;
+system_or_bail 'pg_ctl', '-D', $pgdata, 'start';
 
 $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt3"],
         1,
@@ -460,3 +470,6 @@ $node->command_checks_all([ 'pg_basebackup', '-D', "$tempdir/backup_corrupt3"],
 $node->command_ok(
 	[   'pg_basebackup', '-D', "$tempdir/backup_corrupt4", '-k' ],
 	'pg_basebackup with -k does not report checksum mismatch');
+
+$node->safe_psql('postgres', "DROP TABLE corrupt1;");
+$node->safe_psql('postgres', "DROP TABLE corrupt2;");
