On 03/26/18 12:34, Tom Lane wrote: > If that's the argument, why is the WALInsertLockUpdateInsertingAt(CurrPos) > call still there? GetXLogBuffer() would do that too.
"Because I hadn't noticed that," he said, sheepishly. > In any case, the new comment ... fails to > explain what's going on, and the reference to a function that is not > actually called from the vicinity of the comment ... > suggest something like "GetXLogBuffer() will fetch and initialize the > next WAL page for us. ... worth explaining how you know that the new > page's header is short not long. Here are patches responding to that (and also fixing the unintended inclusion of .travis.yml). What I have not done here is respond to Michael's objection, which I haven't had time to think more about yet. -Chap
>From 3606cccfd584654970f56e909798d0d163fa7e96 Mon Sep 17 00:00:00 2001 From: Chapman Flack <c...@anastigmatix.net> Date: Mon, 26 Mar 2018 23:08:26 -0400 Subject: [PATCH 1/2] Zero headers of unused pages after WAL switch. When writing zeroed pages to the remainder of a WAL segment after a WAL switch, ensure that the headers of those pages are also zeroed, as their initialized values otherwise reduce the compressibility of the WAL segment file by general tools. --- src/backend/access/transam/xlog.c | 45 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 42 insertions(+), 3 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index cb9c2a2..cf4eaa1 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1552,11 +1552,50 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata, /* Use up all the remaining space on the first page */ CurrPos += freespace; + /* + * Cause all remaining pages in the segment to be flushed, leaving the + * XLog position where it should be at the start of the next segment. In + * the process, the pages will be initialized and physically written to + * the file. That costs a bit of I/O (compared to simply leaving the + * rest of the file unwritten, as was once done), but has an advantage + * that the tail of the file will contain predictable (ideally constant) + * data, so that general-purpose compression tools perform well on it. + * (If left unwritten, the tail contains whatever is left over from the + * file's last use as an earlier segment, and may compress poorly.) The + * I/O cost is of little concern because any period when WAL segments + * are being switched out (for, e.g., checkpoint timeout reasons) before + * they are filled is clearly a low-workload period. + */ while (CurrPos < EndPos) { - /* initialize the next page (if not initialized already) */ - WALInsertLockUpdateInsertingAt(CurrPos); - AdvanceXLInsertBuffer(CurrPos, false); + /* + * This loop only touches full pages that follow the last actually- + * used data in the segment. It will never touch the first page of a + * segment; we would not be here to switch out a segment to which + * nothing at all had been written. + * + * The minimal actions to flush the page would be to call + * WALInsertLockUpdateInsertingAt(CurrPos) followed by + * AdvanceXLInsertBuffer(...). The page would be left initialized + * mostly to zeros, except for the page header (always the short + * variant, as this is never a segment's first page). + * + * The large vistas of zeros are good for compressibility, but + * the short headers interrupting them every XLOG_BLCKSZ (and, + * worse, with values that differ from page to page) are not. The + * effect varies with compression tool, but bzip2 (which compressed + * best among several common tools on scantly-filled WAL segments) + * compresses about an order of magnitude worse if those headers + * are left in place. + * + * Rather than complicating AdvanceXLInsertBuffer itself (which is + * called in heavily-loaded circumstances as well as this lightly- + * loaded one) with variant behavior, we just use GetXLogBuffer + * (which itself calls the two methods we need) to get the pointer + * and re-zero the short header. + */ + currpos = GetXLogBuffer(CurrPos); + MemSet(currpos, 0, SizeOfXLogShortPHD); CurrPos += XLOG_BLCKSZ; } } -- 2.7.3
>From f9549ec41d84e60964887d4784abe4562b0bda0f Mon Sep 17 00:00:00 2001 From: Chapman Flack <c...@anastigmatix.net> Date: Mon, 26 Mar 2018 23:09:04 -0400 Subject: [PATCH 2/2] Add test for ensuring WAL segment is zeroed out Switch to a new WAL file and ensure the last block in the now switched-away-from file is completely zeroed out. This adds a mode to check_pg_config() where the match from the regexp can be returned, not just the literal count of matches. Only one match is returned (the first), but given the structure of pg_config.h that's likely to be enough. --- src/test/perl/TestLib.pm | 18 ++++++++++++++++-- src/test/recovery/t/002_archiving.pl | 19 ++++++++++++++++++- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm index b686268..0f28bc8 100644 --- a/src/test/perl/TestLib.pm +++ b/src/test/perl/TestLib.pm @@ -245,15 +245,29 @@ sub append_to_file # that. sub check_pg_config { - my ($regexp) = @_; + my ($regexp, $value) = @_; my ($stdout, $stderr); + my $match; my $result = IPC::Run::run [ 'pg_config', '--includedir' ], '>', \$stdout, '2>', \$stderr or die "could not execute pg_config"; chomp($stdout); open my $pg_config_h, '<', "$stdout/pg_config.h" or die "$!"; - my $match = (grep {/^$regexp/} <$pg_config_h>); + if (defined $value) + { + while (<$pg_config_h>) + { + $_ =~ m/^$regexp/; + next unless defined $1; + $match = $1; + last; + } + } + else + { + $match = (grep {/^$regexp/} <$pg_config_h>); + } close $pg_config_h; return $match; } diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl index e1bd3c9..9551db1 100644 --- a/src/test/recovery/t/002_archiving.pl +++ b/src/test/recovery/t/002_archiving.pl @@ -3,7 +3,7 @@ use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 1; +use Test::More tests => 2; use File::Copy; # Initialize master node, doing archives @@ -49,3 +49,20 @@ $node_standby->poll_query_until('postgres', $caughtup_query) my $result = $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int"); is($result, qq(1000), 'check content from archives'); + +# Add some content, switch WAL file and read the last segment of the WAL +# file which should be zeroed out +my $current_file = $node_master->safe_psql('postgres', + "SELECT pg_walfile_name(pg_current_wal_lsn())"); +$node_master->safe_psql('postgres', "INSERT INTO tab_int VALUES (1)"); +$node_master->safe_psql('postgres', "SELECT pg_switch_wal()"); + +my $xlog_blcksz = check_pg_config('#define XLOG_BLCKSZ (\d+)', 1); + +open my $fh, '<:raw', $node_master->data_dir . '/pg_wal/' . $current_file; +# SEEK_END is defined as 2, but is not allowed when using strict mode +seek $fh, ($xlog_blcksz * -1), 2; +my $len = read($fh, my $bytes, $xlog_blcksz); +close $fh; + +ok($bytes =~ /\A\x00*+\z/, 'make sure wal segment is zeroed'); -- 2.7.3