While trying to make sense of some recent buildfarm failures, I happened to notice that the default query issued by the TAP sub wait_for_catchup looks like
SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = '<whatever>'; ISTM there are two things wrong with this: 1. Since pg_current_wal_lsn() is re-evaluated each time, we're effectively setting a moving target for the standby to reach. Admittedly we're not going to be issuing any new DML while waiting in wait_for_catchup, but background activity such as autovacuum could be creating new WAL. Thus, the test is likely to wait longer than it needs to. In the worst case, we'd never catch up until the primary server has been totally quiescent for awhile. 2. Aside from being slower than necessary, this also makes the test squishy and formally incorrect, because the standby might get the opportunity to replay more WAL than the test intends. So I think we need to fix it to capture the target WAL position at the start, as I've done in the attached patch. In principle this might make things a bit slower because of the extra transaction required, but I don't notice any above-the-noise difference on my own workstation. Another thing that is bothering me a bit is that a number of the callers use $node->lsn('insert') as the target. This also seems rather dubious, because that could be ahead of what's been written out. These callers are just taking it on faith that something will eventually cause that extra WAL to get written out (and become available to the standby). Again, that seems to make the test slower than it need be, with a worst-case scenario being that it eventually times out. Admittedly this is unlikely to be a big problem unless some background op issues an abortive transaction at just the wrong time. Nonetheless, I wonder if we shouldn't standardize on "thou shalt use the write position", because I don't think the other alternatives have anything to recommend them. I've not addressed that below, though I did tweak the comment about that parameter. Thoughts? regards, tom lane
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index e18f27276c..fb6cc39db4 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -2509,8 +2509,10 @@ poll_query_until timeout. Requires that the 'postgres' db exists and is accessible. -target_lsn may be any arbitrary lsn, but is typically $primary_node->lsn('insert'). -If omitted, pg_current_wal_lsn() is used. +The default value of target_lsn is $node->lsn('write'), which ensures +that the standby has caught up to what has been committed on the primary. +Another plausible choice is $node->lsn('insert'), which ensures that +preceding executed-but-not-committed WAL has been replayed as well. This is not a test. It die()s on failure. @@ -2531,23 +2533,18 @@ sub wait_for_catchup { $standby_name = $standby_name->name; } - my $lsn_expr; - if (defined($target_lsn)) + if (!defined($target_lsn)) { - $lsn_expr = "'$target_lsn'"; - } - else - { - $lsn_expr = 'pg_current_wal_lsn()'; + $target_lsn = $self->lsn('write'); } print "Waiting for replication conn " . $standby_name . "'s " . $mode . "_lsn to pass " - . $lsn_expr . " on " + . $target_lsn . " on " . $self->name . "\n"; my $query = - qq[SELECT $lsn_expr <= ${mode}_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';]; + qq[SELECT '$target_lsn' <= ${mode}_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';]; $self->poll_query_until('postgres', $query) or croak "timed out waiting for catchup"; print "done\n";