The attached proposed patch changes the TAP test infrastructure's
poll_query_until function in two ways:

1. An optional argument is added to allow specifying the query result
value we're waiting for, overriding the normal "t".  This allows
folding a handwritten delay loop in 007_sync_rep.pl into the
poll_query_until ecosystem.  As far as I've found, there are no other
handwritten delay loops in the TAP tests.

2. poll_query_until is modified to probe 10X per second not once
per second, in keeping with the changes I've been making elsewhere
to remove not-carefully-analyzed 1s delays in the regression tests.

On my workstation, the reduced probe delay shaves a useful amount
of time off the recovery and subscription regression tests.  I also
tried it on dromedary, which is about the slowest hardware I'd care
to run the TAP tests on regularly, and it seems to be about a wash
there --- some tests get faster, but some get slower, presumably due
to more overhead from the probe queries.

I notice that buildfarm member skink (which runs with valgrind)
recently failed a test run due to poll_query_until timing out:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2017-06-30%2000%3A50%3A01

I'm inclined to respond to that either by increasing the fixed
180-second timeout, or by making it configurable from an environment
variable (which Andres would then have to add to skink's configuration).

Thoughts?

                        regards, tom lane

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index f4fa600..4346423 100644
*** a/src/test/perl/PostgresNode.pm
--- b/src/test/perl/PostgresNode.pm
*************** sub psql
*** 1213,1248 ****
  
  =pod
  
! =item $node->poll_query_until(dbname, query)
  
! Run a query once a second, until it returns 't' (i.e. SQL boolean true).
! Continues polling if psql returns an error result. Times out after 180 seconds.
  
  =cut
  
  sub poll_query_until
  {
! 	my ($self, $dbname, $query) = @_;
  
! 	my $max_attempts = 180;
! 	my $attempts     = 0;
  	my ($stdout, $stderr);
  
  	while ($attempts < $max_attempts)
  	{
- 		my $cmd =
- 		  [ 'psql', '-XAt', '-c', $query, '-d', $self->connstr($dbname) ];
  		my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
  
  		chomp($stdout);
  		$stdout =~ s/\r//g if $TestLib::windows_os;
! 		if ($stdout eq "t")
  		{
  			return 1;
  		}
  
! 		# Wait a second before retrying.
! 		sleep 1;
  		$attempts++;
  	}
  
--- 1213,1255 ----
  
  =pod
  
! =item $node->poll_query_until($dbname, $query [, $expected ])
  
! Run B<$query> repeatedly, until it returns the B<$expected> result
! ('t', or SQL boolean true, by default).
! Continues polling if B<psql> returns an error result.
! Times out after 180 seconds.
! Returns 1 if successful, 0 if timed out.
  
  =cut
  
  sub poll_query_until
  {
! 	my ($self, $dbname, $query, $expected) = @_;
  
! 	$expected = 't' unless defined($expected);	# default value
! 
! 	my $cmd =
! 		[ 'psql', '-XAt', '-c', $query, '-d', $self->connstr($dbname) ];
  	my ($stdout, $stderr);
+ 	my $max_attempts = 180 * 10;
+ 	my $attempts     = 0;
  
  	while ($attempts < $max_attempts)
  	{
  		my $result = IPC::Run::run $cmd, '>', \$stdout, '2>', \$stderr;
  
  		chomp($stdout);
  		$stdout =~ s/\r//g if $TestLib::windows_os;
! 
! 		if ($stdout eq $expected)
  		{
  			return 1;
  		}
  
! 		# Wait 0.1 second before retrying.
! 		select undef, undef, undef, 0.1;
! 
  		$attempts++;
  	}
  
diff --git a/src/test/recovery/t/007_sync_rep.pl b/src/test/recovery/t/007_sync_rep.pl
index 8e3cc5e..0f999f0 100644
*** a/src/test/recovery/t/007_sync_rep.pl
--- b/src/test/recovery/t/007_sync_rep.pl
*************** use Test::More tests => 11;
*** 9,15 ****
  my $check_sql =
  "SELECT application_name, sync_priority, sync_state FROM pg_stat_replication ORDER BY application_name;";
  
! # Check that sync_state of each standby is expected.
  # If $setting is given, synchronous_standby_names is set to it and
  # the configuration file is reloaded before the test.
  sub test_sync_state
--- 9,15 ----
  my $check_sql =
  "SELECT application_name, sync_priority, sync_state FROM pg_stat_replication ORDER BY application_name;";
  
! # Check that sync_state of each standby is expected (waiting till it is).
  # If $setting is given, synchronous_standby_names is set to it and
  # the configuration file is reloaded before the test.
  sub test_sync_state
*************** sub test_sync_state
*** 23,46 ****
  		$self->reload;
  	}
  
! 	my $timeout_max = 30;
! 	my $timeout     = 0;
! 	my $result;
! 
! 	# A reload may take some time to take effect on busy machines,
! 	# hence use a loop with a timeout to give some room for the test
! 	# to pass.
! 	while ($timeout < $timeout_max)
! 	{
! 		$result = $self->safe_psql('postgres', $check_sql);
! 
! 		last if ($result eq $expected);
! 
! 		$timeout++;
! 		sleep 1;
! 	}
! 
! 	is($result, $expected, $msg);
  }
  
  # Initialize master node
--- 23,29 ----
  		$self->reload;
  	}
  
! 	ok( $self->poll_query_until('postgres', $check_sql, $expected), $msg);
  }
  
  # Initialize master node
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to