On 2022-10-17 Mo 10:59, Andrew Dunstan wrote: > On 2022-10-04 Tu 01:39, Andrew Dunstan wrote: >> On 2022-10-02 Su 12:49, Andres Freund wrote: >>> 2) Use a lockfile containing a pid to protect the choice of a port within a >>> build directory. Before accepting a port get_free_port() would check if >>> the >>> a lockfile exists for the port and if so, if the test using it is still >>> alive. That will protect against racyness between multiple tests inside >>> a >>> build directory, but won't protect against races between multiple builds >>> running concurrently on a machine (e.g. on a buildfarm host) >>> >>> >> I think this is the right solution. To deal with the last issue, the >> lockdir should be overrideable, like this: >> >> >> my $port_lockdir = $ENV{PG_PORT_LOCKDIR} || $build_dir; >> >> >> Buildfarm animals could set this, probably to the global lockdir (see >> run_branches.pl). Prior to that, buildfarm owners could do that manually. >> >> > The problem here is that Cluster.pm doesn't have any idea where the > build directory is, or even if there is one present at all. > > meson does appear to let us know that, however, with MESON_BUILD_ROOT, > so probably the best thing would be to use PG_PORT_LOCKDIR if it's set, > otherwise MESON_BUILD_ROOT if it's set, otherwise the tmp_check > directory. If we want to backport to the make system we could export > top_builddir somewhere. > >
Here's a patch which I think does the right thing. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index d80134b26f..2594698f1f 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -93,9 +93,9 @@ use warnings; use Carp; use Config; -use Fcntl qw(:mode); +use Fcntl qw(:mode :flock :seek O_CREAT O_RDWR); use File::Basename; -use File::Path qw(rmtree); +use File::Path qw(rmtree mkpath); use File::Spec; use File::stat qw(stat); use File::Temp (); @@ -109,7 +109,7 @@ use Time::HiRes qw(usleep); use Scalar::Util qw(blessed); our ($use_tcp, $test_localhost, $test_pghost, $last_host_assigned, - $last_port_assigned, @all_nodes, $died); + $last_port_assigned, @all_nodes, $died, $portdir); # the minimum version we believe to be compatible with this package without # subclassing. @@ -140,6 +140,29 @@ INIT # Tracking of last port value assigned to accelerate free port lookup. $last_port_assigned = int(rand() * 16384) + 49152; + + # Set the port lock directory + + # If we're told to use a directory (e.g. from a buildfarm client) + # explicitly, use that + $portdir = $ENV{PG_TEST_PORT_DIR}; + # Otherwise, try to use a directory at the top of the build tree + if (! $portdir && exists $ENV{MESON_BUILD_ROOT}) + { + my $dir = $ENV{MESON_BUILD_ROOT}; + $dir =~ s!\\!/!g; + $portdir = "$dir/portlock" if $dir; + } + elsif (! $portdir && $ENV{TESTDATADIR} =~ /\W(src|contrib)\W/p) + { + my $dir = ${^PREMATCH}; + $dir =~ s!\\!/!g; + $portdir = "$dir/portlock" if $dir; + } + # As a last resort use a temp directory under tmp_check + $portdir ||= PostgreSQL::Test::Utils::tempdir('portlock'); + # Make sure the directory exists + mkpath($portdir) unless -d $portdir; } =pod @@ -1505,6 +1528,7 @@ sub get_free_port last; } } + $found = _reserve_port($port) if $found; } } @@ -1535,6 +1559,37 @@ sub can_bind return $ret; } +# internal routine to reserve a port number +# returns 1 if successful, 0 if port is already reserved. +sub _reserve_port +{ + my $port = shift; + # open in rw mode so we don't have to reopen it and lose the lock + sysopen(my $portfile, "$portdir/$port.rsv", O_RDWR|O_CREAT) + || die "opening port file"; + # take an exclusive lock to avoid concurrent access + flock($portfile, LOCK_EX) || die "locking port file"; + # see if someone else has or had a reservation of this port + my $pid = <$portfile>; + chomp $pid; + if ($pid +0 > 0) + { + if (kill 0, $pid) + { + # process exists and is owned by us, so we can't reserve this port + close($portfile); + return 0; + } + } + # all good, go ahead and reserve the port + # first rewind and truncate the file + seek($portfile, 0, SEEK_SET); + truncate($portfile, 0); + print $portfile "$$\n"; + close($portfile); + return 1; +} + # Automatically shut down any still-running nodes (in the same order the nodes # were created in) when the test script exits. END