Looks like this needed another rebase to account for the oauth commit.
Rebase attached.

On Mon, Feb 24, 2025 at 9:38 AM Andrew Jackson <andrewjackson...@gmail.com>
wrote:

> Hi,
>
> Thank you for the review!
>
> Review Response
>
> - Made a first pass at a real commit message
> - Fixed the condition on the if statement to use strcmp
> - Added a test suite in the files `src/interfaces/libpq/t/
> 006_target_session_attr_dns.pl` and `src/interfaces/libpq/t/
> 007_load_balance_dns_check_all_addrs.pl` which checks the
> target_session_attrs as when used with and without load balancing.
>
> Regarding the name of the variable itself I am definitely open to opinions
> on this. I didn't put too much thought initially and just chose
> `check_all_addrs`. I feel like given that it modifies the behaviour of
> `target_session_attrs` ideally it should reference that in the name but
> that would make that variable name very long: something akin to
> `target_session_attrs_check_all_addrs`.
>
> Context
>
> I tested some drivers as well and found that pgx, psycopg, and
> rust-postgres all traverse every IP address when looking for a matching
> target_session_attrs. Asyncpg and psycopg2 on the other hand follow libpq
> and terminate additional attempts after the first failure. Given this it
> seems like there is a decent amount of fragmentation in the ecosystem as to
> how exactly to implement this feature. I believe some drivers choose to
> traverse all addresses because they have users target the same use case
> outlined above.
>
> Thanks again,
> Andrew Jackson
>
> On Sun, Feb 16, 2025 at 6:03 AM Andrey Borodin <x4...@yandex-team.ru>
> wrote:
>
>> Hi Andrew!
>>
>> cc Jelte, I suspect he might be interested.
>>
>> > On 20 Nov 2024, at 20:51, Andrew Jackson <andrewjackson...@gmail.com>
>> wrote:
>> >
>> > Would appreciate any feedback on the applicability/relevancy of the
>> goal here or the implementation.
>>
>> Thank you for raising the issue. Following our discussion in Discord I'm
>> putting my thoughts to list.
>>
>>
>> Context
>>
>> A DNS record might return several IPs. Consider we have a connection
>> string with "host=A,B", A is resolved to 1.1.1.1,2.2.2.2, B to
>> 3.3.3.3,4.4.4.4.
>> If we connect with "target_session_attrs=read-write" IPs 1.1.1.1 and
>> 3.3.3.3 will be probed, but 2.2.2.2 and 4.4.4.4 won't (if 1.1.1.1 and
>> 3.3.3.3 responded).
>>
>> If we enable libpq load balancing some random 2 IPs will be probed.
>>
>> IMO it's a bug, at least when load balancing is enabled. Let's consider
>> if we can change default behavior here. I suspect we can't do it for
>> "load_balance_hosts=disable". And even for load balancing this might be too
>> unexpected change for someone.
>>
>> Further I only consider proposal not as a bug fix, but as a feature.
>>
>> In Discord we have surveyed some other drivers.
>> pgx treats all IPs as different servers [1]. npgsql goes through all IPs
>> one-by-one always [2]. PGJDBC are somewhat in a decision process [3] (cc
>> Dave and Vladimir, if they would like to provide some input).
>>
>>
>> Review
>>
>> The patch needs a rebase. It's trivial, so please fine attached. The
>> patch needs real commit message, it's not trivial :)
>>
>> We definitely need to adjust tests [0]. We need to change
>> 004_load_balance_dns.pl so that it tests target_session_attrs too.
>>
>> Some documentation would be nice.
>>
>> I do not like how this check is performed
>> +                                               if (conn->check_all_addrs
>> && conn->check_all_addrs[0] == '1')
>> Let's make it like load balancing is done [4].
>>
>> Finally, let's think about naming alternatives for "check_all_addrs".
>>
>> I think that's enough for a first round of the review. If it's not a bug,
>> but a feature - it's a very narrow window to get to 18. But we might be
>> lucky...
>>
>> Thank you!
>>
>>
>> Best regards, Andrey Borodin.
>>
>> [0]
>> https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-b05b74d2a97d7f74d4311ba1702d732f0df1b101c6ac99c146b51215174cf3ffR94
>> [1] https://github.com/jackc/pgx/blob/master/pgconn/pgconn.go#L177
>> [2]
>> https://github.com/npgsql/npgsql/blob/7f1a59fa8dc1ccc34a70154f49a768e1abf826ba/src/Npgsql/Internal/NpgsqlConnector.cs#L986
>> [3] https://github.com/pgjdbc/pgjdbc/pull/3012#discussion_r1408069450
>> [4]
>> https://github.com/postgres/postgres/commit/7f5b19817eaf38e70ad1153db4e644ee9456853e#diff-8d819454e061b9d4cdae9c8922ded05753a629d70f2ac1de1d4f6d5a4aeb7f68R1660
>>
>
From f2a3b1e51824da7b4a3eb096ed2215d64e617abd Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amboro...@acm.org>
Date: Fri, 14 Feb 2025 23:13:55 +0500
Subject: [PATCH v4] Add option to check all addrs for target_session.

The current behaviour of libpq with regard to searching
for a matching target_session_attrs in a list of addrs is
that after successfully connecting to a server if the servers
session_attr does not match the request target_session_attrs
no futher address is considered. This behaviour is extremely
inconvenient in environments where the user is attempting to
implement a high availability setup without having to modify
DNS records or a proxy server config.

This PR adds a client side option called check_all_addrs.
When set to 1 this option will tell libpq to continue checking
any remaining addresses even if there was a target_session_attrs
mismatch on one of them.

Author: Andrew Jackson
Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkESSc69sp2TiTWHvvOHCUey0rDWXSrR9pinyRqyfamUYg%40mail.gmail.com
---
 doc/src/sgml/libpq.sgml                       |  33 +++++
 src/interfaces/libpq/fe-connect.c             |  25 ++--
 src/interfaces/libpq/libpq-int.h              |   1 +
 .../libpq/t/006_target_session_attr_dns.pl    | 129 ++++++++++++++++++
 .../t/007_load_balance_dns_check_all_addrs.pl | 128 +++++++++++++++++
 5 files changed, 306 insertions(+), 10 deletions(-)
 create mode 100644 src/interfaces/libpq/t/006_target_session_attr_dns.pl
 create mode 100644 src/interfaces/libpq/t/007_load_balance_dns_check_all_addrs.pl

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ddb3596df83..c5d29d7a0f9 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2483,6 +2483,39 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
       </listitem>
      </varlistentry>
 
+     <varlistentry id="libpq-connect-check-all-addrs" xreflabel="check_all_addrs">
+      <term><literal>check_all_addrs</literal></term>
+      <listitem>
+       <para>
+        Controls whether or not all addresses within a hostname are checked when trying to make a connection
+        when attempting to find a connection with a matching <xref linkend="libpq-connect-target-session-attr"/>.
+
+        There are two modes:
+        <variablelist>
+         <varlistentry>
+          <term><literal>0</literal> (default)</term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attr"/> do not check
+            any additional addresses and move onto the next host if one was provided.
+           </para>
+          </listitem>
+         </varlistentry>
+         <varlistentry>
+          <term><literal>1</literal></term>
+          <listitem>
+           <para>
+            If a successful connection is made and that connection is found to have a
+            mismatching <xref linkend="libpq-connect-target-session-attr"/> proceed
+            to check any additional addresses.
+           </para>
+          </listitem>
+         </varlistentry>
+        </variablelist>
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
   </sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d5051f5e820..bd0265c553e 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -373,6 +373,10 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
 
 	{"scram_server_key", NULL, NULL, NULL, "SCRAM-Server-Key", "D", SCRAM_MAX_KEY_LEN * 2,
 	offsetof(struct pg_conn, scram_server_key)},
+	{"check_all_addrs", "PGCHECKALLADDRS",
+		DefaultLoadBalanceHosts, NULL,
+		"Check-All-Addrs", "", 1,
+	offsetof(struct pg_conn, check_all_addrs)},
 
 	/* OAuth v2 */
 	{"oauth_issuer", NULL, NULL, NULL,
@@ -4362,11 +4366,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (strcmp(conn->check_all_addrs, "1") == 0)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -4417,11 +4421,11 @@ keep_going:						/* We will come back to here until there is
 						conn->status = CONNECTION_OK;
 						sendTerminateConn(conn);
 
-						/*
-						 * Try next host if any, but we don't want to consider
-						 * additional addresses for this host.
-						 */
-						conn->try_next_host = true;
+						if (strcmp(conn->check_all_addrs, "1") == 0)
+							conn->try_next_addr = true;
+						else
+							conn->try_next_host = true;
+
 						goto keep_going;
 					}
 				}
@@ -5047,6 +5051,7 @@ freePGconn(PGconn *conn)
 	free(conn->oauth_client_id);
 	free(conn->oauth_client_secret);
 	free(conn->oauth_scope);
+	free(conn->check_all_addrs);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index f36f7f19d58..27f162c1c29 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -432,6 +432,7 @@ struct pg_conn
 	char	   *load_balance_hosts; /* load balance over hosts */
 	char	   *scram_client_key;	/* base64-encoded SCRAM client key */
 	char	   *scram_server_key;	/* base64-encoded SCRAM server key */
+	char       *check_all_addrs;  /* whether to check all ips within a host or terminate on failure */
 
 	bool		cancelRequest;	/* true if this connection is used to send a
 								 * cancel request, instead of being a normal
diff --git a/src/interfaces/libpq/t/006_target_session_attr_dns.pl b/src/interfaces/libpq/t/006_target_session_attr_dns.pl
new file mode 100644
index 00000000000..914f6c472f4
--- /dev/null
+++ b/src/interfaces/libpq/t/006_target_session_attr_dns.pl
@@ -0,0 +1,129 @@
+
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+# Cluster setup which is shared for testing both load balancing methods
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+
+my $node_primary1 = PostgreSQL::Test::Cluster->new('primary1', port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start;
+
+# Take backup from which all operations will be run
+$node_primary1->backup('my_backup');
+
+my $node_standby = PostgreSQL::Test::Cluster->new('standby', port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, 'my_backup',
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new('node1', port => $port, own_host => 1);
+$node_primary2 ->init();
+$node_primary2 ->start();
+
+# target_session_attrs=primary should always choose the first one.
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary1->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+
+$node_primary1->stop();
+
+# target_session_attrs=primary should always choose the first one.
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=primary check_all_addrs=1",
+	"target_session_attrs=primary connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_primary2->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-write check_all_addrs=1",
+	"target_session_attrs=read-write connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=any check_all_addrs=1",
+	"target_session_attrs=any connects to the first node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=standby check_all_addrs=1",
+	"target_session_attrs=standby connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+$node_standby->connect_ok(
+	"host=pg-loadbalancetest port=$port target_session_attrs=read-only check_all_addrs=1",
+	"target_session_attrs=read-only connects to the third node",
+	sql => "SELECT 'connect1'",
+	log_like => [qr/statement: SELECT 'connect1'/]);
+
+$node_primary2->stop();
+$node_standby->stop();
+
+
+done_testing();
diff --git a/src/interfaces/libpq/t/007_load_balance_dns_check_all_addrs.pl b/src/interfaces/libpq/t/007_load_balance_dns_check_all_addrs.pl
new file mode 100644
index 00000000000..d3405598e67
--- /dev/null
+++ b/src/interfaces/libpq/t/007_load_balance_dns_check_all_addrs.pl
@@ -0,0 +1,128 @@
+# Copyright (c) 2023-2025, PostgreSQL Global Development Group
+use strict;
+use warnings FATAL => 'all';
+use Config;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+if (!$ENV{PG_TEST_EXTRA} || $ENV{PG_TEST_EXTRA} !~ /\bload_balance\b/)
+{
+	plan skip_all =>
+	  'Potentially unsafe test load_balance not enabled in PG_TEST_EXTRA';
+}
+
+my $can_bind_to_127_0_0_2 =
+  $Config{osname} eq 'linux' || $PostgreSQL::Test::Utils::windows_os;
+
+# Checks for the requirements for testing load balancing method 2
+if (!$can_bind_to_127_0_0_2)
+{
+	plan skip_all => 'load_balance test only supported on Linux and Windows';
+}
+
+my $hosts_path;
+if ($windows_os)
+{
+	$hosts_path = 'c:\Windows\System32\Drivers\etc\hosts';
+}
+else
+{
+	$hosts_path = '/etc/hosts';
+}
+
+my $hosts_content = PostgreSQL::Test::Utils::slurp_file($hosts_path);
+
+my $hosts_count = () =
+  $hosts_content =~ /127\.0\.0\.[1-3] pg-loadbalancetest/g;
+if ($hosts_count != 3)
+{
+	# Host file is not prepared for this test
+	plan skip_all => "hosts file was not prepared for DNS load balance test";
+}
+
+$PostgreSQL::Test::Cluster::use_tcp = 1;
+$PostgreSQL::Test::Cluster::test_pghost = '127.0.0.1';
+
+my $port = PostgreSQL::Test::Cluster::get_free_port();
+local $Test::Builder::Level = $Test::Builder::Level + 1;
+my $node_primary1 = PostgreSQL::Test::Cluster->new("primary1", port => $port);
+$node_primary1->init(has_archiving => 1, allows_streaming => 1);
+
+# Start it
+$node_primary1->start();
+
+# Take backup from which all operations will be run
+$node_primary1->backup("my_backup");
+
+my $node_standby = PostgreSQL::Test::Cluster->new("standby", port => $port, own_host => 1);
+$node_standby->init_from_backup($node_primary1, "my_backup",
+	has_restoring => 1);
+$node_standby->start();
+
+my $node_primary2 = PostgreSQL::Test::Cluster->new("node1", port => $port, own_host => 1);
+$node_primary2->init();
+$node_primary2->start();
+sub test_target_session_attr {
+	my $target_session_attrs = shift;
+	my $test_num = shift;
+	my $primary1_expect_traffic = shift;
+	my $standby_expeect_traffic = shift;
+	my $primary2_expect_traffic = shift;
+	# Statistically the following loop with load_balance_hosts=random will almost
+	# certainly connect at least once to each of the nodes. The chance of that not
+	# happening is so small that it's negligible: (2/3)^50 = 1.56832855e-9
+	foreach my $i (1 .. 50)
+	{
+		$node_primary1->connect_ok(
+			"host=pg-loadbalancetest port=$port load_balance_hosts=random target_session_attrs=${target_session_attrs} check_all_addrs=1",
+			"repeated connections with random load balancing",
+			sql => "SELECT 'connect${test_num}'");
+	}
+	my $node_primary1_occurrences = () =
+	  $node_primary1->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_standby_occurrences = () =
+	  $node_standby->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+	my $node_primary2_occurrences = () =
+	  $node_primary2->log_content() =~ /statement: SELECT 'connect${test_num}'/g;
+
+	my $total_occurrences =
+	  $node_primary1_occurrences + $node_standby_occurrences + $node_primary2_occurrences;
+
+	if ($primary1_expect_traffic == 1) {
+		ok($node_primary1_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_primary1_occurrences == 0, "received at least one connection on node1");
+	}
+	if ($standby_expeect_traffic == 1) {
+		ok($node_standby_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_standby_occurrences == 0, "received at least one connection on node1");
+	}
+
+	if ($primary2_expect_traffic == 1) {
+		ok($node_primary2_occurrences > 0, "received at least one connection on node1");
+	}else{
+		ok($node_primary2_occurrences == 0, "received at least one connection on node1");
+	}
+
+	ok($total_occurrences == 50, "received 50 connections across all nodes");
+}
+
+test_target_session_attr('any',
+	1, 1, 1, 1);
+test_target_session_attr('read-only',
+	2, 0, 1, 0);
+test_target_session_attr('read-write',
+	3, 1, 0, 1);
+test_target_session_attr('primary',
+	4, 1, 0, 1);
+test_target_session_attr('standby',
+	5, 0, 1, 0);
+
+
+$node_primary1->stop();
+$node_primary2->stop();
+$node_standby->stop();
+
+done_testing();
-- 
2.47.2

Reply via email to