Evgeny Kuzin <[email protected]> writes:
> We've been running into an issue with "target_session_attrs" when using
> DNS-based service discovery. Currently, when libpq connects to a host with
> multiple A-records and the connection succeeds but is rejected due to
> target_session_attrs mismatch (e.g., connecting to a read-only server with
> target_session_attrs=read-write), it skips all remaining addresses for that
> hostname and moves directly to the next host in the connection string.
>
> Looking at git history, I found this was a deliberate choice by Robert Haas
> in commit 721f7bd3cbc (2016), where he noted "I changed Mithun's patch to
> skip all remaining IPs for a host if we reject a connection based on this new
> parameter." The original mailing list discussion is at [1], though I wasn't
> able to find a clear explanation of why this approach was preferred over
> trying all addresses.
>
> This makes it impractical to use a single multi-A-record DNS name pointing to
> all cluster members with target_session_attrs=read-write to find the primary:
> only the first responding IP is tried before giving up on that hostname.
>
> The attached patch changes the behavior to try all addresses for a hostname
> before moving to the next host, matching the existing behavior for connection
> failures. This would enable simpler DNS-based service discovery without
> requiring external tools like Consul or explicit multi-host connection
> strings.
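
For anyone following along, the two policies can be compared with a small
Python sketch. This is a toy model and not libpq's code: the connect()
function, the try_all_addrs flag, the hostnames, and the IP addresses are
all hypothetical, and a role of None stands for an address that refuses the
connection.

```python
# Toy model of host/address iteration; NOT libpq's actual implementation.
# dns maps a hostname to an ordered list of (ip, role) pairs, where role is
# "read-write", "read-only", or None for an unreachable address.

def connect(hosts, dns, want="read-write", try_all_addrs=False):
    """Return (chosen_ip_or_None, list_of_ips_attempted)."""
    attempts = []
    for host in hosts:
        for ip, role in dns[host]:
            attempts.append(ip)
            if role is None:
                # Connection failure: both policies move on to the next address.
                continue
            if role == want:
                return ip, attempts
            if not try_all_addrs:
                # Current behavior: a session-attrs mismatch skips the
                # remaining addresses of this hostname.
                break
    return None, attempts


dns = {
    # One name with A-records for every cluster member (hypothetical).
    "db.example.com": [
        ("10.0.0.1", "read-only"),
        ("10.0.0.2", "read-write"),
        ("10.0.0.3", "read-only"),
    ],
    # One read-only server reachable on two addresses, plus a separate primary.
    "standby.example.com": [("10.0.1.1", "read-only"), ("10.0.1.2", "read-only")],
    "primary.example.com": [("10.0.2.1", "read-write")],
}

# Single multi-A-record name: only the proposed policy reaches the primary.
print(connect(["db.example.com"], dns))
print(connect(["db.example.com"], dns, try_all_addrs=True))

# One name per server: the proposed policy makes one extra attempt.
two_hosts = ["standby.example.com", "primary.example.com"]
print(connect(two_hosts, dns))
print(connect(two_hosts, dns, try_all_addrs=True))
```

In this model the single-name case finds no read-write server under the
current policy and finds 10.0.0.2 under the proposed one, while in the
one-name-per-server case both policies reach 10.0.2.1 but the proposed one
also tries 10.0.1.2 first.
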
TBH, I'd say that your DNS setup is broken and you should fix it.
It makes no sense to have the same DNS entry pointing to both
read-write and read-only hosts. The proposed patch will mainly
result in useless connection attempts in more-sanely-constructed
setups.
regards, tom lane