I don’t see where a sub-second TTL would be helpful, given that DNS itself only features seconds (and if you really need high-speed queries, go with the DNS API).
Blocking all requests behind one lock also looks reasonable to me. I don’t see many possible improvements, other than perhaps a timeout option for the blocked resolves and perhaps a retry “expiry”. However, that sounds more useful for DNS queries than for system resolver calls.
BTW, an application with special needs can always do the caching and resolving itself. For example, a Guava cache with an async loader will also avoid larger queues, as it loads single-threaded and can refresh the cache independently of in-flight queries (once resolved). I don’t think the JDK can be flexible enough to support all advanced use cases like yours.
Regards, Bernd
--
https://bernd.eckenfels.net
From: net-dev <net-dev-r...@openjdk.org> on behalf of Spurling, John <sp...@amazon.com>
Sent: Friday, August 1, 2025 5:30 PM
To: net-dev@openjdk.org <net-dev@openjdk.org>
Subject: Handling multiple in-flight InetAddress.getByName queries for the same name at scale
Large, complicated DNS infrastructure setups coupled with varying application behavior, failure recovery scenarios, and long lived, large scale systems that are slow to migrate can lead to surprising, undesirable behavior when interacting with java.net.InetAddress's implementation of name resolution.
Multiple in-flight requests for the same host using InetAddress.NameServiceAddresses [0] are serialized on a lock [1]. The first request calls the system resolver [2]. Each additional request made before the resolver returns a response shares the same NameServiceAddresses instance, and each blocks in InetAddress.NameServiceAddresses.get until it gets the lock. Once all in-flight queries have completed, the InetAddress.NameServiceAddresses instance dies.
One potential difficulty arises when the resolver takes longer to complete than the interval between name resolution requests. Normally, we expect subsequent name resolution requests to take less time than the first request, but that might not be the case. When it isn't, failures can become amplified due to the serialization of all queries for the same host.
Imagine that your DNS infrastructure requires you to disable the networkaddress caches, because the minimum enabled value of 1 second is too large. If the system resolver takes 1 second, subsequent resolver calls also take 1 second, and queries arrive at a rate of more than one per second, then each query takes longer than the one before it: each call must wait for its own resolver call to complete, plus the time it spends waiting for the lock. One can see how this might lead to service outages.
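To make the arithmetic concrete, here is a minimal sketch that models the scenario as a single FIFO queue: one lookup at a time, every lookup taking the same time, and queries arriving at a fixed interval. The class and method names are hypothetical, and the strict FIFO ordering is a simplifying assumption about how waiters acquire the lock.

```java
import java.util.Arrays;

/** Hypothetical model of serialized lookups for one host; not JDK code. */
final class QueueGrowth {

    /**
     * Returns the latency in milliseconds seen by each query when every
     * lookup takes resolverMillis and a new query arrives every
     * arrivalIntervalMillis (assumption: waiters are served in FIFO order).
     */
    static long[] latenciesMillis(long resolverMillis, long arrivalIntervalMillis, int queries) {
        long[] latencies = new long[queries];
        long lockFreeAt = 0; // time at which the previous lookup releases the lock
        for (int i = 0; i < queries; i++) {
            long arrival = i * arrivalIntervalMillis;
            long start = Math.max(arrival, lockFreeAt); // wait for the lock if it is held
            lockFreeAt = start + resolverMillis;        // hold the lock for one resolver call
            latencies[i] = lockFreeAt - arrival;
        }
        return latencies;
    }

    public static void main(String[] args) {
        // 1 s resolver, one query every 500 ms
        System.out.println(Arrays.toString(latenciesMillis(1000, 500, 5)));
        // prints [1000, 1500, 2000, 2500, 3000]
    }
}
```

Under these assumptions the latency grows by (resolver time minus arrival interval) with every query, so it never settles as long as queries arrive faster than the resolver answers.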
There are multiple potential ways to deal with the NameServiceAddresses lock.
One way is to provide finer grained expiry for the cache. Instead of providing the cache timeouts in integer seconds, provide some hook to provide them in milliseconds. Leaving caching enabled but providing smaller timeouts could help provide the right level of control to avoid some of these complicated issues at scale.
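As a rough illustration of what millisecond-granularity expiry could look like, here is a sketch of an application-level cache placed in front of the resolver. It is not a proposal for the JDK internals: MillisTtlCache, systemLookup, and the injectable lookup function and clock are all hypothetical names introduced for the example, and failed lookups are simply not cached.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical application-level cache with a millisecond TTL; not part of the JDK. */
final class MillisTtlCache {

    private static final class Entry {
        final InetAddress[] addresses;
        final long expiresAtNanos;

        Entry(InetAddress[] addresses, long expiresAtNanos) {
            this.addresses = addresses;
            this.expiresAtNanos = expiresAtNanos;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final Function<String, InetAddress[]> lookup;
    private final LongSupplier nanoClock;
    private final long ttlNanos;

    MillisTtlCache(long ttlMillis, Function<String, InetAddress[]> lookup, LongSupplier nanoClock) {
        this.ttlNanos = TimeUnit.MILLISECONDS.toNanos(ttlMillis);
        this.lookup = lookup;
        this.nanoClock = nanoClock;
    }

    /** Default lookup: the system resolver, with the checked exception wrapped. */
    static InetAddress[] systemLookup(String host) {
        try {
            return InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    InetAddress[] get(String host) {
        long now = nanoClock.getAsLong();
        // compute() runs the lookup at most once per expiry for a given host;
        // other callers for that host block until the fresh entry is stored.
        Entry entry = entries.compute(host, (h, old) ->
                (old != null && now - old.expiresAtNanos < 0)
                        ? old
                        : new Entry(lookup.apply(h), now + ttlNanos));
        return entry.addresses.clone();
    }
}
```

A caller might construct it as `new MillisTtlCache(250, MillisTtlCache::systemLookup, System::nanoTime)`. Note that concurrent callers still wait for one another while an expired entry is being refreshed, so this only reduces how often the resolver is called; it does not remove the serialization.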
Another way is to simply have all in-flight queries return the same result. NameServiceAddresses instances would die more quickly: only the first call to getAddressesFromNameService would be made, the result would be cached in the object. Each subsequent call made during that time would quickly return the result cached in the NameServiceAddresses instance, and the instance would quickly die. New getByName requests would create NameServiceAddresses instances as usual.
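The idea of sharing one result among all in-flight queries can be sketched outside the JDK as follows. This is only an illustration of the technique, not the actual InetAddress code: CoalescingResolver and systemLookup are hypothetical names, and the lookup function is injectable so the behavior can be exercised without a real resolver.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: concurrent lookups for one host share a single resolver call. */
final class CoalescingResolver {

    private final ConcurrentHashMap<String, CompletableFuture<InetAddress[]>> inFlight =
            new ConcurrentHashMap<>();
    private final Function<String, InetAddress[]> lookup;

    CoalescingResolver(Function<String, InetAddress[]> lookup) {
        this.lookup = lookup;
    }

    /** Default lookup: the system resolver, with the checked exception wrapped. */
    static InetAddress[] systemLookup(String host) {
        try {
            return InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    InetAddress[] resolve(String host) {
        CompletableFuture<InetAddress[]> mine = new CompletableFuture<>();
        CompletableFuture<InetAddress[]> leader = inFlight.putIfAbsent(host, mine);
        if (leader != null) {
            // A lookup for this host is already running: wait for its result.
            // If it failed, join() throws a CompletionException wrapping the cause.
            return leader.join().clone();
        }
        try {
            // This caller is the leader: perform the one resolver call.
            InetAddress[] result = lookup.apply(host);
            mine.complete(result);
            return result.clone();
        } catch (Throwable t) {
            mine.completeExceptionally(t); // waiters see the same failure
            throw t;
        } finally {
            // The entry dies as soon as the lookup finishes; later calls start fresh.
            inFlight.remove(host, mine);
        }
    }
}
```

With `new CoalescingResolver(CoalescingResolver::systemLookup)`, callers that arrive while a lookup is running return as soon as that lookup completes, and a caller that arrives afterwards triggers a new resolver call, so nothing is cached beyond the lifetime of the in-flight request.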
Other larger scale rewrites are also possible, but given how battle-tested the current implementation is and the potential dangers here, I imagine something smaller might be more acceptable.
Thoughts?
-john
[0] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/net/InetAddress.java#L1689C8-L1695C15
[1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/net/InetAddress.java#L1043
[2] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/net/InetAddress.java#L1060