On Wed, 10 May 2023 21:58:52 GMT, Sergey Bylokhov <[email protected]> wrote:

>> I would like to get preliminary feedback about the provided patch.
>> 
>> Discussion on net-dev@ 
>> https://mail.openjdk.org/pipermail/net-dev/2023-March/020682.html
>> 
>> One of the main issue I try to solve is how the cache handle the 
>> intermittent DNS server outages due to overloading or network connection.
>> 
>> At the moment this cache can be configured by the application using the 
>> following two properties:
>>    (1) "networkaddress.cache.ttl"(30 sec) - cache policy for successful 
>> lookups
>>    (2) "networkaddress.cache.negative.ttl"(10 sec) - cache policy for 
>> negative lookups
>> 
>> The default timeout for positive responses is good enough to "have recent 
>> dns-records" and to "minimize the number of requests to the DNS server".
>> 
>> But the cache for the negative responses is problematic. This is a problem I 
>> would like to solve. Caching the negative response means that for **10** 
>> seconds the application will not be able to connect to the server.
>> 
>> Possible solutions:
>>   1. Decreasing timeout "for the negative responses": unfortunately more 
>> requests to the server at the moment of "DNS-outage" cause even more issues, 
>> since this is not the right moment to load the network/server more.
>>   2. Increasing timeout "for the positive responses": this will decrease the 
>> chance to get an error, but the cache will start to use stale data longer.
>>   3. This proposal: it would be good to ignore the negative response and 
>> continue to use the result of the last "successful lookup" until some 
>> additional timeout.
>> 
>> The idea is to split the notion of the TTL and the timeout used for the 
>> cache. When TTL for the record will expire we should request the new data 
>> from the server. If this request goes fine we will update the record, if it 
>> fails we will continue to use the cached date until the next sync.
>> 
>> For example, if the new property "networkaddress.cache.extended.ttl" is set 
>> to 10 minutes, then we will cache a positive response for 10 minutes but 
>> will try to sync it every 30 seconds. If the new property is not set then as 
>> before we will cache positive for 30 seconds and then cache the negative 
>> response for 10 seconds.
>> 
>> 
>> RFC 8767 Serving Stale Data to Improve DNS Resiliency:
>> https://www.rfc-editor.org/rfc/rfc8767
>> 
>> Comments about current and other possible implementations:
>>  * The code intentionally moved to the separate ValidAddresses class, just 
>> to make clear that the default configuration, when the new property is not 
>> set is not changed much.
>>  * The refr...
>
> Sergey Bylokhov has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Apply suggestions from code review
>   
>   Co-authored-by: Michael McMahon 
> <[email protected]>

src/java.base/share/classes/java/net/InetAddress.java line 215:

> 213:  * it were unsuccessful. This property is useful if it is preferable to 
> use a
> 214:  * stale name rather than result of unsuccessful name lookup. The default
> 215:  * setting is to cache for an implementation specific period of time.

AFAICS the default setting is to attempt to refresh stale data immediately and 
not cache them.

src/java.base/share/classes/java/net/InetAddress.java line 966:

> 964:      * addresses or invalid (ie a failed lookup) containing no addresses.
> 965:      */
> 966:     private static class CachedLookup implements Addresses, 
> Comparable<CachedLookup> {

I do not really like the fact that CachedAddress -> CachedLookup has 
transformed an immutable class into a mutable one, which in addition forces a 
volatile read to get the addresses.

Have you explored moving the new functionality implemented by 
`ValidCachedLookup` inside `NameServiceAddresses`?

Couldn't NameServiceAddresses be initialized with a (final) InetAddress[] array 
(from the previous stale lookup) that, if not null, would be returned when 
`tryLock` returns false?

In other words if NameServiceAddresses::staleAddresses is not null, and we are 
still in the stale window, use tryLock, and if false, return staleAddresses; 
otherwise, use lock.

Would that work, and keep everything final instead of volatile?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13285#discussion_r1196558484
PR Review Comment: https://git.openjdk.org/jdk/pull/13285#discussion_r1196639624

Reply via email to