Cheng Tan created KAFKA-9893:
--------------------------------

             Summary: Configurable TCP connection timeout for AdminClient
                 Key: KAFKA-9893
                 URL: https://issues.apache.org/jira/browse/KAFKA-9893
             Project: Kafka
          Issue Type: New Feature
            Reporter: Cheng Tan


AdminClient does not currently allow a connection timeout to be configured. 
As a result, we rely on the default OS settings to determine whether a broker 
is unresponsive before selecting an alternate broker from the bootstrap 
list.

In the case of a connection timeout on the initial handshake, and where 
tcp_syn_retries is at its default (6), we won't time out an unresponsive 
broker until ~127s, while the client will time out sooner (~120s).
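
For reference, the ~127s figure can be reproduced with a little arithmetic. 
The sketch below assumes Linux behaviour: an initial SYN retransmission 
timeout of 1 second that doubles after every unanswered SYN. The function 
name and the 1-second default are illustrative assumptions, not part of Kafka 
or of any kernel API.

```python
def syn_timeout_seconds(syn_retries: int, initial_rto: float = 1.0) -> float:
    """Approximate time before connect() gives up when no SYN-ACK arrives.

    Assumes the initial SYN and each retry wait one retransmission timeout
    (RTO), and that the RTO doubles after every unanswered attempt.
    """
    # attempt 0 is the initial SYN; attempts 1..syn_retries are the retries.
    return sum(initial_rto * 2 ** attempt for attempt in range(syn_retries + 1))


if __name__ == "__main__":
    for retries in (6, 3, 2):
        print(f"tcp_syn_retries={retries}: ~{syn_timeout_seconds(retries):.0f}s")
```

With the default of 6 retries this gives 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 
seconds, which is longer than the ~120s client timeout mentioned above.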

Reducing tcp_syn_retries should mitigate the issue, depending on the number 
of unresponsive brokers in the bootstrap list, but that setting applies 
system-wide. It would be better if we could instead configure connection 
timeouts for AdminClient.
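
To illustrate the kind of behaviour being asked for, here is a minimal 
sketch of an application-level connection timeout with failover across a 
bootstrap list. It is not how AdminClient is implemented and does not propose 
a config name; connect_first_responsive and its parameters are hypothetical, 
and it uses plain blocking sockets from the Python standard library purely 
for illustration.

```python
import socket
from typing import Optional, Sequence, Tuple


def connect_first_responsive(
    bootstrap: Sequence[Tuple[str, int]],
    connect_timeout_s: float,
) -> Optional[socket.socket]:
    """Return a socket connected to the first reachable (host, port), or None.

    Each attempt is bounded by connect_timeout_s, so an unresponsive broker
    costs at most that long instead of the OS-level SYN retry budget.
    """
    for host, port in bootstrap:
        try:
            # create_connection applies the timeout to the connect attempt
            # (and leaves it set on the returned socket).
            return socket.create_connection((host, port), timeout=connect_timeout_s)
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts;
            # move on to the next bootstrap entry.
            continue
    return None
```

With a per-connection timeout of a few seconds, several unresponsive 
bootstrap entries can be skipped well within the client's overall timeout, 
without touching tcp_syn_retries for the whole system.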

The use case where this came up was a customer performing DC failover tests 
with a stretch cluster.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
