Dear Rajini, Thanks for the feedback.
1) Because "request.timeout.ms" only affects in-flight requests, after the API NetworkClient.ready() is invoked, the connection won't get closed after "request.timeout.ms” hits. Before a) the SocketChannel is connected b) ssl handshake finished c) authentication has finished (sasl) clients cannot invoke NetworkClient.send() to send any request, which means no in-flight request targeting to the connection will be added. 2) I think a default value of 127 seconds make sense, which meets the timeout indirectly specified by the default value of “tcp.syn.retries”. I’ve added this into the KIP proposal. 3) Every time the timeout hits, the timeout value of the next connection try will increase. The timeout will hit iff a connection stays at the `connecting` state longer than the timeout value, as indicated by ClusterConnectionStates.NodeConnectionState. The connection state of a node will change iff `SelectionKey.OP_CONNECT` is detected by `nioSelector.Select()`. The connection state may transit from `connecting` to a) `disconnected` when SocketChannel.finishConnect() throws IOException. b) `connected` when SocketChannel.finishConnect() return TRUE. In other words, the timeout will hit and increase iff the interested SelectionKey.OP_CONNECT doesn't trigger before the timeout arrives, which means, for example, network congestion, failure of the ARP request, packet filtering, routing error, or a silent discard may happen. (I didn’t read the Java NIO source code. Please correct me the case when OP_CONNECT won’t get triggered if I’m wrong) 4) A) Connection timeout dominates both request timeout and API timeout When connection timeout hits, the connection will be closed. The client will be notified either by the responses constructed by NetworkClient or the callbacks attached to the request. As a result, the request failure will be handled before either connection timeout or API timeout arrives. B) Neither request timeout or API timeout dominates connection timeout i) Request timeout: Because request timeout only affects in-flight requests, after the API NetworkClient.ready() is invoked, the connection won't get closed after "request.timeout.ms” hits. Before 1. the SocketChannel is connected 2. SSL handshake finished 3. authentication has finished (SASL) , clients won't be able to invoke NetworkClient.send() to send any request, which means no in-flight request targeting to the connection will be added. ii) API timeout: In AdminClient, API timeout acts by putting a smaller and smaller timeout value to the chain of requests in a same API. After the API timeout hits, the retry logic won't close any connection. In consumer, API timeout acts as a whole by putting a limit to the code block executing time. The retry logic won't close any connection as well. Conclusion: Thanks again for the long feedback and I’m always enjoying them. I’ve supplement the above discussion into the KIP proposal. Please let me know what you think. Best, - Cheng Tan > On Jun 2, 2020, at 3:01 AM, Rajini Sivaram <rajinisiva...@gmail.com> wrote: > > Hi Cheng, > > Not sure if the discussion should move back to the DISCUSS thread. I have a > few questions: > > 1) The KIP motivation says that in some cases `request.timeout.ms` doesn't > timeout connections properly and as a result it takes 127s to detect a > connection failure. This sounds like a bug rather than a limitation of the > current approach. Can you explain the scenarios where this occurs? > > 2) I think the current proposal is to use non-exponential 10s connection > timeout as default with the option to use exponential timeout. So > connection timeouts for every connection attempt will be between 8s and 12s > by default. Is that correct? Should we use a default max timeout to enable > exponential timeout by default since 8s seems rather small? > > 3) What is the scope of `failures` used to determine connection timeout > with exponential timeouts? Will we always use 10s followed by 20s every > time a connection is attempted? > > 4) It will be good if we can include two flows with the relationship > between various timeouts in the KIP. One with a fixed node like a typical > produce/consume request to the leader and another that uses > `leastLoadedNode` like a metadata request. Having the comparison between > the current and proposed behaviour w.r.t all configurable timeouts (the two > new connection timeouts, request timeout, api timeout etc.) will be useful. > > Regards, > > Rajini >