Hi Cheng, Let me rephrase my question. Let's say we didn't have the case of leastLoadedNode. We are only talking about connections to a specific node (i.e. leader or controller). We have a lot of these and I want to understand the benefits of the proposed timeout in this case alone. We currently have a request timeout of 30s. Would you consider adding a 10s connection timeout? And if you did, what would you expect the 10s timeout to do?
a) We could fail a request if connection didn't complete within 10s. If we always expect connections to succeed within 10s, this would be considered reasonable behaviour. But this would be changing the current default, which allows you up to 30 seconds to connect and process a request. b) We retry the connection. What would be the point? We were waiting in a queue for connecting, but we decide to stop and join the back of the queue. We have KIP-612 that is proposing to throttle connection set up on the one hand and this KIP that is dramatically reducing default connection timeout on the other. Not sure if that is a good idea. On Fri, May 15, 2020 at 1:26 AM Cheng Tan <c...@confluent.io> wrote: > Hi Rajini, > > Thanks for the reply. > > > Not sure 10s is a good default because it unnecessarily times out > > connections, only to attempt re-connecting to the same broker (except in > > the leastLoadedNode case where it would be useful to have a lower > timeout). > > > The underlying logic for a connection turn from “connecting” to > “connected” is finishing the TCP three-way handshake (done by > socketChannel.connect()). So we can say the 10 seconds timeout is for the > transportation layer three-way handshake. > > The request timeout includes more besides the handshakes, such as > authentication, server processing the request, and server sending back the > response. > > Thus I think setting the default of socket.connection.setup.timeout.ms a > smaller value than request.timeout.ms is reasonable. Do you have any > specific reason in mind that 10s is too short? > > Best, -Chang Tan > > > > 在 2020年5月14日,上午6:44,Rajini Sivaram <rajinisiva...@gmail.com> 写道: > > > > Hi Cheng, > > > > 1) Thanks for the update, the KIP now says ` > > socket.connections.setup.timeout.ms*`*, which sounds good. > > > > 2) Not sure 10s is a good default because it unnecessarily times out > > connections, only to attempt re-connecting to the same broker (except in > > the leastLoadedNode case where it would be useful to have a lower > timeout). > > Couldn't we just use 30s (request timeout) as the default value, > retaining > > the current behaviour? Or alternatively, only apply the timeout where it > > makes sense to have a lower timeout (i.e. a timeout for leastLoadedNode)? > > > > Regards, > > > > Rajini > > > >> On Thu, May 14, 2020 at 5:00 AM Cheng Tan <c...@confluent.io> wrote: > >> > >> Hi Rajini, > >> > >> Thanks for the comments. > >> > >>> I think > >>> they started off as connection timeouts but now include authentication > >> time > >>> as well. Have we considered using similar configs for this case? > >> > >> > >> The new config I proposed is focusing on the connections to unreachable > >> servers. The timeout count won’t involved the authentication. I think > >> “socket.” prefix make sense. I’ll change it. Colin mentioned that adding > >> the key word “setup” can help people understand that this config only > >> applies to the connection setup. What about “socket.connection.setup.ms > ”? > >> > >>> The KIP proposes 10s as the default. What does this mean for typical > >>> connections like a produce request going to the leader? > >> > >> The new timeout config applies to each connection. It’s at the > >> NetworkClient level and won’t consider the underlying connection logic. > >> Specifically, by default, every connection will have 10 seconds to > become > >> “connected” from “connecting”, which implies the corresponding socket > >> channel is now connected (SocketChanel.finishConnect() returns True), no > >> matter what the request logic and abstraction is. > >> > >> Please let me know if these make sense to you or if you have more > >> thoughts. Thank you. > >> > >> Best, -Cheng > >> > >>>> 在 2020年5月13日,上午7:11,Rajini Sivaram <rajinisiva...@gmail.com> 写道: > >>> > >>> Hi Cheng, > >>> > >>> Thanks for the KIP, sounds like a good improvement. A couple of > comments: > >>> > >>> 1) We currently have client connection timeouts on the broker with > >> configs > >>> named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I > >> think > >>> they started off as connection timeouts but now include authentication > >> time > >>> as well. Have we considered using similar configs for this case? We may > >>> want to prefix the new config with `socket.` anyway - something along > the > >>> lines of `socket.connection.timeout.ms` if it is just the connection > >> time. > >>> > >>> 2) The KIP proposes 10s as the default. What does this mean for typical > >>> connections like a produce request going to the leader? Instead of one > >>> connection attempt to the leader, we want three separate connection > >>> attempts within the request timeout to the leader? > >>> > >>> Regards, > >>> > >>> Rajini > >>> > >>> > >>>> On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio < > >> jsan...@confluent.io> > >>>> wrote: > >>>> Cheng, > >>>> Thanks for the KIP and the detailed proposal section. LGTM! > >>>>>> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <c...@confluent.io> wrote: > >>>>>> I think more about the potential wider use cases, I modified the > >>>> proposal to target all the connection. Thanks. > >>>>> - Best, - Cheng Tan > >>>>>> On May 7, 2020, at 1:41 AM, Cheng Tan <c...@confluent.io> wrote: > >>>>>> Hi Colin, > >>>>>> Sorry for the confusion. I’m proposing to implement timeout in the > >>>> NetworkClient.leastLoadedNode() when iterating all the cached node. > The > >>>> alternative I can think is to implement the timeout in > >> NetworkClient.poll() > >>>>>> I’d prefer to implement in the leastLoadedNode(). Here’re the > reasons: > >>>>>> Usually when clients send a request, they will asking the network > >>>> client to send the request to a specific node. In this case, the > >>>> connection.setup.timeout won’t matter too much because the client > >> doesn’t > >>>> want to try other nodes for that specific request. The request level > >>>> timeout would be enough. The metadata fetcher fetches the nodes status > >>>> periodically so the clients can reassign the request to another node > >> after > >>>> timeout. > >>>>>> Consumer, producer, and AdminClient are all using leastLoadedNode() > >>>> for metadata fetch, where the connection setup timeout can play an > >>>> important role. Unlike other requests can refer to the metadata for > node > >>>> condition, the metadata requests can only blindly choose a node for > >> retry > >>>> in the worst scenario. We want to make sure the client can get the > >> metadata > >>>> smoothly and as soon as possible. As a result, we need this > >>>> connection.setup.timeout. > >>>>>> Implementing the timeout in poll() or anywhere else might need an > >>>> extra iteration of all nodes, which might downgrade the network client > >>>> performance. > >>>>>> I also updated the KIP content and KIP status. Please let me know if > >>>> the above ideas make sense. Thanks. > >>>>>> Best, - Cheng Tan > >>>>>>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmcc...@apache.org > >> <mailto: > >>>> cmcc...@apache.org>> wrote: > >>>>>>> Hi Cheng, > >>>>>>> On the KIP page, it lists this KIP as "draft." It seems like > "under > >>>> discussion" is appropriate here, right? > >>>>>>>> Currently, the initial socket connection timeout is depending on > >>>> Linux kernel setting > >>>>>>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) > - 1 > >>>> seconds. For the > >>>>>>>> reasons below, we want to control the client-side socket timeout > >>>> directly using > >>>>>>>> configuration files > >>>>>>> Linux is just one example of an OS that Kafka could run on, right? > >>>> You could also be running on MacOS, for example. > >>>>>>>> I'm proposing to do a lazy socket connection time out. That is, we > >>>> only check if > >>>>>>>> we need to timeout a socket when we consider the corresponding > node > >>>> as a > >>>>>>>> candidate in the node provider. > >>>>>>> The NodeProvider is an AdminClient abstraction, right? Why > wouldn't > >>>> we implement a connection setup timeout for all clients, not just > >>>> AdminClient? > >>>>>>> best, > >>>>>>> Colin > >>>>>>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote: > >>>>>>>> Hmm. A big part of the reason behind the KIP is that the default > >>>>>>>> connection timeout behavior of the OS doesn't work for Kafka, > right? > >>>>>>>> For example, on Linux, if we wait 127 seconds for a connection > >>>> attempt > >>>>>>>> to time out, we won't get a chance to make another attempt in most > >>>>>>>> cases. So I think it makes sense to set a shorter default. > >>>>>>>> best, > >>>>>>>> Colin > >>>>>>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote: > >>>>>>>>> Thanks for the KIP Cheng, > >>>>>>>>>> The default value will be 10 seconds. > >>>>>>>>> I think we should make the default the current behavior. Meaning > >> the > >>>>>>>>> default should leverage the default connect timeout from the > >>>> operating > >>>>>>>>> system. > >>>>>>>>>> Proposed Changes > >>>>>>>>> I don't fully understand this section. It seems like it is mainly > >>>>>>>>> focused on the problem with the current implementation. Can you > >>>>>>>>> explain how the proposed changes solve the problem? > >>>>>>>>> Thanks. > >>>>>>>>> -- > >>>>>>>>> -Jose > >>>> -- > >>>> -Jose > >> >