Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-19 Thread Colin McCabe
On Tue, May 19, 2020, at 03:27, Rajini Sivaram wrote: > Hi Colin, > > I do agree about the `leastLoadedNode` case. My question was about the > other cases where we are connecting to a specific node: fetch requests to > leaders, produce requests to leaders, requests to group coordinators, > request

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-19 Thread Rajini Sivaram
Hi Colin, I do agree about the `leastLoadedNode` case. My question was about the other cases where we are connecting to a specific node: fetch requests to leaders, produce requests to leaders, requests to group coordinators, requests to controller etc. It will be good to either quantify that these

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-18 Thread Colin McCabe
Hi Rajini, I think the idea behind the 10 second default is that if you have three Kafka nodes A, B, C (or whatever), and you can't talk to A within 10 seconds, you'll try again with B or C, and still have plenty of time left over. Whereas currently, if your connection hangs while trying to co
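
A rough illustration of the failover argument in this message: the sketch below tries each node in turn with a short connect timeout and moves on when an attempt fails. It is not Kafka's NetworkClient logic. The function name `connect_first_reachable`, the node addresses, and the injectable `connect` parameter are hypothetical; the 10-second value is the default discussed in this thread.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def connect_first_reachable(
    nodes: Iterable[Address],
    timeout_s: float = 10.0,  # the default proposed in the thread
    connect: Callable[..., object] = socket.create_connection,
) -> Optional[Tuple[Address, object]]:
    """Return (address, connection) for the first node that accepts a
    connection within timeout_s, or None if every attempt fails."""
    for address in nodes:
        try:
            # A hung attempt costs at most timeout_s before we move on.
            return address, connect(address, timeout=timeout_s)
        except OSError:
            # Covers timeouts and refused connections; try the next node.
            continue
    return None
```

With three nodes and a 10-second timeout, the worst case before giving up is about 30 seconds, whereas a single attempt left to the operating system default can hang far longer before any other node is tried.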

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-15 Thread Cheng Tan
Dear Rajini, Thanks for all the feedback. It has been very helpful for my brainstorming. I’ve incorporated our discussion into the KIP and started a voting thread. Best, - Cheng Tan > On May 15, 2020, at 2:13 PM, Rajini Sivaram wrote: > > Hi Cheng, > > I am fine with the rest of the KIP

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-15 Thread Rajini Sivaram
Hi Cheng, I am fine with the rest of the KIP apart from the 10s default. If no one else has any concerns about this new default, let's go with it. Please go ahead and start the vote. Regards, Rajini On Fri, May 15, 2020 at 8:21 PM Cheng Tan wrote: > Dear Rajini, > > > Thanks for the reply. > > >

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-15 Thread Cheng Tan
Dear Rajini, Thanks for the reply. > We have a lot of these and I want to > understand the benefits of the proposed timeout in this case alone. We > currently have a request timeout of 30s. Would you consider adding a 10s > connection timeout? A shorter timeout (10s) at the transportation leve

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-15 Thread Rajini Sivaram
Hi Cheng, Let me rephrase my question. Let's say we didn't have the case of leastLoadedNode. We are only talking about connections to a specific node (i.e. leader or controller). We have a lot of these and I want to understand the benefits of the proposed timeout in this case alone. We currently h

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-14 Thread Cheng Tan
Hi Rajini, Thanks for the reply. > Not sure 10s is a good default because it unnecessarily times out > connections, only to attempt re-connecting to the same broker (except in > the leastLoadedNode case where it would be useful to have a lower timeout). The underlying logic for a connection tu

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-14 Thread Rajini Sivaram
Hi Cheng, 1) Thanks for the update, the KIP now says `socket.connections.setup.timeout.ms`, which sounds good. 2) Not sure 10s is a good default because it unnecessarily times out connections, only to attempt re-connecting to the same broker (except in the leastLoadedNode case where it would

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-13 Thread Cheng Tan
Hi Rajini, Thanks for the comments. > I think > they started off as connection timeouts but now include authentication time > as well. Have we considered using similar configs for this case? The new config I proposed focuses on connections to unreachable servers. The timeout count won

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-13 Thread Rajini Sivaram
Hi Cheng, Thanks for the KIP, sounds like a good improvement. A couple of comments: 1) We currently have client connection timeouts on the broker with configs named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I think they started off as connection timeouts but now include authent

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-07 Thread Jose Garcia Sancio
Cheng, Thanks for the KIP and the detailed proposal section. LGTM! On Thu, May 7, 2020 at 3:38 PM Cheng Tan wrote: > > After thinking more about the potential wider use cases, I modified the proposal to > target all connections. Thanks. > > - Best, - Cheng Tan > > > On May 7, 2020, at 1:41 AM, Chen

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-07 Thread Cheng Tan
After thinking more about the potential wider use cases, I modified the proposal to target all connections. Thanks. - Best, - Cheng Tan > On May 7, 2020, at 1:41 AM, Cheng Tan wrote: > > Hi Colin, > > Sorry for the confusion. I’m proposing to implement the timeout in the > NetworkClient.leastLoadedN

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-07 Thread Cheng Tan
Hi Colin, Sorry for the confusion. I’m proposing to implement the timeout in NetworkClient.leastLoadedNode() when iterating over all the cached nodes. The alternative I can think of is to implement the timeout in NetworkClient.poll(). I’d prefer to implement it in leastLoadedNode(). Here are the reasons

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-04 Thread Colin McCabe
Hi Cheng, On the KIP page, it lists this KIP as "draft." It seems like "under discussion" is appropriate here, right? > Currently, the initial socket connection timeout is depending on Linux kernel > setting > tcp_syn_retries. The timeout value is 2 ^ (tcp_syn_retries + 1) - 1 seconds. > For

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-04 Thread Colin McCabe
Hmm. A big part of the reason behind the KIP is that the default connection timeout behavior of the OS doesn't work for Kafka, right? For example, on Linux, if we wait 127 seconds for a connection attempt to time out, we won't get a chance to make another attempt in most cases. So I think it
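
The 127-second figure follows from the formula quoted elsewhere in this thread, 2 ^ (tcp_syn_retries + 1) - 1 seconds. A minimal sketch of the arithmetic, assuming a host where `tcp_syn_retries` is 6 (a common Linux default, but an assumption about any particular machine); the function name `syn_timeout_seconds` is made up for illustration:

```python
def syn_timeout_seconds(tcp_syn_retries: int) -> int:
    """Total seconds before an unanswered connect() gives up, using the
    formula quoted in the thread: 2 ^ (tcp_syn_retries + 1) - 1."""
    return 2 ** (tcp_syn_retries + 1) - 1


# Assumed value: tcp_syn_retries = 6 on the host.
print(syn_timeout_seconds(6))  # 127
```

The formula reflects exponential backoff between SYN retransmissions: waits of 1, 2, 4, ... seconds sum to one less than the next power of two.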

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-04 Thread Jose Garcia Sancio
Thanks for the KIP Cheng, > The default value will be 10 seconds. I think we should make the default the current behavior. Meaning the default should leverage the default connect timeout from the operating system. > Proposed Changes I don't fully understand this section. It seems like it is mai

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-05-01 Thread Cheng Tan
Hi Colin. Thanks for the discussion and feedback. I re-wrote the KIP-601 proposal following your suggestions. Now the new proposal is ready. Best, - Cheng Tan > On Apr 28, 2020, at 2:55 PM, Colin McCabe wrote: > > > Thanks again for the KIP. This seems like it has been a gap in Kaf

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

2020-04-28 Thread Colin McCabe
Hi Cheng, Thanks for the KIP. > Currently, the initial socket connection timeout is depending on system > setting tcp_syn_retries. The actual timeout value is 2 ^ (tcp_syn_retries + > 1) - 1 seconds This section is confusing since it refers to Linux configuration settings without mentioning

[DISCUSS] KIP-601: Configurable socket connection timeout

2020-04-27 Thread Cheng Tan
Hi developers, I’m proposing KIP-601 to support a configurable socket connection timeout. https://cwiki.apache.org/confluence/display/KAFKA/KIP-601%3A+Configurable+socket+connection+timeout Curre