I think the point is that we distribute the time more fairly between connection handling and other operations, whereas before we could block on the TLS handshake for a long time given a large number of connections.
Ismael

On Tue, Jan 15, 2019 at 11:39 AM Colin McCabe <cmcc...@apache.org> wrote:

> Hi Rajini,
>
> Thanks for this. The KIP looks really useful.
>
> > A new metric will be added to track the amount of time Acceptor is
> > blocked from accepting connections due to backpressure. This will be a
> > yammer Meter, consistent with other SocketServer metrics.
> >
> > kafka.network:type=Acceptor,name=AcceptorIdlePercent,listener={listenerName}
>
> Hmm. I was a bit confused by this. When the acceptor is not accepting
> connections because there are none coming in, does that count as idle?
> When the acceptor is not accepting connections because the connect rate is
> being backpressured, does that count as idle? Would it be more intuitive
> to have a metric titled AcceptorBackpressuredPercent?
>
> Also, I sort of wonder if titling this "Limit incoming connection rate"
> or similar would be clearer than "improving fairness." I guess it is
> unfair that a lot of incoming connections can swamp the network threads
> right now. But limiting the rate of new connections is unfair to people
> connecting. Overall the goal seems to be usability, not fairness.
>
> best,
> Colin
>
> On Tue, Jan 15, 2019, at 04:27, Rajini Sivaram wrote:
> > Hi Jan,
> >
> > If the queue of one Processor is full, we move to the next Processor
> > immediately without blocking. So as long as the queue of any Processor
> > is not full, we accept the connection immediately. If the queues of all
> > Processors are full, we assign a Processor and block until the
> > connection can be added. There is currently no timeout for this. The PR
> > is here:
> > https://github.com/apache/kafka/pull/6022
> >
> > Thanks,
> >
> > Rajini
> >
> > On Tue, Jan 15, 2019 at 12:02 PM Jan Filipiak <jan.filip...@trivago.com>
> > wrote:
> >
> > > > The connection queue for Processors will be changed to
> > > > ArrayBlockingQueue with a fixed size of 20. Acceptor will use
> > > > round-robin allocation to allocate each new connection to the next
> > > > available Processor to which the connection can be added without
> > > > blocking. If a Processor's queue is full, the next Processor will be
> > > > chosen. If the connection queues on all Processors are full, Acceptor
> > > > blocks until the connection can be added to the selected Processor.
> > > > No new connections will be accepted during this period. The amount of
> > > > time Acceptor is blocked can be monitored using the new
> > > > AcceptorIdlePercent metric.
> > >
> > > So if the queue of one Processor is full, what is the strategy to move
> > > to the next queue? Are we using offer with a timeout here? How else
> > > can we make sure that a single slow Processor will not block the
> > > entire processing? I assume we cannot allow ourselves to get stuck
> > > during put, as you mention that all queues being full is a scenario. I
> > > think there is quite some uncertainty here. Is there any code one
> > > could check out?
> > >
> > > Best Jan
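The strategy Rajini describes (non-blocking offer in round-robin order, falling back to a blocking put only when every queue is full) can be sketched roughly as below. This is a hypothetical illustration, not Kafka's actual Acceptor code; the class and method names here are invented:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical sketch (names invented) of the allocation strategy from the
// thread: try each Processor's queue with a non-blocking offer() in
// round-robin order; only when every queue is full, fall back to a
// blocking put() on the originally selected queue.
class RoundRobinAssigner<T> {
    private final List<ArrayBlockingQueue<T>> queues;
    private int current = 0;

    RoundRobinAssigner(List<ArrayBlockingQueue<T>> queues) {
        this.queues = queues;
    }

    void assign(T connection) {
        int start = current;
        do {
            ArrayBlockingQueue<T> q = queues.get(current);
            current = (current + 1) % queues.size();
            if (q.offer(connection)) {  // non-blocking; false if queue full
                return;
            }
        } while (current != start);
        // All queues were full: block until the selected Processor drains.
        // This is the period the KIP's new Acceptor metric would measure.
        try {
            queues.get(start).put(connection);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }
}
```

Because `offer` never blocks, one slow Processor only costs a failed offer before the Acceptor moves on; the Acceptor stalls only in the all-queues-full case, exactly as described in the KIP text quoted above.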