Hi Colin,

`AcceptorIdlePercent` indicates the total amount of time the acceptor is inactive and not accepting any connections because it is blocked on Processors. But I agree the name could be improved. There is back pressure at the Java level (which we can't monitor) and back pressure we apply with blocking queues, which is what the metric refers to. Perhaps `AcceptorBlockedPercent` would better reflect that it is the time that the acceptor is blocked?

As Ismael said, fairness refers to the distribution of time between processing of new connections and processing of existing connections.

Thanks,

Rajini

On Tue, Jan 15, 2019 at 7:56 PM Ismael Juma <ism...@juma.me.uk> wrote:

> I think the point is that we distribute the time more fairly between
> connection handling and other operations where before we could block on the
> TLS handshake for a long time given a large number of connections.
>
> Ismael
>
> On Tue, Jan 15, 2019 at 11:39 AM Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi Rajini,
> >
> > Thanks for this. The KIP looks really useful.
> >
> > >
> > > A new metric will be added to track the amount of time Acceptor is
> > > blocked from accepting connections due to backpressure. This will be a
> > > yammer Meter, consistent with other SocketServer metrics.
> > >
> > > kafka.network:type=Acceptor,name=AcceptorIdlePercent,listener={listenerName}
> > >
> >
> > Hmm. I was a bit confused by this. When the acceptor is not accepting
> > connections because there are none coming in, does that count as idle?
> > When the acceptor is not accepting connections because the connect rate is
> > being backpressured, does that count as idle? Would it be more
> > intuitive to have a metric titled AcceptorBackpressuredPercent?
> >
> > Also, I sort of wonder if titling this "Limit incoming connection
> > rate" or similar would be clearer than "improving fairness." I
> > guess it is unfair that a lot of incoming connections can swamp the
> > network threads right now. But limiting the rate of new connections is
> > unfair to people connecting. Overall the goal seems to be usability, not
> > fairness.
> >
> > best,
> > Colin
> >
> > On Tue, Jan 15, 2019, at 04:27, Rajini Sivaram wrote:
> > > Hi Jan,
> > >
> > > If the queue of one Processor is full, we move to the next Processor
> > > immediately without blocking. So as long as the queue of any Processor
> > > is not full, we accept the connection immediately. If the queues of all
> > > Processors are full, we assign a Processor and block until the
> > > connection can be added. There is currently no timeout for this. The PR
> > > is here: https://github.com/apache/kafka/pull/6022
> > >
> > > Thanks,
> > >
> > > Rajini
> > >
> > > On Tue, Jan 15, 2019 at 12:02 PM Jan Filipiak <jan.filip...@trivago.com>
> > > wrote:
> > >
> > > >
> > > > The connection queue for Processors will be changed to
> > > > ArrayBlockingQueue with a fixed size of 20. Acceptor will use
> > > > round-robin allocation to allocate each new connection to the next
> > > > available Processor to which the connection can be added without
> > > > blocking. If a Processor's queue is full, the next Processor will be
> > > > chosen. If the connection queues on all Processors are full, Acceptor
> > > > blocks until the connection can be added to the selected Processor.
> > > > No new connections will be accepted during this period. The amount of
> > > > time Acceptor is blocked can be monitored using the new
> > > > AcceptorIdlePercent metric.
> > > >
> > > > So if the queue of one Processor is full, what is the strategy to move
> > > > to the next queue? Are we using offer with a timeout here? How else can
> > > > we make sure that a single slow processor will not block the entire
> > > > processing? I assume we do not allow us to get stuck during put as you
> > > > mention that all queues full is a scenario. I think there is quite some
> > > > uncertainty here. Is there any code one could check out?
> > > >
> > > > Best Jan
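
For anyone following the thread without opening the PR, the assignment strategy discussed above (try each Processor with a non-blocking offer in round-robin order, and only block with no timeout when every queue is full) can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: the class name `RoundRobinAssigner`, its methods, and the `blockedNanos` counter are hypothetical, and the real broker reports blocked time through a yammer Meter rather than a plain field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the strategy described in this thread; not Kafka's actual code.
class RoundRobinAssigner<C> {
    private final List<BlockingQueue<C>> queues = new ArrayList<>();
    private int next = 0;           // index of the Processor queue to try first
    private long blockedNanos = 0;  // total time spent blocked in put()

    RoundRobinAssigner(int numProcessors, int queueSize) {
        for (int i = 0; i < numProcessors; i++) {
            // Bounded queue per Processor (the KIP text uses a fixed size of 20).
            queues.add(new ArrayBlockingQueue<>(queueSize));
        }
    }

    /** Returns the index of the Processor queue that received the connection. */
    int assign(C connection) throws InterruptedException {
        int n = queues.size();
        // First pass: non-blocking offer to each Processor in round-robin order.
        for (int i = 0; i < n; i++) {
            int idx = (next + i) % n;
            if (queues.get(idx).offer(connection)) {
                next = (idx + 1) % n;
                return idx;
            }
        }
        // Every queue is full: select the next Processor and block, with no timeout,
        // until its queue has room. No other connection is accepted meanwhile.
        int idx = next;
        next = (idx + 1) % n;
        long start = System.nanoTime();
        queues.get(idx).put(connection);
        blockedNanos += System.nanoTime() - start;
        return idx;
    }

    long blockedNanos() {
        return blockedNanos;
    }

    BlockingQueue<C> queue(int idx) {
        return queues.get(idx);
    }
}
```

With this shape, a single slow Processor only delays the acceptor once all of the other queues have filled up as well, which is the case the blocked-time metric is meant to expose.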