On Tue, Jan 15, 2019, at 12:59, Rajini Sivaram wrote:
> Hi Colin,
>
> `AcceptorIdlePercent` indicates the total amount of time the acceptor is
> inactive and not accepting any connections because it is blocked on
> Processors. But I agree the name could be improved. There is back pressure
> at the Java level (which we can't monitor) and back pressure we apply with
> blocking queues, which is what the metric refers to. Perhaps
> `AcceptorBlockedPercent` reflects that it is the time that the acceptor
> is blocked?

Yeah, I like "AcceptorBlockedPercent" better.

> As Ismael said, fairness refers to the distribution of time between
> processing of new connections and processing of existing connections.

OK.

best,
Colin

> Thanks,
>
> Rajini
>
> On Tue, Jan 15, 2019 at 7:56 PM Ismael Juma <ism...@juma.me.uk> wrote:
>
> > I think the point is that we distribute the time more fairly between
> > connection handling and other operations where before we could block on
> > the TLS handshake for a long time given a large number of connections.
> >
> > Ismael
> >
> > On Tue, Jan 15, 2019 at 11:39 AM Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi Rajini,
> > >
> > > Thanks for this. The KIP looks really useful.
> > >
> > > > A new metric will be added to track the amount of time Acceptor is
> > > > blocked from accepting connections due to backpressure. This will be
> > > > a yammer Meter, consistent with other SocketServer metrics.
> > > >
> > > > kafka.network:type=Acceptor,name=AcceptorIdlePercent,listener={listenerName}
> > >
> > > Hmm. I was a bit confused by this. When the acceptor is not accepting
> > > connections because there are none coming in, does that count as idle?
> > > When the acceptor is not accepting connections because the connect rate
> > > is being backpressured, does that count as idle? Would it be more
> > > intuitive to have a metric titled AcceptorBackpressuredPercent?
> > >
> > > Also, I sort of wonder if titling this "Limit incoming connection
> > > rate" or similar would be clearer than "improving fairness." I guess it
> > > is unfair that a lot of incoming connections can swamp the network
> > > threads right now. But limiting the rate of new connections is unfair
> > > to people connecting. Overall the goal seems to be usability, not
> > > fairness.
> > >
> > > best,
> > > Colin
> > >
> > > On Tue, Jan 15, 2019, at 04:27, Rajini Sivaram wrote:
> > > > Hi Jan,
> > > >
> > > > If the queue of one Processor is full, we move to the next Processor
> > > > immediately without blocking. So as long as the queue of any
> > > > Processor is not full, we accept the connection immediately. If the
> > > > queues of all Processors are full, we assign a Processor and block
> > > > until the connection can be added. There is currently no timeout for
> > > > this. The PR is here:
> > > > https://github.com/apache/kafka/pull/6022
> > > >
> > > > Thanks,
> > > >
> > > > Rajini
> > > >
> > > > On Tue, Jan 15, 2019 at 12:02 PM Jan Filipiak <jan.filip...@trivago.com> wrote:
> > > >
> > > > > The connection queue for Processors will be changed to
> > > > > ArrayBlockingQueue with a fixed size of 20. Acceptor will use
> > > > > round-robin allocation to allocate each new connection to the next
> > > > > available Processor to which the connection can be added without
> > > > > blocking. If a Processor's queue is full, the next Processor will
> > > > > be chosen. If the connection queues on all Processors are full,
> > > > > Acceptor blocks until the connection can be added to the selected
> > > > > Processor. No new connections will be accepted during this period.
> > > > > The amount of time Acceptor is blocked can be monitored using the
> > > > > new AcceptorIdlePercent metric.
> > > > >
> > > > > So if the queue of one Processor is full, what is the strategy to
> > > > > move to the next queue? Are we using offer with a timeout here? How
> > > > > else can we make sure that a single slow processor will not block
> > > > > the entire processing? I assume we do not allow ourselves to get
> > > > > stuck during put, as you mention that all queues being full is a
> > > > > scenario. I think there is quite some uncertainty here.
> > > > > Is there any code one could check out?
> > > > >
> > > > > Best Jan
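
The allocation strategy described in this thread (try each Processor's bounded queue in round-robin order without blocking, and block only when every queue is full) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the PR linked above: the class `RoundRobinAcceptor`, its method names, and the use of `String` as a stand-in for a connection are all hypothetical. The sketch also accumulates the time spent blocked, which is roughly the quantity the proposed blocked-percent metric is meant to capture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of round-robin assignment over bounded queues. */
final class RoundRobinAcceptor {
    // One bounded queue per Processor; String stands in for a connection.
    private final List<BlockingQueue<String>> processorQueues = new ArrayList<>();
    private int nextProcessor = 0;   // round-robin cursor
    private long blockedNanos = 0L;  // total time spent blocked in put()

    RoundRobinAcceptor(int numProcessors, int queueSize) {
        for (int i = 0; i < numProcessors; i++) {
            processorQueues.add(new ArrayBlockingQueue<>(queueSize));
        }
    }

    /** Assigns a connection and returns the index of the chosen Processor. */
    int assign(String connection) {
        int n = processorQueues.size();
        // First pass: offer() never blocks; it returns false when the queue is full,
        // in which case we simply move on to the next Processor.
        for (int attempt = 0; attempt < n; attempt++) {
            int index = nextProcessor;
            nextProcessor = (nextProcessor + 1) % n;
            if (processorQueues.get(index).offer(connection)) {
                return index;
            }
        }
        // Every queue was full: pick the next Processor and block (no timeout)
        // until it has room. No other connection is accepted in the meantime.
        int index = nextProcessor;
        nextProcessor = (nextProcessor + 1) % n;
        long start = System.nanoTime();
        try {
            processorQueues.get(index).put(connection);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a Processor", e);
        } finally {
            blockedNanos += System.nanoTime() - start;
        }
        return index;
    }

    /** Total nanoseconds spent blocked because all queues were full. */
    long blockedNanos() { return blockedNanos; }

    /** Number of connections currently queued for the given Processor. */
    int queued(int index) { return processorQueues.get(index).size(); }

    /** Simulates a Processor taking one connection off its queue (null if empty). */
    String poll(int index) { return processorQueues.get(index).poll(); }

    public static void main(String[] args) {
        // Queue size 20 matches the fixed size mentioned in the KIP text quoted above.
        RoundRobinAcceptor acceptor = new RoundRobinAcceptor(3, 20);
        for (int i = 0; i < 5; i++) {
            String connection = "connection-" + i;
            System.out.println(connection + " -> Processor " + acceptor.assign(connection));
        }
    }
}
```

In this sketch a single slow Processor only stalls the acceptor once every other queue has filled up as well, which matches the behaviour Rajini describes. A timed `offer` could replace `put` if a timeout were wanted, but the thread states that there is currently none.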