Hi Rajini,

Thanks for the comments. Pleas see the reply inline.

Hi Jun,

Thinking about the consumer rebalance case a bit more, I am not sure if we
need to worry about the delayed rebalance due to quota violation or not.
The rebalance actually uses a separate channel to coordinator. Therefore
unless the request quota is hit, the rebalance won't be throttled. Even if
request quota is hit, it seems unlikely to be delayed long. So it looks
that we don't need to unmute the channel earlier than needed. Does this
address your concern?

Thanks,

Jiangjie (Becket) Qin

On Mon, Jan 15, 2018 at 4:22 AM, Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Hi Becket,
>
> A few questions:
>
> 1. KIP says: *Although older client implementations (prior to knowledge of
> this KIP) will immediately send the next request after the broker responds
> without paying attention to the throttle time field, the broker is
> protected by virtue of muting the channel for time X. i.e., the next
> request will not be processed until the channel is unmuted. *
> For fetch requests, the response is sent immediately and the mute time on
> the broker based on empty fetch response will often be zero (unlike the
> throttle time returned to the client based on non-empty response). Won't
> that lead to a tight loop of fetch requests from old clients (particularly
> expensive with SSL)? Wouldn't it be better to retain current behaviour for
> old clients? Also, if we change the behaviour for old clients, client
> metrics that track throttle time will report a lot of throttle unrelated to
>  actual throttle time.
>
For consumers, if quota is violated, the throttle time on the broker will
not be 0. It is just that the throttle time will not be increasing because
the consumer will return an empty response in this case. So there should
not be a tight loop.


> 2. KIP says: *The usual idle timeout i.e., connections.max.idle.ms
> <http://connections.max.idle.ms> will still be honored during the throttle
> time X. This makes sure that the brokers will detect client connection
> closure in a bounded time.*
> Wouldn't it be better to bound maximum throttle time to
> *connections.max.idle.ms
> <http://connections.max.idle.ms>*? If we mute for a time greater than
> *connections.max.idle.ms
> <http://connections.max.idle.ms>* and then close a client connection
> simply
> because we had muted it on the broker for a longer throttle time, we force
> a reconnection and read the next request from the new connection straight
> away. This feels the same as a bound on the quota of *
> connections.max.idle.ms
> <http://connections.max.idle.ms>*, but with added load on the broker for
> authenticating another connection (expensive with SSL).
>
I think we need to think about the consumer prior to and after this KIP
separately.

For consumer prior to this KIP, even if we unmute the channel after
connection.max.idle.ms, it is likely that the consumers have already
reached request.timeout.ms before that and has reconnected to the broker.
So there is no real difference whether we close the throttled channel or
not.

For consumers after the KIP, because they will honor the throttle time,
they will back off until throttle time is reached. If that throttle time is
longer than connection.max.idle.ms, it seems not a big overhead because
there will only be one connection re-establishment in quite a few minutes.
Compared with such overhead, it seems more important to honor the quota so
the broker can survive.


> 3. Are we changing the behaviour of network bandwidth quota for
> Produce/Fetch and retaining current behaviour for request quotas?
>
This is going to be applied to all the quota. Including request quotas.


>
> Thanks,
>
> Rajini
>
>
>
> On Wed, Jan 10, 2018 at 10:29 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Jiangjie,
> >
> > Thanks for the updated KIP. +1
> >
> > Jun
> >
> > On Mon, Jan 8, 2018 at 7:45 PM, Becket Qin <becket....@gmail.com> wrote:
> >
> > > Thanks for the comments, Jun.
> > >
> > > 1. Good point.
> > > 2. Also makes sense. Usually the connection.max.idle.ms is high enough
> > so
> > > the throttling is impacted.
> > >
> > > I have updated the KIP to reflect the changes.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > > On Mon, Jan 8, 2018 at 6:30 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Hi, Jiangjie,
> > > >
> > > > Sorry for the late response. The proposal sounds good overall. A
> couple
> > > of
> > > > minor comments below.
> > > >
> > > > 1. For throttling a fetch request, we could potentially just send an
> > > empty
> > > > response. We can return a throttle time calculated from a full
> > response,
> > > > but only mute the channel on the server based on a throttle time
> > > calculated
> > > > based on the empty response. This has the benefit that the server
> will
> > > mute
> > > > the channel much shorter, which will prevent the consumer from
> > > rebalancing
> > > > when throttled.
> > > >
> > > > 2. The wiki says "connections.max.idle.ms should be ignored during
> the
> > > > throttle time X." This has the potential issue that a server may not
> > > detect
> > > > that a client connection is already gone until after an arbitrary
> > amount
> > > of
> > > > time. Perhaps we could still just close a connection if the server
> has
> > > > muted it for longer than connections.max.idle.ms. This will at least
> > > bound
> > > > the time for a server to detect closed client connections.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Mon, Nov 20, 2017 at 5:30 PM, Becket Qin <becket....@gmail.com>
> > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > We would like to start the voting thread for KIP-219. The KIP
> > proposes
> > > to
> > > > > improve the quota communication between the brokers and clients,
> > > > especially
> > > > > for cases of long throttling time.
> > > > >
> > > > > The KIP wiki is following:
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 219+-+Improve+quota+
> > > > > communication
> > > > >
> > > > > The discussion thread is here:
> > > > > http://markmail.org/search/?q=kafka+KIP-219#query:kafka%
> > > > > 20KIP-219+page:1+mid:ooxabguy7nz7l7zy+state:results
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > >
> > >
> >
>

Reply via email to