I think separating leave and join makes sense. The scenario I can think of
for delaying a rebalance on LeaveGroupRequest is a rolling bounce of a
service. But that scenario could be tricky because there may be a mixture of
joins and leaves. What happens if a consumer leaves the group right after
another consumer joins it? Which delay should be applied?

Jason, if I understand correctly, the actual delay of the FIRST rebalance
for each group could be anywhere between group.initial.rebalance.delay.ms and
the rebalance timeout, depending on how many times the delay is extended.
For example, suppose the delay is set to 3 seconds and the rebalance timeout
is set to 10 seconds. If a consumer joins the group at time T, the target
rebalance point would be T+3 if no other consumer joins. If another consumer
joins the group at T+2, the target rebalance point would become T+5, and so
on. However, no matter how many times the delay is extended, the rebalance
will kick off at T+10, even if a new consumer joined the group at T+9.
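
To make the arithmetic above concrete, here is a minimal sketch in Python.
It only models the behaviour as described in this email, not the broker's
actual implementation; the function name and its parameters are hypothetical
and all times are offsets in seconds from the first join (time T).

```python
def first_rebalance_time(join_times, delay, rebalance_timeout):
    """Return the offset (seconds after T) at which the first rebalance fires.

    join_times: sorted join offsets; the first entry must be 0 (time T).
    delay: stands in for group.initial.rebalance.delay.ms, in seconds.
    rebalance_timeout: hard cap on the total delay, in seconds.
    All names here are illustrative assumptions, not Kafka APIs.
    """
    if not join_times or join_times[0] != 0:
        raise ValueError("join_times must start with the first join at 0")

    # The first join schedules the rebalance one delay later (capped).
    target = min(delay, rebalance_timeout)
    for t in join_times[1:]:
        if t >= target:
            # The rebalance has already started; this join is too late
            # to extend the initial delay.
            break
        # Each join inside the window pushes the target out by one delay,
        # but never past the rebalance timeout.
        target = min(t + delay, rebalance_timeout)
    return target
```

With delay=3 and rebalance_timeout=10, a single join at T gives T+3, joins
at T and T+2 give T+5, and a steady stream of joins at T, T+2, T+4, T+6,
T+8 and T+9 still rebalances at T+10.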

I also agree that we should set the default delay to some meaningful value
instead of setting it to 0.

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 28, 2017 at 12:32 PM, Jason Gustafson <ja...@confluent.io>
wrote:

> Hey Damian,
>
> Thanks for the KIP. I think the proposal makes sense as a workaround maybe
> for some advanced users. However, I'm not sure we can depend on average
> users knowing that the config exists, much less setting it to something
> that makes sense. It's kind of a trend in streams that I'm not too thrilled
> about to try and control these rebalances through careful tuning of various
> timeouts. For example, the patch to avoid sending LeaveGroup depends on the
> session timeout being set at least as long as the time for an average
> rolling restart. If the expectation is that these settings are only needed
> for advanced users, it may be sufficient, but if the problems are affecting
> average users, it seems less than ideal. That said, if we can get some real
> benefit from low-hanging fruit like this, then it's probably worthwhile.
>
> This relates to the choice of default value, by the way. If we use 0 as the
> default, my guess is that most users won't change it and the benefit could
> be marginal. The choice of 3 seconds that you've documented seems fine to
> me. It matches the default consumer heartbeat interval, which controls
> typical rebalance latency, so there's some consistency there.
>
> Also, one minor comment: I guess the actual delay for each group will be
> the minimum of the group's rebalance timeout and
> group.initial.rebalance.delay.ms. Is that right?
>
> -Jason
>
> On Tue, Mar 28, 2017 at 8:29 AM, Damian Guy <damian....@gmail.com> wrote:
>
> > @Ismael - yeah sure we can reduce the default, though i'm not sure what
> > the "right" default would be.
> >
> > On Tue, 28 Mar 2017 at 15:40 Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Is 3 seconds the right default if the timer gets reset after each
> > > consumer joins? Maybe we can lower the default value given the new
> > > approach.
> > >
> > > Ismael
> > >
> > > On Tue, Mar 28, 2017 at 9:53 AM, Damian Guy <damian....@gmail.com>
> > > wrote:
> > >
> > > > All,
> > > > I'd like to get this back to the original discussion about Delaying
> > > > initial consumer group rebalance.
> > > > I think i'm leaning towards sticking with the broker config and
> > > > changing the delay so that the timer starts again when a new consumer
> > > > joins the group. What are people's thoughts on that?
> > > >
> > > > Doing something similar on leave is valid, but i'd prefer to consider
> > > > it separately from this.
> > > >
> > > > Thanks,
> > > > Damian
> > > >
> > > > On Tue, 28 Mar 2017 at 09:48 Damian Guy <damian....@gmail.com>
> > > > wrote:
> > > >
> > > > > Matthias,
> > > > >
> > > > > Yes i know.
> > > > >
> > > > > Thanks,
> > > > > Damian
> > > > >
> > > > > On Mon, 27 Mar 2017 at 18:17 Matthias J. Sax <matth...@confluent.io>
> > > > > wrote:
> > > > >
> > > > > Damian,
> > > > >
> > > > > about "rebalance immediately" on timeout -- I guess, that's a
> > > > > different case as no LeaveGroupRequest will be sent. Thus, the
> > > > > broker should be able to distinguish both cases easily, and apply
> > > > > the delay only if it received the LeaveGroupRequest but not if a
> > > > > consumer times out.
> > > > >
> > > > > Does this make sense?
> > > > >
> > > > > -Matthias
> > > > >
> > > > > On 3/27/17 1:56 AM, Damian Guy wrote:
> > > > > > @Becket
> > > > > >
> > > > > > Thanks for the feedback. Yes, i like the idea of extending the
> > > > > > delay as each new consumer joins the group. Though, i think this
> > > > > > could be done with either a consumer or broker side config. But i
> > > > > > get your point that some consumers in the group can be
> > > > > > misconfigured.
> > > > > >
> > > > > > @Matthias & @Eno - yes we could probably do something similar if
> > > > > > the member has sent the LeaveGroupRequest. I'm not sure it would
> > > > > > be valid if the member crashed, hence session.timeout would come
> > > > > > into play, we'd probably want to rebalance immediately. I'd be
> > > > > > interested in hearing thoughts from other core kafka folks on this
> > > > > > one.
> > > > > >
> > > > > > Thanks,
> > > > > > Damian
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, 24 Mar 2017 at 23:01 Becket Qin <becket....@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi Matthias,
> > > > > >>
> > > > > >> Yes, that was what I was thinking. We will keep delaying it until
> > > > > >> either the rebalance timeout is reached or no new consumer joins
> > > > > >> within that small delay, which is configured on the broker side.
> > > > > >>
> > > > > >> Thanks,
> > > > > >>
> > > > > >> Jiangjie (Becket) Qin
> > > > > >>
> > > > > >> On Fri, Mar 24, 2017 at 1:39 PM, Matthias J. Sax
> > > > > >> <matth...@confluent.io> wrote:
> > > > > >>
> > > > > >>> @Becket:
> > > > > >>>
> > > > > >>> I am not sure if I understand this correctly. Instead of applying
> > > > > >>> a fixed delay that starts when the first consumer of an (empty)
> > > > > >>> group joins, you suggest to re-trigger/re-set the delay each time
> > > > > >>> a new consumer joins?
> > > > > >>>
> > > > > >>> This sounds like a good strategy to me, if the config is on the
> > > > > >>> broker side.
> > > > > >>>
> > > > > >>> @Eno:
> > > > > >>>
> > > > > >>> I think that's a valid point and I like this idea!
> > > > > >>>
> > > > > >>>
> > > > > >>> -Matthias
> > > > > >>>
> > > > > >>>
> > > > > >>> On 3/24/17 1:23 PM, Eno Thereska wrote:
> > > > > >>>> Thanks Damian,
> > > > > >>>>
> > > > > >>>> This KIP deals with the initial phase only. What about the cases
> > > > > >>>> when several consumers leave a group? Won't there be several
> > > > > >>>> expensive rebalances then as well? I'm wondering if it makes
> > > > > >>>> sense for the delay to hold anytime the "set" of consumers in a
> > > > > >>>> group changes, be it addition to the group or removal from group.
> > > > > >>>>
> > > > > >>>> Thanks
> > > > > >>>> Eno
> > > > > >>>>
> > > > > >>>>
> > > > > >>>>> On 24 Mar 2017, at 20:04, Becket Qin <becket....@gmail.com>
> > > > > >>>>> wrote:
> > > > > >>>>>
> > > > > >>>>> Thanks for the KIP, Damian.
> > > > > >>>>>
> > > > > >>>>> My two cents on this. It seems there are two things worth
> > > > > >>>>> thinking about here:
> > > > > >>>>>
> > > > > >>>>> 1. Better rebalance timing. We will try to rebalance only when
> > > > > >>>>> all the consumers in a group have joined. The challenge is that
> > > > > >>>>> someone has to define what ALL consumers means; it could be
> > > > > >>>>> either a time or a number of consumers, etc.
> > > > > >>>>>
> > > > > >>>>> 2. Avoid frequent rebalances. For example, if there are 100
> > > > > >>>>> consumers in a group, today, in the worst case, we may end up
> > > > > >>>>> with 100 rebalances even if all the consumers joined the group
> > > > > >>>>> in a reasonably small amount of time. Frequent rebalances are
> > > > > >>>>> also a bad thing for brokers.
> > > > > >>>>>
> > > > > >>>>> Having a client side configuration may solve problem 1 better
> > > > > >>>>> because each consumer group can potentially configure its own
> > > > > >>>>> timing. However, it does not really prevent frequent rebalances
> > > > > >>>>> in general because some of the consumers can be misconfigured.
> > > > > >>>>> (This may have something to do with KIP-124 as well. But if a
> > > > > >>>>> quota is applied on the JoinGroup/SyncGroup request it may
> > > > > >>>>> cause some unwanted cascading effects.)
> > > > > >>>>>
> > > > > >>>>> Having a broker side configuration may result in less
> > > > > >>>>> flexibility for each consumer group, but it can prevent frequent
> > > > > >>>>> rebalances better. I think with some reasonable design, the
> > > > > >>>>> rebalance timing issue can be resolved on the broker side as
> > > > > >>>>> well. Matthias had a good point on extending the delay when a
> > > > > >>>>> new consumer joins a group (we actually did something similar to
> > > > > >>>>> batch ISR change propagation). For example, let's say on the
> > > > > >>>>> broker side, we will always delay 2 seconds each time we see a
> > > > > >>>>> new consumer joining a consumer group. This would probably work
> > > > > >>>>> for most of the consumer groups and will also limit the
> > > > > >>>>> rebalance frequency to protect the brokers.
> > > > > >>>>>
> > > > > >>>>> I am not sure about the streams use case here, but if something
> > > > > >>>>> like 2 seconds of delay is acceptable for streams, I would
> > > > > >>>>> prefer adding the configuration to the broker so that we can
> > > > > >>>>> address both problems.
> > > > > >>>>>
> > > > > >>>>> Thanks,
> > > > > >>>>>
> > > > > >>>>> Jiangjie (Becket) Qin
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> On Fri, Mar 24, 2017 at 5:30 AM, Damian Guy
> > > > > >>>>> <damian....@gmail.com> wrote:
> > > > > >>>>>
> > > > > >>>>>> Thanks for the feedback.
> > > > > >>>>>>
> > > > > >>>>>> Ewen: I'm happy to make it a client side config. Other than
> > > > > >>>>>> the protocol bump i think the effort is almost the same.
> > > > > >>>>>> Personally i see no other issues, but based on discussions
> > > > > >>>>>> with others this is what we came up with.
> > > > > >>>>>>
> > > > > >>>>>> True, it can probably be tested easily via an integration
> > > > > >>>>>> test.
> > > > > >>>>>>
> > > > > >>>>>> Matthias: Yes i agree, the delay could be extended as each new
> > > > > >>>>>> member joins the group.
> > > > > >>>>>>
> > > > > >>>>>> Thanks,
> > > > > >>>>>> Damian
> > > > > >>>>>>
> > > > > >>>>>> On Fri, 24 Mar 2017 at 05:14 Ewen Cheslack-Postava
> > > > > >>>>>> <e...@confluent.io> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> I have the same initial response as Ismael re: broker vs
> > > > > >>>>>>> consumer settings. The global setting seems questionable.
> > > > > >>>>>>>
> > > > > >>>>>>> Could we maybe summarize what the impact of making this a
> > > > > >>>>>>> client config would be? Protocol bump is obvious, but is there
> > > > > >>>>>>> any other significant issue? For the protocol bump in
> > > > > >>>>>>> particular, I think this change is currently really critical
> > > > > >>>>>>> for streams; it will be valuable elsewhere, but the immediate
> > > > > >>>>>>> demand is streams, so a protocol bump while being backwards
> > > > > >>>>>>> compatible wouldn't affect any other clients. Is this still
> > > > > >>>>>>> actually compatible with different clients given that they
> > > > > >>>>>>> would now expect different timeouts? (I think it's strictly
> > > > > >>>>>>> compatible if you wait for responses, but if you enforce any
> > > > > >>>>>>> client side timeouts, I'm not so sure.)
> > > > > >>>>>>>
> > > > > >>>>>>> re: test plan, I'm sure this will come as a surprise, but is
> > > > > >>>>>>> the system test even necessary? Validating # of rebalances
> > > > > >>>>>>> seems messy as other things can cause rebalances (though
> > > > > >>>>>>> admittedly not in a "clean" case). But really it seems like an
> > > > > >>>>>>> integration test could validate this by making sure only 1
> > > > > >>>>>>> rebalance occurred when 2 members joined with a sufficient
> > > > > >>>>>>> time gap.
> > > > > >>>>>>>
> > > > > >>>>>>> -Ewen
> > > > > >>>>>>>
> > > > > >>>>>>> On Thu, Mar 23, 2017 at 3:53 PM, Matthias J. Sax
> > > > > >>>>>>> <matth...@confluent.io> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>>> Thanks for the KIP Damian!
> > > > > >>>>>>>>
> > > > > >>>>>>>> My two cents:
> > > > > >>>>>>>>
> > > > > >>>>>>>> - we should have an explicit parameter for this -- implicit
> > > > > >>>>>>>> settings are always tricky (the "importance" of this
> > > > > >>>>>>>> parameter would be LOW)
> > > > > >>>>>>>>
> > > > > >>>>>>>> - the config should be different for each consumer group:
> > > > > >>>>>>>>   * assume you have a stateless app, you want to rebalance
> > > > > >>>>>>>>     immediately
> > > > > >>>>>>>>   * if you start up in a virtualized environment using some
> > > > > >>>>>>>>     tools like Mesos you might need a different value than on
> > > > > >>>>>>>>     bare metal (no VM to be started)
> > > > > >>>>>>>>   * it also depends on how many consumer instances you expect
> > > > > >>>>>>>>     -- it's harder to start up 100 instances in 3 seconds
> > > > > >>>>>>>>     than 5
> > > > > >>>>>>>>
> > > > > >>>>>>>> - the default value should be zero
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> One more thought: what about scaling scenarios? If a consumer
> > > > > >>>>>>>> group has 10 instances and should be scaled up to 20, it
> > > > > >>>>>>>> would make sense to do this with a single rebalance, too.
> > > > > >>>>>>>> Thus, I am wondering if it would make sense to apply this
> > > > > >>>>>>>> delay each time a new consumer joins a group, even if the
> > > > > >>>>>>>> group is not empty?
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> -Matthias
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>> On 3/23/17 10:19 AM, Damian Guy wrote:
> > > > > >>>>>>>>> Thanks Guozhang - i think another problem with this is that
> > > > > >>>>>>>>> it is overloading session.timeout.ms to mean multiple
> > > > > >>>>>>>>> things. I'm not sure that is a good thing.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Thu, 23 Mar 2017 at 17:14 Guozhang Wang
> > > > > >>>>>>>>> <wangg...@gmail.com> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>>> The downside of it, though, is that although it "hides"
> > > > > >>>>>>>>>> this from most of the users needing to be aware of it, by
> > > > > >>>>>>>>>> default the session timeout, i.e. the rebalance timeout, is
> > > > > >>>>>>>>>> 10 seconds, which could arguably be too long.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Guozhang
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Thu, Mar 23, 2017 at 10:12 AM, Guozhang Wang
> > > > > >>>>>>>>>> <wangg...@gmail.com> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>> Just throwing another alternative idea here: we can
> > > > > >>>>>>>>>>> consider using the rebalance timeout value which is
> > > > > >>>>>>>>>>> already included in the join request protocol (and on the
> > > > > >>>>>>>>>>> current Java client it is always written as the session
> > > > > >>>>>>>>>>> timeout value), that the first member joining will always
> > > > > >>>>>>>>>>> force the coordinator to wait that long. By doing this we
> > > > > >>>>>>>>>>> do not need to bump up the protocol either.
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Guozhang
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> On Thu, Mar 23, 2017 at 5:49 AM, Damian Guy
> > > > > >>>>>>>>>>> <damian....@gmail.com> wrote:
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> Hi Ismael,
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Mostly to avoid the protocol bump.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I agree that it may be difficult to choose the right
> > > > > >>>>>>>>>>>> delay for all consumer groups, but we wanted to make this
> > > > > >>>>>>>>>>>> something that most users don't really need to think
> > > > > >>>>>>>>>>>> about, i.e., a small enough default delay that works in
> > > > > >>>>>>>>>>>> the majority of cases. However it would be much more
> > > > > >>>>>>>>>>>> flexible as a consumer config, which i'm happy to pursue
> > > > > >>>>>>>>>>>> if this change is worthy of a protocol bump.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>> Damian
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Thu, 23 Mar 2017 at 12:35 Ismael Juma
> > > > > >>>>>>>>>>>> <ism...@juma.me.uk> wrote:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Thanks for the KIP, Damian. It makes sense to avoid
> > > > > >>>>>>>>>>>>> multiple rebalances during start-up. One issue with
> > > > > >>>>>>>>>>>>> having this as a broker config is that it may be
> > > > > >>>>>>>>>>>>> difficult to choose the right delay for all consumer
> > > > > >>>>>>>>>>>>> groups. Can you elaborate a little more on why the
> > > > > >>>>>>>>>>>>> first alternative (add a consumer config) was rejected?
> > > > > >>>>>>>>>>>>> We bump protocol versions regularly (when it makes
> > > > > >>>>>>>>>>>>> sense), so it would be good to get a bit more detail.
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>>> Ismael
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> On Thu, Mar 23, 2017 at 12:24 PM, Damian Guy
> > > > > >>>>>>>>>>>>> <damian....@gmail.com> wrote:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Hi All,
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> I've prepared a KIP to add a configurable delay to the
> > > > > >>>>>>>>>>>>>> initial consumer group rebalance.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Please have a look here:
> > > > > >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>>>>> Damian
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>> BTW, i apologize if this appears twice. Seems the first
> > > > > >>>>>>>>>>>>>> one may have not made it.
> > > > > >>>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> --
> > > > > >>>>>>>>>>> -- Guozhang
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> --
> > > > > >>>>>>>>>> -- Guozhang
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
