Thanks Matthias for bringing up this awesome proposal! I shall take a deeper look and make a comparison between the two proposals.

Meanwhile, for scale-down specifically in stateful streaming, we could introduce a new status called "learner": newly started hosts would first catch up with the progress of their assigned tasks before a rebalance is triggered, so we wouldn't see a sudden dip in progress. However, this idea is built on top of the success of KIP-345. A rough sketch of the idea follows below.
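To make the "learner" idea a bit more concrete, here is a minimal, purely illustrative sketch. The LEARNER/STABLE states, the lag threshold, and all names below are my own assumptions for this email and are not part of KIP-345 or the incremental cooperative rebalancing design. The point is simply that a newly started host is not counted as a full member (and therefore does not trigger a rebalance) until the restore lag of its assigned tasks drops below some threshold:

// Purely hypothetical sketch -- the LEARNER state and the lag threshold are
// illustrations for this email, not part of KIP-345.
public class LearnerCatchUpSketch {

    enum MemberState { LEARNER, STABLE }

    // assumed threshold: how far behind the changelog end offset a learner
    // may be before we consider it caught up
    static final long MAX_CATCH_UP_LAG = 1_000L;

    // A newly started host joins as LEARNER and restores the state of its
    // assigned tasks in the background; only once it is close enough to the
    // changelog end offset do we promote it and trigger the single rebalance
    // that hands the tasks over, so progress never dips sharply.
    static MemberState maybePromote(long changelogEndOffset, long restoredOffset) {
        long lag = changelogEndOffset - restoredOffset;
        return lag <= MAX_CATCH_UP_LAG ? MemberState.STABLE : MemberState.LEARNER;
    }

    public static void main(String[] args) {
        System.out.println(maybePromote(50_000L, 10_000L)); // LEARNER: still catching up
        System.out.println(maybePromote(50_000L, 49_500L)); // STABLE: ready, safe to rebalance
    }
}

Whether that check lives on the broker or on the client (the learner reporting "ready" in its join request) is a detail we can sort out later; the sketch is only meant to show the catch-up gate.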
________________________________
From: Matthias J. Sax <matth...@confluent.io>
Sent: Wednesday, November 7, 2018 7:02 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

Hey,

there was quite a pause on this KIP discussion and in the meantime, a new design for incremental cooperative rebalancing was suggested:

https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing%3A+Support+and+Policies

We should make sure that the proposal and this KIP align to each other.

Thoughts?

-Matthias

On 11/5/18 7:31 PM, Boyang Chen wrote:
> Hey Mike,
>
> thanks for the feedback, the two questions are very thoughtful!
>
>> 1) I am a little confused about the distinction for the leader. If the
>> consumer node that was assigned leader does a bounce (goes down and quickly
>> comes up) to update application code, will a rebalance be triggered? I do
>> not think a bounce of the leader should trigger a rebalance.
>
> For Q1 my intention was to minimize the change within one KIP, since the
> leader rejoining case could be addressed separately.
>
>> 2) The timeout for shrink up makes a lot of sense and allows to gracefully
>> increase the number of nodes in the cluster. I think we need to support
>> graceful shrink down as well. If I set the registration timeout to 5 minutes
>> to handle rolling restarts or intermittent failures without shuffling
>> state, I don't want to wait 5 minutes in order for the group to rebalance if
>> I am intentionally removing a node from the cluster. I am not sure the best
>> way to do this. One idea I had was adding the ability for a CLI or Admin
>> API to force a rebalance of the group. This would allow for an admin to
>> trigger the rebalance manually without waiting the entire registration
>> timeout on shrink down. What do you think?
>
> For 2) my understanding is that the scale-down case is better addressed by a
> CLI tool than by code logic, since only by human evaluation can we decide
> whether it is the "right timing" -- the time when all the scaling-down
> consumers are offline -- to kick in a rebalance. Unless we introduce another
> term on the coordinator which indicates the target consumer group size, the
> broker will find it hard to decide when to start a rebalance. So far I prefer
> to hold the implementation for that, but agree we could discuss whether we
> want to introduce an admin API in this KIP or a separate one.
>
> Thanks again for the proposed ideas!
>
> Boyang
>
> ________________________________
> From: Mike Freyberger <mike.freyber...@xandr.com>
> Sent: Monday, November 5, 2018 6:13 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id
>
> Boyang,
>
> Thanks for updating the KIP. It's shaping up well. Two things:
>
> 1) I am a little confused about the distinction for the leader. If the
> consumer node that was assigned leader does a bounce (goes down and quickly
> comes up) to update application code, will a rebalance be triggered? I do not
> think a bounce of the leader should trigger a rebalance.
>
> 2) The timeout for shrink up makes a lot of sense and allows to gracefully
> increase the number of nodes in the cluster. I think we need to support
> graceful shrink down as well. If I set the registration timeout to 5 minutes
> to handle rolling restarts or intermittent failures without shuffling state,
> I don't want to wait 5 minutes in order for the group to rebalance if I am
> intentionally removing a node from the cluster. I am not sure the best way to
> do this. One idea I had was adding the ability for a CLI or Admin API to
> force a rebalance of the group. This would allow for an admin to trigger the
> rebalance manually without waiting the entire registration timeout on shrink
> down. What do you think?
>
> Mike
>
> On 10/30/18, 1:55 AM, "Boyang Chen" <bche...@outlook.com> wrote:
>
>     Btw, I updated KIP 345 based on my understanding. Feel free to take
>     another look:
>
>     https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Introduce+static+membership+protocol+to+reduce+consumer+rebalances
>
> ________________________________
> From: Boyang Chen <bche...@outlook.com>
> Sent: Monday, October 29, 2018 12:34 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id
>
> Thanks everyone for the input on this thread! (Sorry it's been a while.) I
> feel that we are very close to the final solution.
>
> Hey Jason and Mike, I have two quick questions on the new features here:
>
>     1. So our proposal is that when we add a new static member into the
>     group (scale up), we will not trigger a rebalance until the "registration
>     timeout" (i.e. the member has been offline for too long)? What about the
>     leader's rejoin request -- I think we should still trigger a rebalance
>     when that happens, since the consumer group may have new topics to consume?
>     2. I'm not very clear on the scale-up scenario in static membership
>     here. Should we fall back to dynamic membership while adding/removing hosts
>     (by setting member.name = null), or do we still want to add instances with
>     `member.name` so that we eventually expand/shrink the static membership? I
>     personally feel the easier solution is to spin up new members and wait until
>     either the same "registration timeout" or a "scale up timeout" before
>     starting the rebalance. What do you think?
>
> Meanwhile I will go ahead to make changes to the KIP with our newly
> discussed items and details.
Really excited to see the design has become more > solid. > > Best, > Boyang > > ________________________________ > From: Jason Gustafson <ja...@confluent.io> > Sent: Saturday, August 25, 2018 6:04 AM > To: dev > Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by > specifying member id > > Hey Mike, > > Yeah, that's a good point. A long "registration timeout" may not be a > great > idea. Perhaps in practice you'd set it long enough to be able to detect a > failure and provision a new instance. Maybe on the order of 10 minutes is > more reasonable. > > In any case, it's probably a good idea to have an administrative way to > force deregistration. One option is to extend the DeleteGroups API with a > list of members names. > > -Jason > > > > On Fri, Aug 24, 2018 at 2:21 PM, Mike Freyberger > <mfreyber...@appnexus.com> > wrote: > > > Jason, > > > > Regarding step 4 in your proposal which suggests beginning a long timer > > (30 minutes) when a static member leaves the group, would there also be > the > > ability for an admin to force a static membership expiration? > > > > I’m thinking that during particular types of outages or upgrades users > > would want forcefully remove a static member from the group. > > > > So the user would shut the consumer down normally, which wouldn’t > trigger > > a rebalance. Then the user could use an admin CLI tool to force remove > that > > consumer from the group, so the TopicPartitions that were previously > owned > > by that consumer can be released. > > > > At a high level, we need consumer groups to gracefully handle > intermittent > > failures and permanent failures. Currently, the consumer group protocol > > handles permanent failures well, but does not handle intermittent > failures > > well (it creates unnecessary rebalances). I want to make sure the > overall > > solution here handles both intermittent failures and permanent failures, > > rather than sacrificing support for permanent failures in order to > provide > > support for intermittent failures. > > > > Mike > > > > Sent from my iPhone > > > > > On Aug 24, 2018, at 3:03 PM, Jason Gustafson <ja...@confluent.io> > wrote: > > > > > > Hey Guozhang, > > > > > > Responses below: > > > > > > Originally I was trying to kill more birds with one stone with > KIP-345, > > >> e.g. to fix the multi-rebalance issue on starting up / shutting down > a > > >> multi-instance client (mentioned as case 1)/2) in my early email), > and > > >> hence proposing to have a pure static-membership protocol. But > thinking > > >> twice about it I now feel it may be too ambitious and worth fixing in > > >> another KIP. > > > > > > > > > I was considering an extension to support pre-initialization of the > > static > > > members of the group, but I agree we should probably leave this > problem > > for > > > future work. > > > > > > 1. How this longish static member expiration timeout defined? Is it > via a > > >> broker, hence global config, or via a client config which can be > > >> communicated to broker via JoinGroupRequest? > > > > > > > > > I am not too sure. I tend to lean toward server-side configs because > they > > > are easier to evolve. If we have to add something to the protocol, > then > > > we'll be stuck with it forever. > > > > > > 2. 
Assuming that for static members, LEAVE_GROUP request will not > > trigger a > > >> rebalance immediately either, similar to session timeout, but only > the > > >> longer member expiration timeout, can we remove the internal " > > >> internal.leave.group.on.close" config, which is a quick walk-around > > then? > > > > > > > > > Yeah, I hope we can ultimately get rid of it, but we may need it for > > > compatibility with older brokers. A related question is what should be > > the > > > behavior of the consumer if `member.name` is provided but the broker > > does > > > not support it? We could either fail or silently downgrade to dynamic > > > membership. > > > > > > -Jason > > > > > > > > >> On Fri, Aug 24, 2018 at 11:44 AM, Guozhang Wang <wangg...@gmail.com> > > wrote: > > >> > > >> Hey Jason, > > >> > > >> I like your idea to simplify the upgrade protocol to allow co-exist > of > > >> static and dynamic members. Admittedly it may make the > coordinator-side > > >> logic a bit more complex, but I think it worth doing it. > > >> > > >> Originally I was trying to kill more birds with one stone with > KIP-345, > > >> e.g. to fix the multi-rebalance issue on starting up / shutting down > a > > >> multi-instance client (mentioned as case 1)/2) in my early email), > and > > >> hence proposing to have a pure static-membership protocol. But > thinking > > >> twice about it I now feel it may be too ambitious and worth fixing in > > >> another KIP. With that, I think what you've proposed here is a good > way > > to > > >> go for KIP-345 itself. > > >> > > >> Note there are a few details in your proposal we'd still need to > figure > > >> out: > > >> > > >> 1. How this longish static member expiration timeout defined? Is it > via > > a > > >> broker, hence global config, or via a client config which can be > > >> communicated to broker via JoinGroupRequest? > > >> > > >> 2. Assuming that for static members, LEAVE_GROUP request will not > > trigger a > > >> rebalance immediately either, similar to session timeout, but only > the > > >> longer member expiration timeout, can we remove the internal " > > >> internal.leave.group.on.close" config, which is a quick walk-around > > then? > > >> > > >> > > >> > > >> Guozhang > > >> > > >> > > >> On Fri, Aug 24, 2018 at 11:14 AM, Jason Gustafson > <ja...@confluent.io> > > >> wrote: > > >> > > >>> Hey All, > > >>> > > >>> Nice to see some solid progress on this. It sounds like one of the > > >>> complications is allowing static and dynamic registration to > coexist. > > I'm > > >>> wondering if we can do something like the following: > > >>> > > >>> 1. Statically registered members (those joining the group with a > > >> non-null ` > > >>> member.name`) maintain a session with the coordinator just like > > dynamic > > >>> members. > > >>> 2. If a session is active for a static member when a rebalance > begins, > > >> then > > >>> basically we'll keep the current behavior. The rebalance will await > the > > >>> static member joining the group. > > >>> 3. If a static member does not have an active session, then the > > >> coordinator > > >>> will not wait for it to join, but will still include it in the > > rebalance. > > >>> The coordinator will forward the cached subscription information to > the > > >>> leader and will cache the assignment after the rebalance completes. > > (Note > > >>> that we still have the generationId to fence offset commits from a > > static > > >>> zombie if the assignment changes.) > > >>> 4. 
When a static member leaves the group or has its session expire, > no > > >>> rebalance is triggered. Instead, we can begin a timer to expire the > > >> static > > >>> registration. This would be a longish timeout (like 30 minutes say). > > >>> > > >>> So basically static members participate in all rebalances regardless > > >>> whether they have an active session. In a given rebalance, some of > the > > >>> members may be static and some dynamic. The group leader can > > >> differentiate > > >>> the two based on the presence of the `member.name` (we have to add > > this > > >> to > > >>> the JoinGroupResponse). Generally speaking, we would choose leaders > > >>> preferentially from the active members that support the latest > > JoinGroup > > >>> protocol and are using static membership. If we have to choose a > leader > > >>> with an old version, however, it would see all members in the group > > >> (static > > >>> or dynamic) as dynamic members and perform the assignment as usual. > > >>> > > >>> Would that work? > > >>> > > >>> -Jason > > >>> > > >>> > > >>> On Thu, Aug 23, 2018 at 5:26 PM, Guozhang Wang <wangg...@gmail.com> > > >> wrote: > > >>> > > >>>> Hello Boyang, > > >>>> > > >>>> Thanks for the updated proposal, a few questions: > > >>>> > > >>>> 1. Where will "change-group-timeout" be communicated to the broker? > > >> Will > > >>>> that be a new field in the JoinGroupRequest, or are we going to > > >>> piggy-back > > >>>> on the existing session-timeout field (assuming that the original > > value > > >>>> will not be used anywhere in the static membership any more)? > > >>>> > > >>>> 2. "However, if the consumer takes longer than session timeout to > > >> return, > > >>>> we shall still trigger rebalance but it could still try to catch > > >>>> `change-group-timeout`.": what does this mean? I thought your > proposal > > >> is > > >>>> that for static memberships, the broker will NOT trigger rebalance > > even > > >>>> after session-timeout has been detected, but only that after > > >>>> change-group-timeout > > >>>> which is supposed to be longer than session-timeout to be defined? > > >>>> > > >>>> 3. "A join group request with member.name set will be treated as > > >>>> `static-membership` strategy", in this case, how would the switch > from > > >>>> dynamic to static happen, since whoever changed the member.name to > > >>>> not-null > > >>>> will be rejected, right? > > >>>> > > >>>> 4. "just erase the cached mapping, and wait for session timeout to > > >>> trigger > > >>>> rebalance should be sufficient." this is also a bit unclear to me: > who > > >>> will > > >>>> erase the cached mapping? Since it is on the broker-side I assume > that > > >>>> broker has to do it. Are you suggesting to use a new request for > it? > > >>>> > > >>>> 5. "Halfway switch": following 3) above, if your proposal is > basically > > >> to > > >>>> let "first join-request wins", and the strategy will stay as is > until > > >> all > > >>>> members are gone, then this will also not happen since whoever used > > >>>> different strategy as the first guy who sends join-group request > will > > >> be > > >>>> rejected right? > > >>>> > > >>>> > > >>>> Guozhang > > >>>> > > >>>> > > >>>> On Tue, Aug 21, 2018 at 9:28 AM, John Roesler <j...@confluent.io> > > >> wrote: > > >>>> > > >>>>> This sounds good to me! 
> > >>>>> > > >>>>> Thanks for the time you've spent on it, > > >>>>> -John > > >>>>> > > >>>>> On Tue, Aug 21, 2018 at 12:13 AM Boyang Chen <bche...@outlook.com> > > >>>> wrote: > > >>>>> > > >>>>>> Thanks Matthias for the input. Sorry I was busy recently and > > >> haven't > > >>>> got > > >>>>>> time to update this thread. To summarize what we come up so far, > > >> here > > >>>> is > > >>>>> a > > >>>>>> draft updated plan: > > >>>>>> > > >>>>>> > > >>>>>> Introduce a new config called `member.name` which is supposed to > > >> be > > >>>>>> provided uniquely by the consumer client. The broker will > maintain > > >> a > > >>>>> cache > > >>>>>> with [key:member.name, value:member.id]. A join group request > with > > >>>>>> member.name set will be treated as `static-membership` strategy, > > >> and > > >>>>> will > > >>>>>> reject any join group request without member.name. So this > > >>>> coordination > > >>>>>> change will be differentiated from the `dynamic-membership` > > >> protocol > > >>> we > > >>>>>> currently have. > > >>>>>> > > >>>>>> > > >>>>>> When handling static join group request: > > >>>>>> > > >>>>>> 1. The broker will check the membership to see whether this is > > >> a > > >>>> new > > >>>>>> member. If new, broker allocate a unique member id, cache the > > >> mapping > > >>>> and > > >>>>>> move to rebalance stage. > > >>>>>> 2. Following 1, if this is an existing member, broker will not > > >>>> change > > >>>>>> group state, and return its cached member.id and current > > >> assignment. > > >>>>>> (unless this is leader, we shall trigger rebalance) > > >>>>>> 3. Although Guozhang has mentioned we could rejoin with pair > > >>> member > > >>>>>> name and id, I think for join group request it is ok to leave > > >> member > > >>> id > > >>>>>> blank as member name is the unique identifier. In commit offset > > >>> request > > >>>>> we > > >>>>>> *must* have both. > > >>>>>> > > >>>>>> > > >>>>>> When handling commit offset request, if enabled with static > > >>> membership, > > >>>>>> each time the commit request must have both member.name and > > >>> member.id > > >>>> to > > >>>>>> be identified as a `certificated member`. If not, this means > there > > >>> are > > >>>>>> duplicate consumer members with same member name and the request > > >> will > > >>>> be > > >>>>>> rejected to guarantee consumption uniqueness. > > >>>>>> > > >>>>>> > > >>>>>> When rolling restart/shutting down gracefully, the client will > > >> send a > > >>>>>> leave group request (static membership mode). In static > membership, > > >>> we > > >>>>> will > > >>>>>> also define `change-group-timeout` to hold on rebalance provided > by > > >>>>> leader. > > >>>>>> So we will wait for all the members to rejoin the group and do > > >>> exactly > > >>>>> one > > >>>>>> rebalance since all members are expected to rejoin within > timeout. > > >> If > > >>>>>> consumer crashes, the join group request from the restarted > > >> consumer > > >>>> will > > >>>>>> be recognized as an existing member and be handled as above > > >> condition > > >>>> 1; > > >>>>>> However, if the consumer takes longer than session timeout to > > >> return, > > >>>> we > > >>>>>> shall still trigger rebalance but it could still try to catch > > >>>>>> `change-group-timeout`. If it failed to catch second timeout, its > > >>>> cached > > >>>>>> state on broker will be garbage collected and trigger a new > > >> rebalance > > >>>>> when > > >>>>>> it finally joins. 
> > >>>>>> > > >>>>>> > > >>>>>> And consider the switch between dynamic to static membership. > > >>>>>> > > >>>>>> 1. Dynamic to static: the first joiner shall revise the > > >> membership > > >>>> to > > >>>>>> static and wait for all the current members to restart, since > their > > >>>>>> membership is still dynamic. Here our assumption is that the > > >> restart > > >>>>>> process shouldn't take a long time, as long restart is breaking > the > > >>>>>> `rebalance timeout` in whatever membership protocol we are using. > > >>>> Before > > >>>>>> restart, all dynamic member join requests will be rejected. > > >>>>>> 2. Static to dynamic: this is more like a downgrade which > should > > >>> be > > >>>>>> smooth: just erase the cached mapping, and wait for session > timeout > > >>> to > > >>>>>> trigger rebalance should be sufficient. (Fallback to current > > >>> behavior) > > >>>>>> 3. Halfway switch: a corner case is like some clients keep > > >> dynamic > > >>>>>> membership while some keep static membership. This will cause the > > >>> group > > >>>>>> rebalance forever without progress because dynamic/static states > > >> are > > >>>>>> bouncing each other. This could guarantee that we will not make > the > > >>>>>> consumer group work in a wrong state by having half static and > half > > >>>>> dynamic. > > >>>>>> > > >>>>>> To guarantee correctness, we will also push the member name/id > pair > > >>> to > > >>>>>> _consumed_offsets topic (as Matthias pointed out) and upgrade the > > >> API > > >>>>>> version, these details will be further discussed back in the KIP. > > >>>>>> > > >>>>>> > > >>>>>> Are there any concern for this high level proposal? Just want to > > >>>>> reiterate > > >>>>>> on the core idea of the KIP: "If the broker recognize this > consumer > > >>> as > > >>>> an > > >>>>>> existing member, it shouldn't trigger rebalance". > > >>>>>> > > >>>>>> Thanks a lot for everyone's input! I feel this proposal is much > > >> more > > >>>>>> robust than previous one! > > >>>>>> > > >>>>>> > > >>>>>> Best, > > >>>>>> > > >>>>>> Boyang > > >>>>>> > > >>>>>> ________________________________ > > >>>>>> From: Matthias J. Sax <matth...@confluent.io> > > >>>>>> Sent: Friday, August 10, 2018 2:24 AM > > >>>>>> To: dev@kafka.apache.org > > >>>>>> Subject: Re: [DISCUSS] KIP-345: Reduce multiple consumer > rebalances > > >>> by > > >>>>>> specifying member id > > >>>>>> > > >>>>>> Hi, > > >>>>>> > > >>>>>> thanks for the detailed discussion. I learned a lot about > internals > > >>>> again > > >>>>>> :) > > >>>>>> > > >>>>>> I like the idea or a user config `member.name` and to keep ` > > >>> member.id` > > >>>>>> internal. Also agree with Guozhang, that reusing `client.id` > might > > >>> not > > >>>>>> be a good idea. > > >>>>>> > > >>>>>> To clarify the algorithm, each time we generate a new > `member.id`, > > >>> we > > >>>>>> also need to update the "group membership" information (ie, > mapping > > >>>>>> [member.id, Assignment]), right? Ie, the new `member.id` replaces > > >>> the > > >>>>>> old entry in the cache. > > >>>>>> > > >>>>>> I also think, we need to preserve the `member.name -> member.id` > > >>>> mapping > > >>>>>> in the `__consumer_offset` topic. The KIP should mention this > IMHO. > > >>>>>> > > >>>>>> For changing the default value of config `leave.group.on.close`. > I > > >>>> agree > > >>>>>> with John, that we should not change the default config, because > it > > >>>>>> would impact all consumer groups with dynamic assignment. 
> However, > > >> I > > >>>>>> think we can document, that if static assignment is used (ie, > > >>>>>> `member.name` is configured) we never send a LeaveGroupRequest > > >>>>>> regardless of the config. Note, that the config is internal, so > not > > >>>> sure > > >>>>>> how to document this in detail. We should not expose the internal > > >>>> config > > >>>>>> in the docs. > > >>>>>> > > >>>>>> About upgrading: why do we need have two rolling bounces and > encode > > >>>>>> "static" vs "dynamic" in the JoinGroupRequest? > > >>>>>> > > >>>>>> If we upgrade an existing consumer group from dynamic to static, > I > > >>>> don't > > >>>>>> see any reason why both should not work together and single > rolling > > >>>>>> bounce would not be sufficient? If we bounce the first consumer > and > > >>>>>> switch from dynamic to static, it sends a `member.name` and the > > >>> broker > > >>>>>> registers the [member.name, member.id] in the cache. Why would > > >> this > > >>>>>> interfere with all other consumer that use dynamic assignment? > > >>>>>> > > >>>>>> Also, Guozhang mentioned that for all other request, we need to > > >> check > > >>>> if > > >>>>>> the mapping [member.name, member.id] contains the send > `member.id` > > >>> -- > > >>>> I > > >>>>>> don't think this is necessary -- it seems to be sufficient to > check > > >>> the > > >>>>>> `member.id` from the [member.id, Assignment] mapping as be do > > >> today > > >>> -- > > >>>>>> thus, checking `member.id` does not require any change IMHO. > > >>>>>> > > >>>>>> > > >>>>>> -Matthias > > >>>>>> > > >>>>>> > > >>>>>>> On 8/7/18 7:13 PM, Guozhang Wang wrote: > > >>>>>>> @James > > >>>>>>> > > >>>>>>> What you described is true: the transition from dynamic to > static > > >>>>>>> memberships are not thought through yet. But I do not think it > is > > >>> an > > >>>>>>> impossible problem: note that we indeed moved the offset commit > > >>> from > > >>>> ZK > > >>>>>> to > > >>>>>>> kafka coordinator in 0.8.2 :) The migration plan is to first to > > >>>>>>> double-commits on both zk and coordinator, and then do a second > > >>> round > > >>>>> to > > >>>>>>> turn the zk off. > > >>>>>>> > > >>>>>>> So just to throw a wild idea here: also following a > > >>>> two-rolling-bounce > > >>>>>>> manner, in the JoinGroupRequest we can set the flag to "static" > > >>> while > > >>>>>> keep > > >>>>>>> the registry-id field empty still, in this case, the coordinator > > >>>> still > > >>>>>>> follows the logic of "dynamic", accepting the request while > > >>> allowing > > >>>>> the > > >>>>>>> protocol to be set to "static"; after the first rolling bounce, > > >> the > > >>>>> group > > >>>>>>> protocol is already "static", then a second rolling bounce is > > >>>> triggered > > >>>>>> and > > >>>>>>> this time we set the registry-id. > > >>>>>>> > > >>>>>>> > > >>>>>>> Guozhang > > >>>>>>> > > >>>>>>> On Tue, Aug 7, 2018 at 1:19 AM, James Cheng < > > >> wushuja...@gmail.com> > > >>>>>> wrote: > > >>>>>>> > > >>>>>>>> Guozhang, in a previous message, you proposed said this: > > >>>>>>>> > > >>>>>>>>> On Jul 30, 2018, at 3:56 PM, Guozhang Wang <wangg...@gmail.com > > >>> > > >>>>> wrote: > > >>>>>>>>> > > >>>>>>>>> 1. We bump up the JoinGroupRequest with additional fields: > > >>>>>>>>> > > >>>>>>>>> 1.a) a flag indicating "static" or "dynamic" membership > > >>> protocols. > > >>>>>>>>> 1.b) with "static" membership, we also add the pre-defined > > >>> member > > >>>>> id. 
> > >>>>>>>>> 1.c) with "static" membership, we also add an optional > > >>>>>>>>> "group-change-timeout" value. > > >>>>>>>>> > > >>>>>>>>> 2. On the broker side, we enforce only one of the two > protocols > > >>> for > > >>>>> all > > >>>>>>>>> group members: we accept the protocol on the first joined > > >> member > > >>> of > > >>>>> the > > >>>>>>>>> group, and if later joining members indicate a different > > >>> membership > > >>>>>>>>> protocol, we reject it. If the group-change-timeout value was > > >>>>> different > > >>>>>>>> to > > >>>>>>>>> the first joined member, we reject it as well. > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> What will happen if we have an already-deployed application > that > > >>>> wants > > >>>>>> to > > >>>>>>>> switch to using static membership? Let’s say there are 10 > > >>> instances > > >>>> of > > >>>>>> it. > > >>>>>>>> As the instances go through a rolling restart, they will switch > > >>> from > > >>>>>>>> dynamic membership (the default?) to static membership. As each > > >>> one > > >>>>>> leaves > > >>>>>>>> the group and restarts, they will be rejected from the group > > >>>> (because > > >>>>>> the > > >>>>>>>> group is currently using dynamic membership). The group will > > >>> shrink > > >>>>> down > > >>>>>>>> until there is 1 node handling all the traffic. After that one > > >>>>> restarts, > > >>>>>>>> the group will switch over to static membership. > > >>>>>>>> > > >>>>>>>> Is that right? That means that the transition plan from dynamic > > >> to > > >>>>>> static > > >>>>>>>> membership isn’t very smooth. > > >>>>>>>> > > >>>>>>>> I’m not really sure what can be done in this case. This reminds > > >> me > > >>>> of > > >>>>>> the > > >>>>>>>> transition plans that were discussed for moving from > > >>> zookeeper-based > > >>>>>>>> consumers to kafka-coordinator-based consumers. That was also > > >>> hard, > > >>>>> and > > >>>>>>>> ultimately we decided not to build that. > > >>>>>>>> > > >>>>>>>> -James > > >>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>> > > >>>>>> > > >>>>> > > >>>> > > >>>> > > >>>> > > >>>> -- > > >>>> -- Guozhang > > >>>> > > >>> > > >> > > >> > > >> > > >> -- > > >> -- Guozhang > > >> > > > >