Re: [Discuss] KIP-389: Enforce group.max.size to cap member metadata growth

Stanislav Kozlovski Sun, 30 Dec 2018 12:51:51 -0800

Thanks Boyang,

If there aren't any more thoughts on the KIP, I'll start a vote thread in
the new year.


On Sat, Dec 29, 2018 at 12:58 AM Boyang Chen <bche...@outlook.com> wrote:

> Yep Stanislav, that's what I'm proposing, and your explanation makes sense.
>
> Boyang
>
> ________________________________
> From: Stanislav Kozlovski <stanis...@confluent.io>
> Sent: Friday, December 28, 2018 7:59 PM
> To: dev@kafka.apache.org
> Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> metadata growth
>
> Hey there everybody, let's work on wrapping this discussion up.
>
> @Boyang, could you clarify what you mean by
> > One more question is whether you feel we should enforce group size cap
> > statically or on runtime?
> Is that related to the option of enabling this config via the dynamic
> broker config feature?
>
> Regarding that - I feel it's useful to have and I also think it might not
> introduce additional complexity. As long as we handle the config being
> changed midway through a rebalance (by using the old value) we should be
> good to go.
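
A minimal sketch of the "use the old value" idea above, written in Python rather than taken from the broker code: the cap is captured once when a rebalance starts, so a dynamic config change only takes effect on the next rebalance. `DynamicConfig`, `Group` and their methods are hypothetical names used for illustration, not Kafka classes.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DynamicConfig:
    """Hypothetical stand-in for a broker setting that can change at runtime."""
    group_max_size: int


@dataclass
class Group:
    """Hypothetical group state; not a Kafka class."""
    members: Set[str] = field(default_factory=set)
    # Cap captured when the current rebalance began; None when idle.
    cap_for_rebalance: Optional[int] = None

    def begin_rebalance(self, config: DynamicConfig) -> None:
        # Snapshot the cap so later config updates cannot affect this rebalance.
        self.cap_for_rebalance = config.group_max_size
        self.members = set()

    def try_join(self, member_id: str) -> bool:
        if self.cap_for_rebalance is None:
            raise RuntimeError("no rebalance in progress")
        if member_id in self.members:
            return True
        if len(self.members) >= self.cap_for_rebalance:
            return False
        self.members.add(member_id)
        return True

    def complete_rebalance(self) -> None:
        self.cap_for_rebalance = None
```

If the cap is lowered from 2 to 1 while a rebalance is in flight, a second member can still join under the snapshot; the lower cap applies once the next rebalance begins.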
>
> On Wed, Dec 12, 2018 at 4:12 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hey Jason,
> >
> > Yes, that is what I meant by
> > > Given those constraints, I think that we can simply mark the group as
> > > `PreparingRebalance` with a rebalanceTimeout of the server setting
> > > `group.max.session.timeout.ms`. That's a bit long by default (5 minutes)
> > > but I can't seem to come up with a better alternative
> > So either the timeout or all members calling joinGroup, yes.
> >
> >
> > On Tue, Dec 11, 2018 at 8:14 PM Boyang Chen <bche...@outlook.com> wrote:
> >
> >> Hey Jason,
> >>
> >> I think this is the correct understanding. One more question is whether
> >> you feel we should enforce group size cap statically or on runtime?
> >>
> >> Boyang
> >> ________________________________
> >> From: Jason Gustafson <ja...@confluent.io>
> >> Sent: Tuesday, December 11, 2018 3:24 AM
> >> To: dev
> >> Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> >> metadata growth
> >>
> >> Hey Stanislav,
> >>
> >> Just to clarify, I think what you're suggesting is something like this in
> >> order to gracefully shrink the group:
> >>
> >> 1. Transition the group to PREPARING_REBALANCE. No members are kicked out.
> >> 2. Continue to allow offset commits and heartbeats for all current
> >> members.
> >> 3. Allow the first n members that send JoinGroup to stay in the group, but
> >> wait for the JoinGroup (or session timeout) from all active members before
> >> finishing the rebalance.
> >>
> >> So basically we try to give the current members an opportunity to finish
> >> work, but we prevent some of them from rejoining after the rebalance
> >> completes. It sounds reasonable if I've understood correctly.
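
The three steps above can be sketched roughly as follows. This is an illustrative Python model, not coordinator code: `ShrinkingGroup` and its method names are hypothetical, joins from members outside the current group are ignored for brevity, and "first n" is taken to mean arrival order of JoinGroup requests.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ShrinkingGroup:
    """Hypothetical model of a group being shrunk to max_size members."""
    max_size: int
    current_members: Set[str]
    state: str = "STABLE"
    joined: List[str] = field(default_factory=list)  # JoinGroup arrival order
    timed_out: Set[str] = field(default_factory=set)

    def start_shrink(self) -> None:
        # Step 1: move to PREPARING_REBALANCE without removing anyone.
        self.state = "PREPARING_REBALANCE"

    def can_commit_or_heartbeat(self, member: str) -> bool:
        # Step 2: every current member may keep committing and heartbeating.
        return member in self.current_members

    def on_join_group(self, member: str) -> None:
        if member in self.current_members and member not in self.joined:
            self.joined.append(member)

    def on_session_timeout(self, member: str) -> None:
        self.timed_out.add(member)

    def try_complete(self) -> bool:
        # Step 3: wait for a JoinGroup or a session timeout from every member,
        # then keep only the first max_size members that rejoined.
        pending = self.current_members - set(self.joined) - self.timed_out
        if self.state != "PREPARING_REBALANCE" or pending:
            return False
        self.current_members = set(self.joined[: self.max_size])
        self.state = "STABLE"
        return True
```

With three members and a cap of two, the rebalance stays open until the third member has either rejoined or timed out, and only the two earliest rejoiners remain afterwards.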
> >>
> >> Thanks,
> >> Jason
> >>
> >>
> >>
> >> On Fri, Dec 7, 2018 at 6:47 AM Boyang Chen <bche...@outlook.com> wrote:
> >>
> >> > Yep, LGTM on my side. Thanks Stanislav!
> >> > ________________________________
> >> > From: Stanislav Kozlovski <stanis...@confluent.io>
> >> > Sent: Friday, December 7, 2018 8:51 PM
> >> > To: dev@kafka.apache.org
> >> > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> >> > metadata growth
> >> >
> >> > Hi,
> >> >
> >> > We discussed this offline with Boyang and figured that it's best to not
> >> > wait on the Cooperative Rebalancing proposal. Our thinking is that we can
> >> > just force a rebalance from the broker, allowing consumers to commit
> >> > offsets if their rebalanceListener is configured correctly.
> >> > When rebalancing improvements are implemented, we assume that they would
> >> > improve KIP-389's behavior as well as the normal rebalance scenarios.
> >> >
> >> > On Wed, Dec 5, 2018 at 12:09 PM Boyang Chen <bche...@outlook.com>
> >> wrote:
> >> >
> >> > > Hey Stanislav,
> >> > >
> >> > > thanks for the question! `Trivial rebalance` means "we don't start
> >> > > reassignment right now, but you need to know it's coming soon
> >> > > and you should start preparation".
> >> > >
> >> > > An example KStream use case is that before actually starting to shrink
> >> > > the consumer group, we need to
> >> > > 1. partition the consumer group into two subgroups, where one will be
> >> > > offline soon and the other will keep serving;
> >> > > 2. make sure the states associated with near-future offline consumers
> >> > > are successfully replicated on the serving ones.
> >> > >
> >> > > As I have mentioned, shrinking the consumer group is pretty much
> >> > > equivalent to group scaling down, so we could think of this
> >> > > as an add-on use case for cluster scaling. So my understanding is that
> >> > > the
> >> > > KIP-389 could be sequenced within our cooperative rebalancing<
> >> > >
> >> >
> >>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FIncremental%2BCooperative%2BRebalancing%253A%2BSupport%2Band%2BPolicies&amp;data=02%7C01%7C%7Cb603e099d6c744d8fac708d65ed51d03%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636800666735874264&amp;sdata=BX4DHEX1OMgfVuBOREwSjiITu5aV83Q7NAz77w4avVc%3D&amp;reserved=0
> >> > > >
> >> > > proposal.
> >> > >
> >> > > Let me know if this makes sense.
> >> > >
> >> > > Best,
> >> > > Boyang
> >> > > ________________________________
> >> > > From: Stanislav Kozlovski <stanis...@confluent.io>
> >> > > Sent: Wednesday, December 5, 2018 5:52 PM
> >> > > To: dev@kafka.apache.org
> >> > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> >> > > metadata growth
> >> > >
> >> > > Hey Boyang,
> >> > >
> >> > > I think we still need to take care of group shrinkage because even if
> >> > > users change the config value we cannot guarantee that all consumer
> >> > > groups would have been manually shrunk.
> >> > >
> >> > > Regarding 2., I agree that forcefully triggering a rebalance might be
> >> > > the most intuitive way to handle the situation.
> >> > > What does a "trivial rebalance" mean? Sorry, I'm not familiar with the
> >> > > term.
> >> > > I was thinking that maybe we could force a rebalance, which would cause
> >> > > consumers to commit their offsets (given their rebalanceListener is
> >> > > configured correctly) and subsequently reject some of the incoming
> >> > > `joinGroup` requests. Does that sound like it would work?
> >> > >
> >> > > On Wed, Dec 5, 2018 at 1:13 AM Boyang Chen <bche...@outlook.com>
> >> wrote:
> >> > >
> >> > > > Hey Stanislav,
> >> > > >
> >> > > > I read the latest KIP and saw that we already changed the default value
> >> > > > to -1. Do we still need to take care of the consumer group shrinking
> >> > > > when doing the upgrade?
> >> > > >
> >> > > > However, this is an interesting topic that is worth discussing. Although
> >> > > > a rolling upgrade is fine, `consumer.group.max.size` could always
> >> > > > conflict with the current consumer group size, which means we need to
> >> > > > adhere to one source of truth.
> >> > > >
> >> > > > 1. Choose the current group size, which means we never interrupt the
> >> > > > consumer group until it transitions to PREPARING_REBALANCE. And we keep
> >> > > > track of how many join group requests we have seen so far during
> >> > > > PREPARING_REBALANCE. After reaching the consumer cap, we start to inform
> >> > > > over-provisioned consumers that they should send LeaveGroupRequest and
> >> > > > fail themselves. Or with what Mayuresh proposed in KIP-345, we could mark
> >> > > > extra members as hot backup and rebalance without them.
> >> > > >
> >> > > > 2. Choose the `consumer.group.max.size`. I feel incremental rebalancing
> >> > > > (you proposed) could be of help here.
> >> > > > When a new cap is enforced, the leader should be notified. If the current
> >> > > > group size is already over the limit, the leader shall trigger a trivial
> >> > > > rebalance to shuffle some topic partitions and let a subset of consumers
> >> > > > prepare the ownership transition. Once they are ready, we trigger a real
> >> > > > rebalance to remove over-provisioned consumers. It is pretty much
> >> > > > equivalent to `how do we scale down the consumer group without
> >> > > > interrupting the current processing`.
> >> > > >
> >> > > > I personally feel inclined to 2 because we could kill two birds with
> >> > > > one stone in a generic way. What do you think?
> >> > > >
> >> > > > Boyang
> >> > > > ________________________________
> >> > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> >> > > > Sent: Monday, December 3, 2018 8:35 PM
> >> > > > To: dev@kafka.apache.org
> >> > > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap
> member
> >> > > > metadata growth
> >> > > >
> >> > > > Hi Jason,
> >> > > >
> >> > > > > 2. Do you think we should make this a dynamic config?
> >> > > > I'm not sure. Looking at the config from the perspective of a
> >> > > > prescriptive config, we may get away with not updating it dynamically.
> >> > > > But in my opinion, it always makes sense to have a config be dynamically
> >> > > > configurable. As long as we limit it to being a cluster-wide config, we
> >> > > > should be fine.
> >> > > >
> >> > > > > 1. I think it would be helpful to clarify the details on how the
> >> > > > > coordinator will shrink the group. It will need to choose which members
> >> > > > > to remove. Are we going to give current members an opportunity to commit
> >> > > > > offsets before kicking them from the group?
> >> > > >
> >> > > > This turns out to be somewhat tricky. I think that we may not be able to
> >> > > > guarantee that consumers don't process a message twice.
> >> > > > My initial approach was to do as much as we could to let consumers commit
> >> > > > offsets.
> >> > > >
> >> > > > I was thinking that when we mark a group to be shrunk, we could keep a map
> >> > > > of consumer_id->boolean indicating whether they have committed offsets. I
> >> > > > then thought we could delay the rebalance until every consumer commits (or
> >> > > > some time passes).
> >> > > > In the meantime, we would block all incoming fetch calls (by either
> >> > > > returning empty records or a retriable error) and we would continue to
> >> > > > accept offset commits (even twice for a single consumer).
> >> > > >
> >> > > > I see two problems with this approach:
> >> > > > * We have async offset commits, which implies that we can receive fetch
> >> > > > requests before the offset commit req has been handled, i.e. the consumer
> >> > > > sends fetchReq A, offsetCommit B, fetchReq C - we may receive A,C,B in the
> >> > > > broker. Meaning we could have saved the offsets for B but rebalance before
> >> > > > the offsetCommit for the offsets processed in C comes in.
> >> > > > * KIP-392 Allow consumers to fetch from closest replica
> >> > > > <
> >> > > >
> >> > >
> >> >
> >>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-392%253A%2BAllow%2Bconsumers%2Bto%2Bfetch%2Bfrom%2Bclosest%2Breplica&amp;data=02%7C01%7C%7Cb603e099d6c744d8fac708d65ed51d03%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636800666735874264&amp;sdata=bekXj%2FVdA6flZWQ70%2BSEyHm31%2F2WyWO1EpbvqyjWFJw%3D&amp;reserved=0
> >> > > > >
> >> > > > would make it significantly harder to block poll() calls on consumers
> >> > > > whose groups are being shrunk. Even if we implemented a solution, the same
> >> > > > race condition noted above seems to apply and probably others.
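
The A, C, B reordering from the first problem can be illustrated with a toy model. It is only a sketch under simplifying assumptions: one partition, offsets starting at 0, a consumer that processes every record it is handed, and a hypothetical `state_after` helper that replays broker-side events in arrival order.

```python
from typing import List, Tuple


def state_after(events: List[Tuple[str, int]]) -> Tuple[bool, List[int]]:
    """Replay events in the order the broker handled them.

    ("fetch", n)  -> the broker returned the next n records
    ("commit", o) -> the consumer committed o as its next offset to read

    Returns (has_committed, offsets a new owner would read again if the
    group were rebalanced right after the last event).
    """
    delivered_up_to = 0
    committed = 0
    has_committed = False
    for kind, value in events:
        if kind == "fetch":
            delivered_up_to += value
        elif kind == "commit":
            committed = max(committed, value)
            has_committed = True
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return has_committed, list(range(committed, delivered_up_to))


# fetchReq A, fetchReq C, then offsetCommit B (which only covers A).
flagged, replayed = state_after([("fetch", 10), ("fetch", 10), ("commit", 10)])
```

Here the consumer_id->boolean entry would already be true, yet offsets 10 through 19 have been handed out without a covering commit, so a rebalance at this point would have them processed twice.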
> >> > > >
> >> > > >
> >> > > > Given those constraints, I think that we can simply mark the group as
> >> > > > `PreparingRebalance` with a rebalanceTimeout of the server setting
> >> > > > `group.max.session.timeout.ms`. That's a bit long by default (5 minutes)
> >> > > > but I can't seem to come up with a better alternative
> >> > > >
> >> > > > I'm interested in hearing your thoughts.
> >> > > >
> >> > > > Thanks,
> >> > > > Stanislav
> >> > > >
> >> > > > On Fri, Nov 30, 2018 at 8:38 AM Jason Gustafson <ja...@confluent.io>
> >> > > > wrote:
> >> > > >
> >> > > > > Hey Stanislav,
> >> > > > >
> >> > > > > > What do you think about the use case I mentioned in my previous reply
> >> > > > > > about a more resilient self-service Kafka? I believe the benefit there
> >> > > > > > is bigger.
> >> > > > >
> >> > > > >
> >> > > > > I see this config as analogous to the open file limit. Probably this
> >> > > > > limit was intended to be prescriptive at some point about what was deemed
> >> > > > > a reasonable number of open files for an application. But mostly people
> >> > > > > treat it as an annoyance which they have to work around. If it happens to
> >> > > > > be hit, usually you just increase it because it is not tied to an actual
> >> > > > > resource constraint. However, occasionally hitting the limit does
> >> > > > > indicate an application bug such as a leak, so I wouldn't say it is
> >> > > > > useless. Similarly, the issue in KAFKA-7610 was a consumer leak and
> >> > > > > having this limit would have allowed the problem to be detected before it
> >> > > > > impacted the cluster. To me, that's the main benefit. It's possible that
> >> > > > > it could be used prescriptively to prevent poor usage of groups, but like
> >> > > > > the open file limit, I suspect administrators will just set it large
> >> > > > > enough that users are unlikely to complain.
> >> > > > >
> >> > > > > Anyway, just a couple additional questions:
> >> > > > >
> >> > > > > 1. I think it would be helpful to clarify the details on how the
> >> > > > > coordinator will shrink the group. It will need to choose which members
> >> > > > > to remove. Are we going to give current members an opportunity to commit
> >> > > > > offsets before kicking them from the group?
> >> > > > >
> >> > > > > 2. Do you think we should make this a dynamic config?
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Jason
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Nov 28, 2018 at 2:42 AM Stanislav Kozlovski <
> >> > > > > stanis...@confluent.io>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi Jason,
> >> > > > > >
> >> > > > > > You raise some very valid points.
> >> > > > > >
> >> > > > > > > The benefit of this KIP is probably limited to preventing "runaway"
> >> > > > > > > consumer groups due to leaks or some other application bug
> >> > > > > > What do you think about the use case I mentioned in my previous reply
> >> > > > > > about a more resilient self-service Kafka? I believe the benefit there
> >> > > > > > is bigger.
> >> > > > > >
> >> > > > > > * Default value
> >> > > > > > You're right, we probably do need to be conservative. Big consumer groups
> >> > > > > > are considered an anti-pattern and my goal was to also hint at this
> >> > > > > > through the config's default. Regardless, it is better to not have the
> >> > > > > > potential to break applications with an upgrade.
> >> > > > > > Choosing between the default of something big like 5000 or an opt-in
> >> > > > > > option, I think we should go with the *disabled default option* (-1).
> >> > > > > > The only benefit we would get from a big default of 5000 is default
> >> > > > > > protection against buggy/malicious applications that hit the KAFKA-7610
> >> > > > > > issue.
> >> > > > > > While this KIP was spawned from that issue, I believe its value is
> >> > > > > > enabling the possibility of protection and helping move towards a more
> >> > > > > > self-service Kafka. I also think that a default value of 5000 might be
> >> > > > > > misleading to users and lead them to think that big consumer groups
> >> > > > > > (> 250) are a good thing.
> >> > > > > >
> >> > > > > > The good news is that KAFKA-7610 should be fully resolved and the
> >> > > > > > rebalance protocol should, in general, be more solid after the planned
> >> > > > > > improvements in KIP-345 and KIP-394.
> >> > > > > >
> >> > > > > > * Handling bigger groups during upgrade
> >> > > > > > I now see that we store the state of consumer groups in the log and why
> >> > > > > > a rebalance isn't expected during a rolling upgrade.
> >> > > > > > Since we're going with the default value of the max.size being disabled,
> >> > > > > > I believe we can afford to be more strict here.
> >> > > > > > During state reloading of a new Coordinator with a defined
> >> > > > > > group.max.size config, I believe we should *force* rebalances for groups
> >> > > > > > that exceed the configured size. Then, only some consumers will be able
> >> > > > > > to join and the max size invariant will be satisfied.
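
As a rough illustration of the reload behaviour described above, assuming -1 means the limit is disabled: the helper below picks out the groups that would be forced to rebalance after a coordinator loads its state. `groups_to_rebalance` and the dict-of-member-sets input are hypothetical, not the coordinator's actual data structures.

```python
from typing import Dict, List, Set

DISABLED = -1  # the opt-in default discussed in this thread


def groups_to_rebalance(loaded_groups: Dict[str, Set[str]],
                        group_max_size: int) -> List[str]:
    """Return ids of loaded groups whose size exceeds the configured cap."""
    if group_max_size == DISABLED:
        # Limit not enabled: no group is forced to rebalance.
        return []
    return sorted(
        group_id
        for group_id, members in loaded_groups.items()
        if len(members) > group_max_size
    )


loaded = {
    "small": {"a", "b"},
    "at-cap": {"a", "b", "c"},
    "too-big": {"a", "b", "c", "d"},
}
```

With a cap of 3 only "too-big" is selected; a group that sits exactly at the cap is left alone, and with the disabled default nothing is touched.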
> >> > > > > >
> >> > > > > > I updated the KIP with a migration plan, rejected alternatives and the
> >> > > > > > new default value.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Stanislav
> >> > > > > >
> >> > > > > > On Tue, Nov 27, 2018 at 5:25 PM Jason Gustafson <
> >> > ja...@confluent.io>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Hey Stanislav,
> >> > > > > > >
> >> > > > > > > Clients will then find that coordinator
> >> > > > > > > > and send `joinGroup` on it, effectively rebuilding the
> >> group,
> >> > > since
> >> > > > > the
> >> > > > > > > > cache of active consumers is not stored outside the
> >> > Coordinator's
> >> > > > > > memory.
> >> > > > > > > > (please do say if that is incorrect)
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > Groups do not typically rebalance after a coordinator
> change.
> >> You
> >> > > > could
> >> > > > > > > potentially force a rebalance if the group is too big and
> kick
> >> > out
> >> > > > the
> >> > > > > > > slowest members or something. A more graceful solution is
> >> > probably
> >> > > to
> >> > > > > > just
> >> > > > > > > accept the current size and prevent it from getting bigger.
> We
> >> > > could
> >> > > > > log
> >> > > > > > a
> >> > > > > > > warning potentially.
> >> > > > > > >
> >> > > > > > > My thinking is that we should abstract away from conserving
> >> > > resources
> >> > > > > and
> >> > > > > > > > focus on giving control to the broker. The issue that
> >> spawned
> >> > > this
> >> > > > > KIP
> >> > > > > > > was
> >> > > > > > > > a memory problem but I feel this change is useful in a
> more
> >> > > general
> >> > > > > > way.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > So you probably already know why I'm asking about this. For
> >> > > consumer
> >> > > > > > groups
> >> > > > > > > anyway, resource usage would typically be proportional to
> the
> >> > > number
> >> > > > of
> >> > > > > > > partitions that a group is reading from and not the number
> of
> >> > > > members.
> >> > > > > > For
> >> > > > > > > example, consider the memory use in the offsets cache. The
> >> > benefit
> >> > > of
> >> > > > > > this
> >> > > > > > > KIP is probably limited to preventing "runaway" consumer
> >> groups
> >> > due
> >> > > > to
> >> > > > > > > leaks or some other application bug. That still seems useful
> >> > > though.
> >> > > > > > >
> >> > > > > > > I completely agree with this and I *ask everybody to chime
> in
> >> > with
> >> > > > > > opinions
> >> > > > > > > > on a sensible default value*.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > I think we would have to be very conservative. The group
> >> protocol
> >> > > is
> >> > > > > > > generic in some sense, so there may be use cases we don't
> >> know of
> >> > > > where
> >> > > > > > > larger groups are reasonable. Probably we should make this
> an
> >> > > opt-in
> >> > > > > > > feature so that we do not risk breaking anyone's application
> >> > after
> >> > > an
> >> > > > > > > upgrade. Either that, or use a very high default like 5,000.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Jason
> >> > > > > > >
> >> > > > > > > On Tue, Nov 27, 2018 at 3:27 AM Stanislav Kozlovski <
> >> > > > > > > stanis...@confluent.io>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hey Jason and Boyang, those were important comments
> >> > > > > > > >
> >> > > > > > > > > One suggestion I have is that it would be helpful to put
> >> your
> >> > > > > > reasoning
> >> > > > > > > > on deciding the current default value. For example, in
> >> certain
> >> > > use
> >> > > > > > cases
> >> > > > > > > at
> >> > > > > > > > Pinterest we are very likely to have more consumers than
> 250
> >> > when
> >> > > > we
> >> > > > > > > > configure 8 stream instances with 32 threads.
> >> > > > > > > > > For the effectiveness of this KIP, we should encourage
> >> people
> >> > > to
> >> > > > > > > discuss
> >> > > > > > > > their opinions on the default setting and ideally reach a
> >> > > > consensus.
> >> > > > > > > >
> >> > > > > > > > I completely agree with this and I *ask everybody to chime
> >> in
> >> > > with
> >> > > > > > > opinions
> >> > > > > > > > on a sensible default value*.
> >> > > > > > > > My thought process was that in the current model
> rebalances
> >> in
> >> > > > large
> >> > > > > > > groups
> >> > > > > > > > are more costly. I imagine most use cases in most Kafka
> >> users
> >> > do
> >> > > > not
> >> > > > > > > > require more than 250 consumers.
> >> > > > > > > > Boyang, you say that you are "likely to have... when
> we..."
> >> -
> >> > do
> >> > > > you
> >> > > > > > have
> >> > > > > > > > systems running with so many consumers in a group or are
> you
> >> > > > planning
> >> > > > > > > to? I
> >> > > > > > > > guess what I'm asking is whether this has been tested in
> >> > > production
> >> > > > > > with
> >> > > > > > > > the current rebalance model (ignoring KIP-345)
> >> > > > > > > >
> >> > > > > > > > >  Can you clarify the compatibility impact here? What
> >> > > > > > > > > will happen to groups that are already larger than the
> max
> >> > > size?
> >> > > > > > > > This is a very important question.
> >> > > > > > > > From my current understanding, when a coordinator broker gets shut
> >> > > > > > > > down during a cluster rolling upgrade, a replica will take leadership of
> >> > > > > > > > the `__consumer_offsets` partition. Clients will then find that
> >> > > > > > > > coordinator and send `joinGroup` on it, effectively rebuilding the group,
> >> > > > > > > > since the cache of active consumers is not stored outside the
> >> > > > > > > > Coordinator's memory. (please do say if that is incorrect)
> >> > > > > > > > Then, I believe that working as if this is a new group is a reasonable
> >> > > > > > > > approach. Namely, fail joinGroups when the max.size is exceeded.
> >> > > > > > > > What do you guys think about this? (I'll update the KIP after we settle
> >> > > > > > > > on a solution)
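
To make "fail joinGroups when the max.size is exceeded" concrete with the numbers mentioned earlier in this message (8 stream instances with 32 threads against a cap of 250), here is a small Python sketch. It assumes one consumer per stream thread, and `admit_joins` is a hypothetical helper rather than anything in the coordinator.

```python
from typing import List, Tuple


def admit_joins(member_ids: List[str],
                group_max_size: int) -> Tuple[List[str], List[str]]:
    """Admit members in arrival order until the cap is hit; reject the rest."""
    admitted: List[str] = []
    rejected: List[str] = []
    for member_id in member_ids:
        if len(admitted) < group_max_size:
            admitted.append(member_id)
        else:
            rejected.append(member_id)
    return admitted, rejected


# Assumption: each of the 32 threads on each of the 8 instances is one consumer.
members = [f"instance-{i}-thread-{t}" for i in range(8) for t in range(32)]
admitted, rejected = admit_joins(members, group_max_size=250)
```

Under these assumptions 8 * 32 = 256 consumers try to join, so 250 are admitted and the last 6 to arrive have their joinGroup failed.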
> >> > > > > > > >
> >> > > > > > > > >  Also, just to be clear, the resource we are trying to
> >> > conserve
> >> > > > > here
> >> > > > > > is
> >> > > > > > > > what? Memory?
> >> > > > > > > > My thinking is that we should abstract away from
> conserving
> >> > > > resources
> >> > > > > > and
> >> > > > > > > > focus on giving control to the broker. The issue that
> >> spawned
> >> > > this
> >> > > > > KIP
> >> > > > > > > was
> >> > > > > > > > a memory problem but I feel this change is useful in a
> more
> >> > > general
> >> > > > > > way.
> >> > > > > > > It
> >> > > > > > > > limits the control clients have on the cluster and helps
> >> Kafka
> >> > > > > become a
> >> > > > > > > > more self-serving system. Admin/Ops teams can better
> control
> >> > the
> >> > > > > impact
> >> > > > > > > > application developers can have on a Kafka cluster with
> this
> >> > > change
> >> > > > > > > >
> >> > > > > > > > Best,
> >> > > > > > > > Stanislav
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Mon, Nov 26, 2018 at 8:00 PM Jason Gustafson <
> >> > > > ja...@confluent.io>
> >> > > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Stanislav,
> >> > > > > > > > >
> >> > > > > > > > > Thanks for the KIP. Can you clarify the compatibility
> >> impact
> >> > > > here?
> >> > > > > > What
> >> > > > > > > > > will happen to groups that are already larger than the
> max
> >> > > size?
> >> > > > > > Also,
> >> > > > > > > > just
> >> > > > > > > > > to be clear, the resource we are trying to conserve here
> >> is
> >> > > what?
> >> > > > > > > Memory?
> >> > > > > > > > >
> >> > > > > > > > > -Jason
> >> > > > > > > > >
> >> > > > > > > > > On Mon, Nov 26, 2018 at 2:44 AM Boyang Chen <
> >> > > bche...@outlook.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Thanks Stanislav for the update! One suggestion I have
> >> is
> >> > > that
> >> > > > it
> >> > > > > > > would
> >> > > > > > > > > be
> >> > > > > > > > > > helpful to put your
> >> > > > > > > > > >
> >> > > > > > > > > > reasoning on deciding the current default value. For
> >> > example,
> >> > > > in
> >> > > > > > > > certain
> >> > > > > > > > > > use cases at Pinterest we are very likely
> >> > > > > > > > > >
> >> > > > > > > > > > to have more consumers than 250 when we configure 8
> >> stream
> >> > > > > > instances
> >> > > > > > > > with
> >> > > > > > > > > > 32 threads.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > For the effectiveness of this KIP, we should encourage
> >> > people
> >> > > > to
> >> > > > > > > > discuss
> >> > > > > > > > > > their opinions on the default setting and ideally
> reach
> >> a
> >> > > > > > consensus.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Best,
> >> > > > > > > > > >
> >> > > > > > > > > > Boyang
> >> > > > > > > > > >
> >> > > > > > > > > > ________________________________
> >> > > > > > > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> >> > > > > > > > > > Sent: Monday, November 26, 2018 6:14 PM
> >> > > > > > > > > > To: dev@kafka.apache.org
> >> > > > > > > > > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size
> >> to
> >> > cap
> >> > > > > > member
> >> > > > > > > > > > metadata growth
> >> > > > > > > > > >
> >> > > > > > > > > > Hey everybody,
> >> > > > > > > > > >
> >> > > > > > > > > > It's been a week since this KIP and there has not been much discussion.
> >> > > > > > > > > > I assume that this is a straightforward change and I will open a voting
> >> > > > > > > > > > thread in the next couple of days if nobody has anything to suggest.
> >> > > > > > > > > >
> >> > > > > > > > > > Best,
> >> > > > > > > > > > Stanislav
> >> > > > > > > > > >
> >> > > > > > > > > > On Thu, Nov 22, 2018 at 12:56 PM Stanislav Kozlovski <
> >> > > > > > > > > > stanis...@confluent.io>
> >> > > > > > > > > > wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Greetings everybody,
> >> > > > > > > > > > >
> >> > > > > > > > > > > I have enriched the KIP a bit with a bigger
> Motivation
> >> > > > section
> >> > > > > > and
> >> > > > > > > > also
> >> > > > > > > > > > > renamed it.
> >> > > > > > > > > > > KIP:
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-389%253A%2BIntroduce%2Ba%2Bconfigurable%2Bconsumer%2Bgroup%2Bsize%2Blimit&amp;data=02%7C01%7C%7Cb603e099d6c744d8fac708d65ed51d03%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636800666735874264&amp;sdata=dLVLofL8NnQatVq6WEDukxfIorh7HeQR9TyyUifcAPo%3D&amp;reserved=0
> >> > > > > > > > > > >
> >> > > > > > > > > > > I'm looking forward to discussions around it.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Best,
> >> > > > > > > > > > > Stanislav
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, Nov 20, 2018 at 1:47 PM Stanislav Kozlovski
> <
> >> > > > > > > > > > > stanis...@confluent.io> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > >> Hey there everybody,
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> Thanks for the introduction Boyang. I appreciate
> the
> >> > > effort
> >> > > > > you
> >> > > > > > > are
> >> > > > > > > > > > >> putting into improving consumer behavior in Kafka.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> @Matt
> >> > > > > > > > > > >> I also believe the default value is high. In my opinion, we should aim
> >> > > > > > > > > > >> for a default cap around 250. This is because in the current model any
> >> > > > > > > > > > >> consumer rebalance is disruptive to every consumer. The bigger the
> >> > > > > > > > > > >> group, the longer this period of disruption.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> If you have such a large consumer group, chances are that your
> >> > > > > > > > > > >> client-side logic could be structured better and that you are not using
> >> > > > > > > > > > >> the high number of consumers to achieve high throughput.
> >> > > > > > > > > > >> 250 can still be considered a high upper bound; I believe in practice
> >> > > > > > > > > > >> users should aim to not go over 100 consumers per consumer group.
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> In regards to the cap being global/per-broker, I think that we should
> >> > > > > > > > > > >> consider whether we want it to be global or *per-topic*. For the time
> >> > > > > > > > > > >> being, I believe that having it per-topic with a global default might be
> >> > > > > > > > > > >> the best situation. Having it global only seems a bit restricting to me
> >> > > > > > > > > > >> and it never hurts to support more fine-grained configurability (given
> >> > > > > > > > > > >> it's the same config, not a new one being introduced).
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> On Tue, Nov 20, 2018 at 11:32 AM Boyang Chen <
> >> > > > > > bche...@outlook.com
> >> > > > > > > >
> >> > > > > > > > > > wrote:
> >> > > > > > > > > > >>
> >> > > > > > > > > > >>> Thanks Matt for the suggestion! I'm still open to
> >> any
> >> > > > > > suggestion
> >> > > > > > > to
> >> > > > > > > > > > >>> change the default value. Meanwhile I just want to
> >> > point
> >> > > > out
> >> > > > > > that
> >> > > > > > > > > this
> >> > > > > > > > > > >>> value is just a last line of defense, not a real
> >> > scenario
> >> > > > we
> >> > > > > > > would
> >> > > > > > > > > > expect.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> In the meantime, I discussed this with Stanislav and
> he
> >> > would
> >> > > > be
> >> > > > > > > > driving
> >> > > > > > > > > > the
> >> > > > > > > > > > >>> 389 effort from now on. Stanislav proposed the
> idea
> >> in
> >> > > the
> >> > > > > > first
> >> > > > > > > > > place
> >> > > > > > > > > > and
> >> > > > > > > > > > >>> had already come up with a draft design, while I will
> >> keep
> >> > > > > focusing
> >> > > > > > on
> >> > > > > > > > > > KIP-345
> >> > > > > > > > > > >>> effort to ensure solving the edge case described
> in
> >> the
> >> > > > JIRA<
> >> > > > > > > > > > >>>
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://issues.apache.org/jira/browse/KAFKA-7610
> >> > > > > > > > > > >.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Thank you Stanislav for making this happen!
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Boyang
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> ________________________________
> >> > > > > > > > > > >>> From: Matt Farmer <m...@frmr.me>
> >> > > > > > > > > > >>> Sent: Tuesday, November 20, 2018 10:24 AM
> >> > > > > > > > > > >>> To: dev@kafka.apache.org
> >> > > > > > > > > > >>> Subject: Re: [Discuss] KIP-389: Enforce
> >> group.max.size
> >> > to
> >> > > > cap
> >> > > > > > > > member
> >> > > > > > > > > > >>> metadata growth
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Thanks for the KIP.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Will this cap be a global cap across the entire
> >> cluster
> >> > > or
> >> > > > > per
> >> > > > > > > > > broker?
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Either way the default value seems a bit high to
> me,
> >> > but
> >> > > > that
> >> > > > > > > could
> >> > > > > > > > > > just
> >> > > > > > > > > > >>> be
> >> > > > > > > > > > >>> from my own usage patterns. I'd have probably
> >> started
> >> > > with
> >> > > > > 500
> >> > > > > > or
> >> > > > > > > > 1k
> >> > > > > > > > > > but
> >> > > > > > > > > > >>> could be easily convinced that's wrong.
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> Thanks,
> >> > > > > > > > > > >>> Matt
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> On Mon, Nov 19, 2018 at 8:51 PM Boyang Chen <
> >> > > > > > bche...@outlook.com
> >> > > > > > > >
> >> > > > > > > > > > wrote:
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>> > Hey folks,
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> > I would like to start a discussion on KIP-389:
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>>
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-389%3A+Enforce+group.max.size+to+cap+member+metadata+growth
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> > This is a pretty simple change to cap the
> consumer
> >> > > group
> >> > > > > size
> >> > > > > > > for
> >> > > > > > > > > > >>> broker
> >> > > > > > > > > > >>> > stability. Give me your valuable feedback when
> you
> >> > have
> >> > > > > time.
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>> > Thank you!
> >> > > > > > > > > > >>> >
> >> > > > > > > > > > >>>
> >> > > > > > > > > > >>
> >> > > > > > > > > > >>
> >> > > > > > > > > > >> --
> >> > > > > > > > > > >> Best,
> >> > > > > > > > > > >> Stanislav
> >> > > > > > > > > > >>
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > --
> >> > > > > > > > > > > Best,
> >> > > > > > > > > > > Stanislav
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > --
> >> > > > > > > > > > Best,
> >> > > > > > > > > > Stanislav
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > Best,
> >> > > > > > > > Stanislav
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Best,
> >> > > > > > Stanislav
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Best,
> >> > > > Stanislav
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Best,
> >> > > Stanislav
> >> > >
> >> >
> >> >
> >> > --
> >> > Best,
> >> > Stanislav
> >> >
> >>
> >
> >
> > --
> > Best,
> > Stanislav
> >
>
>
> --
> Best,
> Stanislav
>


-- 
Best,
Stanislav
