Actually, rebalancing consists of two steps, which we named
PreparingRebalance and AwaitingSync (arguably not the best names). These two
steps together should be treated as one complete rebalance cycle.
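
To illustrate the idea, here is a minimal sketch of counting groups per state and treating the two steps together as a single rebalance. It is not the coordinator's actual code: the helper names (`count_by_state`, `num_rebalancing`) and the sample snapshot are hypothetical, and only the state names come from this thread.

```python
from collections import Counter
from enum import Enum


class GroupState(Enum):
    # State names as discussed in this thread.
    PREPARING_REBALANCE = "PreparingRebalance"
    AWAITING_SYNC = "AwaitingSync"
    STABLE = "Stable"
    DEAD = "Dead"
    EMPTY = "Empty"


# The two steps that together make up one rebalance cycle.
REBALANCE_STEPS = {GroupState.PREPARING_REBALANCE, GroupState.AWAITING_SYNC}


def count_by_state(group_states):
    """Return a count for every state, including states with zero groups."""
    counts = Counter(group_states)
    return {state: counts.get(state, 0) for state in GroupState}


def num_rebalancing(counts):
    """Number of groups anywhere in the rebalance cycle (either step)."""
    return sum(counts[state] for state in REBALANCE_STEPS)


# Hypothetical snapshot of five groups on one coordinator.
snapshot = [
    GroupState.STABLE,
    GroupState.STABLE,
    GroupState.PREPARING_REBALANCE,
    GroupState.AWAITING_SYNC,
    GroupState.EMPTY,
]
counts = count_by_state(snapshot)
print(num_rebalancing(counts))  # groups in either rebalance step
```

With one count per state, a combined "rebalancing" figure is just the sum of the two step counts, and everything else (Stable, Dead, Empty) is the total minus that sum.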


Guozhang

On Mon, Jul 24, 2017 at 10:46 AM, Colin McCabe <cmcc...@apache.org> wrote:

> Hi all,
>
> I think maybe it makes sense to rename the "PreparingRebalance" consumer
> group state to "Rebalancing."  To me, "Preparing" implies that there
> will be a later "rebalance" state that follows-- but there is not.
> Since we're now exposing this state name publicly in these metrics,
> perhaps it makes sense to do this rename now.  Thoughts?
>
> best,
> Colin
>
>
> On Fri, Jul 21, 2017, at 13:52, Colin McCabe wrote:
> > That's a good point.  I revised the KIP to add metrics for all the group
> > states.
> >
> > best,
> > Colin
> >
> >
> > On Fri, Jul 21, 2017, at 12:08, Guozhang Wang wrote:
> > > Ah, that's right Jason.
> > >
> > > > With that I can be convinced to add one metric per state.
> > >
> > > Guozhang
> > >
> > > On Fri, Jul 21, 2017 at 11:44 AM, Jason Gustafson <ja...@confluent.io>
> > > wrote:
> > >
> > > > >
> > > > > > "Dead" and "Empty" states are transient: groups usually only stay in
> > > > > > these states for a short while before being deleted or moved to
> > > > > > other states.
> > > >
> > > >
> > > > > This is not strictly true for the "Empty" state which we also use to
> > > > > represent simple groups which only use the coordinator to store
> > > > > offsets. I think we may as well cover all the states if we're going
> > > > > to cover any of them specifically.
> > > >
> > > > -Jason
> > > >
> > > >
> > > >
> > > > > On Fri, Jul 21, 2017 at 9:45 AM, Guozhang Wang <wangg...@gmail.com>
> > > > > wrote:
> > > >
> > > > > My two cents:
> > > > >
> > > > > > "Dead" and "Empty" states are transient: groups usually only stay in
> > > > > > these states for a short while before being deleted or moved to
> > > > > > other states.
> > > > >
> > > > > > Since we have the existing "NumGroups" metric, `NumGroups -
> > > > > > NumGroupsRebalancing - NumGroupsAwaitingSync` should cover the
> > > > > > remaining three states ("Stable", "Dead" and "Empty"), where "Stable"
> > > > > > should contribute most of the count. If we have a bug that causes
> > > > > > the number of Dead / Empty groups to keep increasing, then we would
> > > > > > observe `NumGroups` keep increasing, which should be sufficient for
> > > > > > alerting. Troubleshooting the issue could then rely on the log4j
> > > > > > logs.
> > > > >
> > > > > *Guozhang*
> > > > >
> > > > > > On Fri, Jul 21, 2017 at 7:19 AM, Ismael Juma <ism...@juma.me.uk>
> > > > > > wrote:
> > > > >
> > > > > > > Thanks for the KIP, Colin. This will definitely be useful. One
> > > > > > > question: would it be useful to have a metric for the number of
> > > > > > > groups in each possible state? The KIP suggests
> > > > > > > "PreparingRebalance" and "AwaitingSync". That leaves "Stable",
> > > > > > > "Dead" and "Empty". Are those not useful?
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > > On Thu, Jul 20, 2017 at 6:52 PM, Colin McCabe <cmcc...@apache.org>
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > > I posted "KIP-180: Add a broker metric specifying the number of
> > > > > > > > consumer group rebalances in progress" for discussion:
> > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-180%3A+Add+a+broker+metric+specifying+the+number+of+consumer+group+rebalances+in+progress
> > > > > > >
> > > > > > > Check it out.
> > > > > > >
> > > > > > > cheers,
> > > > > > > Colin
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
>



-- 
-- Guozhang
