My two cents:

"Dead" and "Empty" states are transient: a group usually stays in one of these
states only for a short while before being deleted or transitioning to another state.

Since we already have the "NumGroups" metric, `NumGroups -
NumGroupsRebalancing - NumGroupsAwaitingSync` should cover those three
states (Stable, Dead and Empty), with "Stable" contributing most of the
count. If a bug causes the number of Dead / Empty groups to keep increasing,
we would observe `NumGroups` increasing as well, which should be sufficient
for alerting. Troubleshooting the issue could then rely on the log4j logs.
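
To make the arithmetic concrete, here is a minimal sketch of the derived count. The function name and the numbers are made up for illustration; only the three metric names come from the discussion, and the sketch assumes all three values are read from the same snapshot.

```python
def remaining_groups(num_groups, num_rebalancing, num_awaiting_sync):
    """Return the number of groups that are Stable, Dead or Empty.

    Hypothetical helper: it derives the count from the NumGroups,
    NumGroupsRebalancing and NumGroupsAwaitingSync metric values.
    """
    remaining = num_groups - num_rebalancing - num_awaiting_sync
    if remaining < 0:
        # Can happen if the metrics were sampled at different moments.
        raise ValueError("inconsistent metric snapshot")
    return remaining


# Made-up snapshot: 120 groups, 5 rebalancing, 3 awaiting sync.
print(remaining_groups(120, 5, 3))  # 112
```

If this derived value keeps growing while the rebalancing and awaiting-sync counts stay flat, the growth is coming from Stable, Dead or Empty groups, and the logs would be needed to tell which.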

*Guozhang*

On Fri, Jul 21, 2017 at 7:19 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Thanks for the KIP, Colin. This will definitely be useful. One question:
> would it be useful to have a metric for the number of groups in each
> possible state? The KIP suggests "PreparingRebalance" and "AwaitingSync".
> That leaves "Stable", "Dead" and "Empty". Are those not useful?
>
> Ismael
>
> On Thu, Jul 20, 2017 at 6:52 PM, Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi all,
> >
> > I posted "KIP-180: Add a broker metric specifying the number of consumer
> > group rebalances in progress" for discussion:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-180%3A+Add+a+broker+metric+specifying+the+number+of+consumer+group+rebalances+in+progress
> >
> > Check it out.
> >
> > cheers,
> > Colin
> >
>



-- 
-- Guozhang
