I am referring to the "Lag" that exists when a Consumer Offset is 
significantly less than the Log Size. This difference is the Lag, and it is 
often symptomatic of a problem: processing has stopped, is being overwhelmed, etc.

Our legacy Node.js system uses Consumer Groups (the same as, say, older Spark). 
To get the offset, we use the kafka-consumer-groups.sh tool. For Ops-related 
work we use Kafkamon to provide a UI for the Ops folks. 

Our newer stuff uses Samza and I see zero Consumer Groups. Instead I see 
checkpoint topics (example: __samza_checkpoint_ver_1_for_generic-delivery_1). I 
can consume this topic and get the current offset by partition, but I don't 
have the log size, so I cannot compute the lag. All I can do is watch these 
numbers increment, with no clue how far behind my process is.
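
To make the missing piece concrete, here is a minimal Python sketch of the lag computation I would like to do, assuming I could obtain the per-partition log end offsets from somewhere (for example, an end-offset query in a Kafka client). The function names, the sample numbers, and the assumption that the checkpoint offsets and end offsets follow the same convention are all hypothetical; the threshold of 10 is the one from my original message.

```python
from typing import Dict, List


def compute_lag(checkpoint_offsets: Dict[int, int],
                log_end_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return per-partition lag as (log end offset - checkpointed offset)."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        checkpointed = checkpoint_offsets.get(partition)
        if checkpointed is None:
            # No checkpoint seen yet for this partition. Reporting the end
            # offset is only an upper bound (assumes the log starts at 0).
            lag[partition] = end_offset
        else:
            # Clamp at zero in case the two readings were taken at
            # slightly different times.
            lag[partition] = max(end_offset - checkpointed, 0)
    return lag


def partitions_over_threshold(lag: Dict[int, int],
                              threshold: int = 10) -> List[int]:
    """Return the sorted partitions whose lag exceeds the alert threshold."""
    return sorted(p for p, value in lag.items() if value > threshold)


# Hypothetical sample data: offsets read from the checkpoint topic and
# log end offsets obtained separately.
checkpoints = {0: 1500, 1: 980}
end_offsets = {0: 1512, 1: 985, 2: 40}

lag_by_partition = compute_lag(checkpoints, end_offsets)
alerting = partitions_over_threshold(lag_by_partition, threshold=10)
```

With these made-up numbers, partitions 0 and 2 would trigger an alert while partition 1 would not. The part I cannot fill in today is where `end_offsets` comes from for a Samza job.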

I just took LinkedIn's Burrow (https://github.com/linkedin/Burrow) for a test 
drive locally, hoping it would solve my problem because it looks at the 
internal consumers. However, I have the same problem: I can't get data on a 
consumer group that doesn't exist.

 



Jeremiah Adams
Software Engineer
www.helixeducation.com

________________________________________
From: Tom Davis <t...@recursivedream.com>
Sent: Monday, November 26, 2018 6:59 PM
To: dev@samza.apache.org
Subject: Re: Alerting and Monitoring Samza Checkpointing?

Have you looked into KafkaSystemConsumerMetrics? Is the meaning of "lag"
there different from what you mean?


Jeremiah Adams <jad...@helixeducation.com> writes:

> We are replacing a node.js app that consumed topics on a Kafka cluster with
> Samza jobs. We use kafka-offsets to trigger alerts based on message lag. e.g.,
> message lag is greater than 10, wake up support persons.
>
>
> Samza doesn't use the same mechanism for offset storage and the tools for
> examining a topic's checkpoint aren't readily useful for application
> consumption.
>
>
> Can some of you share your approaches to monitoring and alerting on consumer
> lag?
>
>
> Regards.
>
>
> Jeremiah Adams
> Software Engineer
