Hi,

On Sat, Feb 28, 2015 at 9:16 AM, Gene Robichaux <gene.robich...@match.com> wrote:

> What is the best way to detect consumer lag?
>
> We are running each consumer as a separate group and I am running the
> ConsumerOffsetChecker to assess the partitions and the lag for each
> group/consumer. I run this every 5 minutes. In some cases I run this
> command up to 75 times on each 5 min polling cycle (once for each
> group/consumer). An example of the command is (bin/kafka-run-class.sh
> kafka.tools.ConsumerOffsetChecker --group consumer-group1 --zkconnect
> zkhost:zkport)
>
> The problem I am running into is CPU usage on the broker when these
> commands run. We have a dedicated broker that has no leader partitions, but
> the high CPU still concerns me.
>
> Is there a better way to detect consumer lag? Preferably one that is less
> impactful?
>

Yeah, that hurts :(. I just looked at our SPM for Kafka monitoring to see
specifically what we do for Consumer Lag. I'd send you the screenshot, but I
think the mailing list blocks it. Ah, you can actually see it in a demo,
here's the link:

https://apps.sematext.com/demo

Look for SPM apps with "Kafka" in the name and then for a tab on the left
side labeled "Consumer Lag".

But basically, you can slice and dice consumer lag by any combination of the
following:

* consumer hostname
* client ID
* topic
* partition

Minimal impact, and you get your Consumer Lag in more or less real time.

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/