SPM only works for Java consumers or, I guess, consumers using the built-in offset management in Kafka.
On Tue, Mar 17, 2015 at 11:44 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Mathias,
>
> SPM for Kafka will give you Consumer Offsets by Host, Consumer Id, Topic, and Partition, and you can alert (thresholds and/or anomalies) on any combination of these, and of course on any of the other 100+ Kafka metrics there.
> See http://blog.sematext.com/2015/02/10/kafka-0-8-2-monitoring/
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Tue, Mar 17, 2015 at 5:36 AM, Mathias Söderberg <mathias.soederb...@gmail.com> wrote:
>
> > Hi Lance,
> >
> > I tried Kafka Offset Monitor a while back, but it didn't play especially nice with a lot of topics / partitions (we currently have around 1400 topics and 4000 partitions in total). It might be possible to make it work a bit better, but I'm not sure it would be the best way to do alerting.
> >
> > Thanks for the tip though :).
> >
> > Best regards,
> > Mathias
> >
> >
> > On Mon, 16 Mar 2015 at 21:02 Lance Laursen <llaur...@rubiconproject.com> wrote:
> >
> > > Hey Mathias,
> > >
> > > Kafka Offset Monitor will give you a general idea of where your consumer group(s) are at:
> > >
> > > http://quantifind.com/KafkaOffsetMonitor/
> > >
> > > However, I'm not sure how useful it will be with "a large number of topics", or how easy it would be to turn its output into a script that alerts on a threshold. You could take a look and see what they're doing, though.
> > >
> > > On Mon, Mar 16, 2015 at 8:31 AM, Mathias Söderberg <mathias.soederb...@gmail.com> wrote:
> > >
> > > > Good day,
> > > >
> > > > I'm looking into using SimpleConsumer#getOffsetsBefore and the offsets committed in ZooKeeper for monitoring the lag of a consumer group.
> > > >
> > > > Our current use case is that we have a service that continuously consumes messages from a large number of topics and persists them to S3 at somewhat regular intervals (depending on time and the total size of consumed messages for each partition). Offsets are committed to ZooKeeper after the messages have been persisted to S3.
> > > > The partitions carry varying load, so a simple threshold on the number of messages we're lagging behind would be cumbersome to maintain due to the number of topics, and most likely prone to unnecessary alerts.
> > > >
> > > > Currently our broker configuration specifies log.roll.hours=1 and log.segment.bytes=1GB. My proposed solution is a separate service that iterates through all topics/partitions and calls #getOffsetsBefore with a timestamp that is one (1) or two (2) hours ago, then compares the first offset returned (which, from my testing, looks to be the offset closest in time, i.e. from the log segment closest to the given timestamp) with the one saved in ZooKeeper.
> > > > It feels like a pretty solid solution, given that we just want a rough estimate of how far behind we're lagging in time, so that we know (again, roughly) how much time we have to fix whatever is broken before the log segments are deleted by Kafka.
> > > >
> > > > Is there anyone doing monitoring similar to this? Are there any obvious downsides to this approach that I'm not thinking about? Thoughts on alternatives?
> > > >
> > > > Best regards,
> > > > Mathias

--
*Kasper Mackenhauer Jacobsen*
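
A rough, untested sketch of the check Mathias describes above, using the Kafka 0.8.x Java SimpleConsumer API. How the SimpleConsumer is constructed (it has to point at the partition's current leader), the client id, and the way the committed offset is read from ZooKeeper (e.g. /consumers/<group>/offsets/<topic>/<partition> if the standard high-level consumer layout is used) are all placeholders here, not part of the original thread:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import kafka.api.PartitionOffsetRequestInfo;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class TimeLagCheck {

    // Returns true if the committed offset is older than the starting offset of the
    // log segment closest to "hoursAgo" hours in the past, i.e. the consumer group is
    // (roughly) more than hoursAgo hours behind, given log.roll.hours=1.
    static boolean isLaggingMoreThanHours(SimpleConsumer consumer, String clientId,
                                          String topic, int partition,
                                          long committedOffset, int hoursAgo) {
        long timestamp = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(hoursAgo);

        TopicAndPartition tp = new TopicAndPartition(topic, partition);
        Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo =
                new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
        // Ask for a single offset before the given timestamp; the broker answers with
        // segment starting offsets, the first one being from the segment closest to
        // that time (as Mathias observed in his testing).
        requestInfo.put(tp, new PartitionOffsetRequestInfo(timestamp, 1));

        kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(
                requestInfo, kafka.api.OffsetRequest.CurrentVersion(), clientId);
        OffsetResponse response = consumer.getOffsetsBefore(request);
        if (response.hasError()) {
            // A real monitor would re-resolve the partition leader and retry here.
            throw new RuntimeException("getOffsetsBefore failed: "
                    + response.errorCode(topic, partition));
        }

        long[] offsets = response.offsets(topic, partition);
        if (offsets.length == 0) {
            // No segment that old (e.g. the partition is new or empty): nothing to alert on.
            return false;
        }
        long offsetAtTimestamp = offsets[0];

        // committedOffset is whatever the consuming service last wrote to ZooKeeper
        // for this topic/partition after persisting to S3.
        return committedOffset < offsetAtTimestamp;
    }
}

Looping that over all topics/partitions (with the committed offsets read in one pass from ZooKeeper) and alerting whenever it returns true for, say, hoursAgo=2 should give roughly the "hours behind" signal described above.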