Similar to other approaches, our service reads JMX metrics via Jolokia and saves the time-series data in Redis. We then expose it in a number of ways, including our dashboard. We have found Redis to be quite good as a time-series backend for this purpose. This all gets set up automatically as part of our service, but it would also work very well stand-alone if you wanted to rig something similar yourself.
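For anyone rigging up the stand-alone variant, here is a minimal sketch of the Jolokia-to-Redis idea. It is not how any particular service implements it. It assumes a Jolokia agent is attached to the broker JVM and reachable over HTTP, and that samples are kept in a Redis sorted set scored by timestamp. The URL, the key name, and the helper function names are hypothetical; the MBean is the standard Kafka broker bytes-in metric.

```python
import json
import urllib.request

# Standard Kafka broker MBean for inbound bytes.
BYTES_IN_MBEAN = "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec"


def build_read_request(mbean, attribute):
    """Build the JSON body for a Jolokia 'read' request."""
    return {"type": "read", "mbean": mbean, "attribute": attribute}


def parse_response(payload):
    """Return (timestamp, value) from a decoded Jolokia response."""
    if payload.get("status") != 200:
        raise RuntimeError("Jolokia error: %s" % payload.get("error"))
    return payload["timestamp"], payload["value"]


def fetch_metric(jolokia_url, mbean, attribute, timeout=5):
    """POST a read request to a Jolokia agent and return (timestamp, value)."""
    body = json.dumps(build_read_request(mbean, attribute)).encode("utf-8")
    req = urllib.request.Request(
        jolokia_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_response(json.load(resp))


def store_sample(redis_client, key, timestamp, value):
    """Add one sample to a sorted set, scored by its timestamp.

    redis_client is any object with a redis-py style zadd(name, mapping)
    method; the timestamp is part of the member so that repeated values
    are not collapsed into a single entry.
    """
    member = f"{timestamp}:{value}"
    redis_client.zadd(key, {member: timestamp})
    return member
```

With redis-py installed, a polling loop would call something like `store_sample(redis.Redis(), "kafka:bytes_in", *fetch_metric("http://localhost:8778/jolokia", BYTES_IN_MBEAN, "OneMinuteRate"))` on a timer (host, port, and key are assumptions). Scoring by timestamp means "how much data came in at a certain time" becomes a ZRANGEBYSCORE query over the time window of interest.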
Ping me if you go this way, we can help.

Thanks,
Kenny Gorman
Founder and CEO
www.eventador.io

> On Jun 20, 2017, at 9:51 AM, Todd Palino <tpal...@gmail.com> wrote:
>
> Not for monitoring Kafka. We pull the JMX metrics two ways - one is a
> container that wraps around the Kafka application and annotates the beans
> to be emitted to Kafka as metrics, which gets pulled into our
> autometrics/InGraphs system for graphing. But for alerting, we use an agent
> that polls the critical metrics via JMX and pushes them into a separate
> system (that doesn’t use Kafka). ELK is used for log analysis for other
> applications.
>
> Kafka-monitor is what we built/use for synthetic traffic monitoring for
> availability. And Burrow for monitoring consumers.
>
> -Todd
>
>
> On Tue, Jun 20, 2017 at 9:53 AM, Andrew Hoblitzell <
> ahoblitz...@salesforce.com> wrote:
>
>> Using Elasticsearch, Logstash, and Kibana is a pretty popular pattern at
>> LinkedIn.
>>
>> Also giving honorable mentions to Kafka Monitor and Kafka Manager since
>> they hadn't been mentioned yet
>> https://github.com/yahoo/kafka-manager
>> https://github.com/linkedin/kafka-monitor
>>
>> Thanks,
>>
>> Andrew Hoblitzell
>> Sr. Software Engineer, Salesforce
>>
>>
>> On Tue, Jun 20, 2017 at 9:37 AM, Todd S <t...@borked.ca> wrote:
>>
>>> You can look at enabling JMX on kafka (
>>> https://stackoverflow.com/questions/36708384/enable-jmx-on-kafka-brokers)
>>> using JMXTrans (https://github.com/jmxtrans/jmxtrans) and a config (
>>> https://github.com/wikimedia/puppet-kafka/blob/master/kafka-jmxtrans.json.md)
>>> to gather stats, and insert them into influxdb (
>>> https://www.digitalocean.com/community/tutorials/how-to-monitor-system-metrics-with-the-tick-stack-on-centos-7)
>>> then graph the results with grafana (
>>> https://softwaremill.com/monitoring-apache-kafka-with-influxdb-grafana/,
>>> https://grafana.com/dashboards/721)
>>>
>>> This is likely a solid day of work to get working nicely, but it also
>>> enables you to do a lot of extra cool stuff for monitoring, more than just
>>> Kafka. JMXTrans can be a bit of a pain, because Kafka's JMX metrics are ..
>>> plentiful ... but the example configuration above should get you started.
>>> Using Telegraf to collect system stats and graph them with Grafana is
>>> really simple and powerful, as the Grafana community has a lot of pre-built
>>> content you can steal and make quick wins with.
>>>
>>> Monitoring Kafka can be a beast, but there is a lot of useful data there
>>> for if (when?) there is a problem. The more time you spend with the
>>> metrics, the more you start to get a feel for the internals.
>>>
>>> On Mon, Jun 19, 2017 at 6:52 PM, Muhammad Arshad <
>>> muhammad.ars...@alticeusa.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> wanted to see if there is Kafka monitoring which is available. I am
>>>> looking for the following:
>>>>
>>>> how much data came in at a certain time.
>>>>
>>>> Thanks,
>>>>
>>>> *Muhammad Faisal Arshad*
>>>> Manager, Enterprise Data Quality
>>>> Data Services & Architecture
>
> --
> *Todd Palino*
> Senior Staff Engineer, Site Reliability
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino