Hi Team,

We use Splunk dashboards, along with custom UI elements, to report Kafka health status, and we forward all Kafka health check statuses to Splunk. However, we are running into capacity issues on Splunk because we service multiple Kafka clusters across our data center as well as three PCs, and the deployment grows weekly.
The main problem is that there is an hour or two of latency before health statuses become visible. While I have some ideas, I would like to understand what you all use for monitoring broker status. The goal is to provide real-time status updates for all our Kafka installations (we will be upgrading to 3.6 soon). We prefer open-source solutions over vendor-locked options. Could you please share any best-known methods (BKMs) for achieving real-time cluster updates?

Best,
Vinay Bagare