Hi friends! :)

I believe we currently have a gap in KafkaConsumer metrics for errors: the
KafkaConsumer is complex, and there are many places where things can go wrong.
Currently, these failures are logged, and certain ones can be inferred from
the existing metrics (e.g. heartbeat-rate).

This KIP seeks to improve monitoring and alerting for the consumer by
providing error metrics (fetch-error-rate and fetch-error-total) for the
Fetcher class.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-356%3A+Add+KafkaConsumer+fetch-error-rate+and+fetch-error-total+metrics

There are also a few other places in the Fetcher where errors may happen
(parsing completed fetches, offset requests, etc.), but it may be more
appropriate to monitor those in separate metrics.

Any thoughts?

Thanks!

Regards,
Kevin
