FYI:
We wrote a library that does essentially the same thing as metrics. The
only reason we didn't use metrics is that it didn't exist yet. There is
a graphite reporter, which could be repurposed for metrics.
https://github.com/ticketfly/pillage
https://github.com/Ticketfly/pillage/blob/mas
I have not run into this issue with Kafka, but I have definitely run into
issues with ZK sessions expiring and had to diagnose why. Looking at GC
is obviously very important for this. When you turn on GC logging, make sure
that you include a timestamp in the gc.log filename in your start script.
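
A rough sketch of what that could look like, written as a small Python
helper that a start script might call to build the JVM arguments. The
function name, the log directory, and the timestamp format are made up for
illustration. The flags are the standard HotSpot GC logging options for
Java 8 and earlier; newer JVMs use -Xlog:gc instead, so adjust for your
version.

```python
from datetime import datetime
from typing import List, Optional


def gc_log_args(log_dir: str = "logs", now: Optional[datetime] = None) -> List[str]:
    """Build JVM GC logging flags that write to a timestamped log file.

    Hypothetical helper: the name, default directory, and file name
    pattern are assumptions, not part of any existing start script.
    """
    # Use the launch time so every start gets its own file name.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    gc_log = f"{log_dir}/gc-{stamp}.log"
    return [
        f"-Xloggc:{gc_log}",       # write GC output to the timestamped file
        "-verbose:gc",             # basic GC event logging
        "-XX:+PrintGCDetails",     # per-collection detail
        "-XX:+PrintGCDateStamps",  # wall-clock time on each GC entry
    ]


if __name__ == "__main__":
    # Print the flags so a start script can splice them into the java command.
    print(" ".join(gc_log_args()))
```

Having the date stamps on each entry makes it easier to line up a long GC
pause with the time a ZK session expired.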
By