Hi All,

We're using the "org.apache.kafka.kafka-clients" library version "0.9.0.1". We're running 0.9.0.1 brokers.

I've been tracking down "org.apache.kafka.common.errors.TimeoutException: Batch Expired" issues in our producer. Based on my local testing, it looks like it doesn't actually retry messages when we lose the leader. Based on the docs it looks like it should be retrying these failures.

I read through a number of issues relating to this, like KAFKA-3594 <https://issues.apache.org/jira/browse/KAFKA-3594> and KAFKA-4515 <https://issues.apache.org/jira/browse/KAFKA-4515>.

Should we:

- Fork the client and patch fixes for these issues?
- Set "retries" to 0 and implement the retries at the application level?
- Wait on a 0.9.0.2 release of the Kafka client?
- Some unknown other thing?

If you've run into this issue, or have worked on the client, I'd love to hear from you!

Thanks a ton,
Andrew Clarkson
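
P.S. For reference, the "retries at the application level" option could be sketched roughly as below. This is only a minimal sketch, not code from the Kafka client: the names RetryingSender, callWithRetries, maxAttempts and backoffMs are made up for illustration, and it retries on any exception, whereas real code would probably want to retry only on errors it considers retriable.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not part of kafka-clients): retries a blocking call
// a bounded number of times with a simple linear backoff between attempts.
final class RetryingSender {
    private RetryingSender() {}

    static <T> T callWithRetries(Callable<T> attempt, int maxAttempts, long backoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                // Do not retry on interruption; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                // Assumption: every other failure is treated as retriable.
                last = e;
                if (i < maxAttempts) {
                    Thread.sleep(backoffMs * i); // wait longer after each failure
                }
            }
        }
        // All attempts failed; surface the most recent error to the caller.
        throw last;
    }
}
```

With "retries" set to 0 on the producer, the callable would wrap a synchronous send such as producer.send(record).get(), so that a "Batch Expired" failure surfaces as an exception the loop can see. Blocking on get() for every record costs throughput, and retrying this way can produce duplicates or reorder messages, so this would be a trade-off rather than a drop-in fix.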