Hey folks,

We're observing some very peculiar behavior on our Kafka cluster. When one of
the Kafka broker instances goes down, the producer blocks (in .flush()) for
roughly `request.timeout.ms` before returning success (or at least not
throwing an exception) and moving on.

We're running Kafka on Kubernetes, so this may be related. Kafka runs as a
Kubernetes PetSet with a global Service (acting like a load balancer) that
consumers/producers use for the bootstrap list. Our Kafka brokers are
configured to come up with a predetermined set of broker ids (kafka-0,
kafka-1 & kafka-2), but each broker's IP likely changes every time it's
restarted.

Our Kafka settings are as follows:

Producer:
acks=all
batch.size=16384
linger.ms=1
request.timeout.ms=3000
max.in.flight.requests.per.connection=1
retries=2
max.block.ms=10000
buffer.memory=33554432

Broker:
min.insync.replicas=1
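
For context, here's a minimal sketch (not our actual code) of how the
producer is wired up with the settings above; the stall we see happens on the
flush() call. The bootstrap address and topic name are placeholders:

  import java.util.Properties;
  import org.apache.kafka.clients.producer.KafkaProducer;
  import org.apache.kafka.clients.producer.ProducerRecord;
  import org.apache.kafka.common.serialization.StringSerializer;

  public class ProducerSketch {
      public static void main(String[] args) {
          Properties props = new Properties();
          // Placeholder for the Kubernetes Service used as the bootstrap list
          props.put("bootstrap.servers", "kafka:9092");
          props.put("acks", "all");
          props.put("batch.size", 16384);
          props.put("linger.ms", 1);
          props.put("request.timeout.ms", 3000);
          props.put("max.in.flight.requests.per.connection", 1);
          props.put("retries", 2);
          props.put("max.block.ms", 10000);
          props.put("buffer.memory", 33554432);
          props.put("key.serializer", StringSerializer.class.getName());
          props.put("value.serializer", StringSerializer.class.getName());

          try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
              producer.send(new ProducerRecord<>("test-topic", "key", "value"));
              // With one broker down, this is where we see the producer block for
              // ~request.timeout.ms, after which it returns without an exception.
              producer.flush();
          }
      }
  }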

I'm having a bit of a hard time debugging why this happens, mostly because
I'm not seeing any logs from the producer. Is there a guide somewhere for
turning up logging on the Kafka Java client? I'm using logback, if that
helps.
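
I'm guessing something along these lines in logback.xml is the right way to
turn the client logging up, but I wanted to check:

  <configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
      <encoder>
        <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
      </encoder>
    </appender>

    <!-- Turn up the Kafka client internals (NetworkClient, Sender, etc.) -->
    <logger name="org.apache.kafka.clients" level="DEBUG"/>

    <root level="INFO">
      <appender-ref ref="STDOUT"/>
    </root>
  </configuration>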

Thanks,
Mike.

