Hi,

After upgrading to 1.0 we're seeing strange producer/broker behaviour that we did not experience on versions before 1.0.
As a test we run a single-threaded producer that just sends "TEST" against our cluster, on a topic with replicas=3 and min.isr=2, using the following producer settings:

linger.ms=10
acks=all
retries=1000
batch=16k
retry.backoff.ms=1000

Using the callback on send, we immediately see a huge lag in the number of acks coming back (600k+, while on 0.11 this hovers around 4k-20k at most). At the same time the producer's send rate (msg/s) drops, reaching 0 in about 90 seconds. After 10 minutes of silence, all we see is a list of network exceptions like this one on all partitions:

"Got error produce response with correlation id X on topic-partition test-topic, retrying (999 attempts left). Error: NETWORK_EXCEPTION"

Sends then resume briefly, but the same behaviour quickly returns.

Now for the kicker: starting another thread after the first one runs into this, producing to the same topic with the same group id, will 'release' the first thread. All acks are then returned and behaviour goes back to normal.

No issues are experienced with acks=1.

The Kafka logs show no issues at default log levels; we haven't had the opportunity to test further with more fine-grained log levels.

The brokers run default settings. The one thing that may be special is that the inter-broker protocol is set to 1.0 while the client protocol is still set to 0.9.0.

The testing above was done with clients ranging from 0.9 up to 1.0, all showing the same behaviour. After downgrading the entire cluster, with the same settings, same clients and same tests, all is well.

Could this be a bug?

Thanks,
Rob
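
For reference, a minimal sketch of how the producer settings and the ack-lag measurement described above could be expressed. This is not the exact test code: the class name AckLagSketch and its methods are hypothetical, "batch=16k" is assumed to mean batch.size=16384, and the bootstrap address is a placeholder. It uses only the JDK, so the wiring to the real producer is shown in comments.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not part of the Kafka client.
class AckLagSketch {
    private final AtomicLong sent = new AtomicLong();
    private final AtomicLong acked = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();

    // Producer settings from the test. Assumption: "batch=16k" means
    // batch.size=16384. Key/value serializers still have to be added
    // before these properties can be passed to a KafkaProducer.
    static Properties producerSettings(String bootstrapServers) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", bootstrapServers);
        p.setProperty("acks", "all");
        p.setProperty("retries", "1000");
        p.setProperty("linger.ms", "10");
        p.setProperty("batch.size", "16384");
        p.setProperty("retry.backoff.ms", "1000");
        return p;
    }

    // Call once for every record handed to the producer.
    void onSend() {
        sent.incrementAndGet();
    }

    // Call from the send callback, e.g.
    //   producer.send(record, (metadata, exception) -> tracker.onCompletion(exception));
    // A null exception means the record was acknowledged.
    void onCompletion(Exception error) {
        if (error == null) {
            acked.incrementAndGet();
        } else {
            failed.incrementAndGet();
        }
    }

    // Records sent whose callback has not fired yet: the "ack lag".
    long outstanding() {
        return sent.get() - acked.get() - failed.get();
    }

    public static void main(String[] args) {
        // Placeholder address; replace with the real broker list.
        System.out.println(producerSettings("localhost:9092"));

        // Simulate five sends of which three have been acknowledged.
        AckLagSketch tracker = new AckLagSketch();
        for (int i = 0; i < 5; i++) {
            tracker.onSend();
        }
        for (int i = 0; i < 3; i++) {
            tracker.onCompletion(null);
        }
        System.out.println("outstanding acks: " + tracker.outstanding());
    }
}
```

Logging outstanding() once per second next to the send rate would give the two numbers quoted above (ack lag and msg/s) for comparing a 0.11 and a 1.0 cluster.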