Thanks Joel! I found this configuration setting in "Producer Configs". I guess it means each producer sets this parameter as part of its connection settings, like the number of acks.
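As a minimal sketch of what "each producer sets this parameter" could look like, the snippet below builds the per-producer settings as plain Java properties. It assumes the Kafka 0.8 producer property names (metadata.broker.list, producer.type, request.required.acks, request.timeout.ms). The class name, the helper method and the broker host names are hypothetical, and the snippet only assembles the settings; it does not connect to a cluster.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only: collects the settings a sync
// producer would carry. Property names are assumed to match the Kafka 0.8
// "Producer Configs" documentation.
public class SyncProducerConfigSketch {

    public static Properties buildSyncProducerProps(String brokerList,
                                                    int requiredAcks,
                                                    int requestTimeoutMs) {
        Properties props = new Properties();
        // Brokers used to bootstrap topic metadata (host names are placeholders).
        props.setProperty("metadata.broker.list", brokerList);
        // Send synchronously, so each send waits for the broker's response.
        props.setProperty("producer.type", "sync");
        // Number of acknowledgements the producer asks for per request.
        props.setProperty("request.required.acks", Integer.toString(requiredAcks));
        // How long the broker may wait for those acks before answering with an error.
        props.setProperty("request.timeout.ms", Integer.toString(requestTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // 2 acks and a 10 second timeout, matching the values discussed in this thread.
        Properties props = buildSyncProducerProps("broker1:9092,broker2:9092", 2, 10000);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + "=" + props.getProperty(name));
        }
    }
}
```

With the 0.8 Java client, properties like these would presumably be wrapped in a kafka.producer.ProducerConfig and handed to the producer, so the timeout travels with each producer's requests rather than being set once on the broker.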
I checked the information in Zookeeper and found out that 2 of the brokers are missing. The VMs with these brokers are not quite ... healthy (I cannot find another definition for this situation). I checked the information about replica distribution and there are 3 replicas for each partition, so that part is ok. I started the tests again and some of the producers got stuck again. Maybe there is something wrong with my cluster of VMs.

To reproduce the situation with the original performance test tools:
- start a Zookeeper node on 1 VM
- start 2 Kafka brokers on 2 VMs
- create a topic with multiple partitions and replication factor 2
- run the producer performance script on 4 VMs in sync mode with 2 acks to send 1M messages

-----Original Message-----
From: Joel Koshy [mailto:jjkosh...@gmail.com]
Sent: Wednesday, February 12, 2014 3:40 PM
To: users@kafka.apache.org
Subject: Re: Sync producers stuck waiting for 2 acks

The request time out config is request.timeout.ms - defaults to 10 seconds - so the request should expire by then and return a response. Can you run the list-topics command on the topics you are sending to and make sure there are at least two replicas in ISR while you are running your producer test?

You mentioned you were able to reproduce this with the original performance tools - can you provide exact steps to reproduce if the above information does not help resolve this?

Joel

On Wed, Feb 12, 2014 at 11:22:39PM +0000, Michael Popov wrote:
> I am running a test deployment of Kafka 0.8. When I configure sync producers
> to expect 2 acks for each "write" request, some of the producers get stuck.
> It looks like broker's response is not delivered back.
> This happened with original Kafka performance tools and with a test tool
> built using a custom C# client library. So I assume the issue is not on the
> client side.
> I checked the sources. Even if a replica broker does not catch up with a
> leader, a producer request should expire on time out. I don't see
> configuration parameter to set this timeout. The closest configuration
> setting I can see is "producer.purgatory.purge.interval.requests" but it is
> in the number of requests, not time units.
>
> I would appreciate any advice where to look for the problem and how to solve
> it.
>
> Thank you,
> Michael Popov