Thanks for the response, Jens. ZK performance was one of the first things we
looked at. ZK seems nominal: low load and no exhaustion of any resources.
We do not set ZK offset commit explicitly, so it uses the default.

Kind regards,

Jahn Roux


-----Original Message-----
From: Jens Rantil [mailto:jens.ran...@tink.se]
Sent: Monday, May 23, 2016 11:00 PM
To: users@kafka.apache.org
Subject: Re: Large kafka deployment on virtual hardware

Hi Jahn,

How is the load on Zookeeper? How often are you committing your offsets?
Could that be an issue?

Cheers,
Jens

On Mon, 23 May 2016 at 18:12, Jahn Roux <j...@comprsa.com> wrote:

> I have a large Kafka deployment on virtual hardware: 120 brokers on
> virtual machines with 32 GB of memory and 8 cores each. Gigabit
> network, RHEL 6.7. 4 topics, 1200 partitions each, replication factor
> of 2, running Kafka 0.8.1.2.
>
>
>
> We are running into issues where our cluster is not keeping up. We
> have 4 sets of producers (30 producers per set) set to produce to the
> 4 topics (producers produce to multiple topics). The messages are
> about 150 bytes on average and we are attempting to produce between 1
> million and 2 million messages a second per producer set.
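
A rough back-of-envelope check of the figures quoted above may help frame the question. This is only a sketch: it assumes traffic spreads evenly across brokers, no compression, and no protocol or header overhead, and it ignores consumer traffic entirely. None of those assumptions come from the thread.

```python
# Back-of-envelope load estimate from the figures quoted above.
# Assumptions (not from the thread): traffic spreads evenly over brokers,
# no compression, protocol overhead and consumer traffic are ignored.

BROKERS = 120
TOPICS = 4
PARTITIONS_PER_TOPIC = 1200
REPLICATION_FACTOR = 2
PRODUCER_SETS = 4
PRODUCERS_PER_SET = 30
MESSAGE_BYTES = 150
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8  # 125 MB/s line rate


def estimate(messages_per_sec_per_set):
    """Return rough per-producer and per-broker rates for one target rate."""
    set_bytes = messages_per_sec_per_set * MESSAGE_BYTES
    cluster_bytes = set_bytes * PRODUCER_SETS
    # Each broker receives its share of producer traffic as a leader and,
    # with replication factor 2, an equal share again as a follower.
    broker_inbound = cluster_bytes * REPLICATION_FACTOR / BROKERS
    return {
        "messages_per_producer": messages_per_sec_per_set / PRODUCERS_PER_SET,
        "set_mb_per_sec": set_bytes / 1e6,
        "cluster_mb_per_sec": cluster_bytes / 1e6,
        "broker_inbound_mb_per_sec": broker_inbound / 1e6,
        "broker_link_utilisation": broker_inbound / GIGABIT_BYTES_PER_SEC,
    }


# Partition replicas hosted by each broker if they are spread evenly.
replicas_per_broker = TOPICS * PARTITIONS_PER_TOPIC * REPLICATION_FACTOR / BROKERS

for rate in (1_000_000, 2_000_000):
    print(rate, estimate(rate))
print("partition replicas per broker:", replicas_per_broker)
```

Under these assumptions, 1 million messages a second per set works out to about 150 MB/s per set, about 10 MB/s inbound per broker (roughly 8% of a gigabit link), and 80 partition replicas per broker. That is consistent with raw network bandwidth not being the limit, though it says nothing about request rate or per-request overhead.
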
>
>
>
> We run into issues at about 1 million messages a second: for that
> producer set only, the producer buffers fill up and we are blocked
> from producing messages. This does not seem to impact the other
> producer sets; they run without issues until they too reach about
> 1 million messages a second.
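
To illustrate why a buffer that fills and blocks usually points at the sustained send rate rather than at the buffer size, here is a minimal sketch. The function name, the buffer capacity, and the rates are all hypothetical values chosen for illustration; they are not measurements from this cluster or settings of any Kafka client.

```python
def seconds_until_full(buffer_capacity, produce_rate, drain_rate):
    """Time for a bounded buffer to fill, or None if it never fills.

    buffer_capacity: messages the buffer can hold (hypothetical value)
    produce_rate:    messages/s the application enqueues
    drain_rate:      messages/s actually sent and acknowledged
    """
    surplus = produce_rate - drain_rate
    if surplus <= 0:
        return None  # the sender keeps up, so the buffer never fills
    return buffer_capacity / surplus


# Hypothetical numbers for a single producer.
capacity = 10_000
print(seconds_until_full(capacity, 30_000, 33_000))  # None: sender keeps up
print(seconds_until_full(capacity, 35_000, 33_000))  # 5.0 seconds to fill
```

With these made-up numbers, producing only slightly faster than the sender can drain fills the buffer within seconds. A larger buffer delays the blocking but does not remove it, so the rate at which blocking starts is a reasonable estimate of the sustained ceiling somewhere on the send path.
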
>
>
>
> Looking at the metrics available to us, we do not see a bottleneck:
> disk I/O is not maxing out, and CPU and network are nominal. We have
> tried increasing and decreasing the Kafka cluster size to no avail, and
> we have gone from 100 partitions to 1200 partitions per topic. We have
> increased and decreased the number of producers and yet we run into
> the same issues. Our Kafka config is mostly out of the box: 1 hour log
> roll/retention and slightly increased buffer sizes, but other than
> that it is out of the box.
>
>
>
> I was wondering if someone has recommendations for identifying the
> bottleneck and/or which configuration values we should be looking at?
> Are there known issues with Kafka on virtualized hardware or things to
> watch out for when deploying to VMs? Are there use cases where Kafka
> is being used in a similar way, with 4+ million messages a second of
> discrete 150-byte messages?
>
>
>
> Kind regards,
>
>
>
> Jahn Roux
>
>
>
>
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
--

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.


