Thanks for looking at this issue. I checked the max IOPS for this disk and we're only at about 10% of it. I can add more disks to spread out the work.
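Back-of-the-envelope on the write load, using the numbers from the thread below (rough Python sketch; the ~40K msgs/min, ~12 KB/message, and replication-factor-2 figures are from the thread):

    # Rough steady-state write load from the figures in this thread.
    msgs_per_sec = 40_000 / 60            # ~667 msgs/sec
    msg_bytes = 12 * 1024                 # ~12 KB per message (compressed)
    replication = 2                       # leader write + one follower copy

    ingress_mb_s = msgs_per_sec * msg_bytes / 1e6
    disk_mb_s = ingress_mb_s * replication

    print(f"producer ingress:    {ingress_mb_s:.1f} MB/s")  # ~8.2
    print(f"cluster disk writes: {disk_mb_s:.1f} MB/s")     # ~16.4

Even allowing for consumer reads on top of that, it's a modest sequential load spread over 3 brokers, which squares with the low IOPS reading - so it may be worth confirming per-device utilization (e.g. iostat -x or sar -d) before adding spindles.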
What IOWait values should I be aiming for? Also - what do you set open
files to? I have it at 65535, but I just read a doc that suggested > 100K
is better. (A quick sketch for checking the limit actually in effect is at
the bottom of this message.)

On Tue, Feb 21, 2017 at 10:45 AM, Todd Palino <tpal...@gmail.com> wrote:

> So I think the important thing to look at here is the IO wait on your
> system. You’re hitting disk throughput issues, and that’s what you most
> likely need to resolve. So just from what you’ve described, I think the
> only thing that is going to get you more performance is more spindles
> (or faster spindles). This is either more disks or more brokers, but at
> the end of it you need to eliminate the disk IO bottleneck.
>
> -Todd
>
>
> On Tue, Feb 21, 2017 at 7:29 AM, Jon Yeargers <jon.yearg...@cedexis.com>
> wrote:
>
> > Running 3x 8-core on Google Compute.
> >
> > Topic has 16 partitions (replication factor 2) and is consumed by 16
> > Docker containers on individual hosts.
> >
> > The system seems to max out at around 40,000 messages/minute. Each
> > message is ~12K - compressed (snappy) JSON.
> >
> > Recently moved from 12 to the above 16 partitions with no change in
> > throughput.
> >
> > Also tried increasing the consumption capacity on each container by
> > 50%. No effect.
> >
> > Network is running at ~6 Gb/sec (measured using iperf3). Broker load
> > is ~1.5. IOWait % is 5-10 (via sar).
> >
> > What are my options for adding throughput?
> >
> > - more brokers?
> > - avro/protobuf messaging?
> > - more disks / broker? (1 / host presently)
> > - jumbo frames?
> >
> > (transparent huge pages is disabled)
> >
> > Looking at this article (
> > https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> > ) it would appear that for our message size we are at the max. This
> > would argue that we need to shrink the message size - so perhaps
> > switching to avro is the next step?
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino
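Re the open files question at the top: a quick way to see the limit a
process is actually running with (rather than the shell default) is to ask
the kernel directly. A minimal Python sketch, assuming Linux and only the
standard library:

    import os
    import resource

    # Soft/hard RLIMIT_NOFILE for the current process - what a broker
    # launched from this environment would inherit.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"RLIMIT_NOFILE: soft={soft} hard={hard}")

    # Descriptors currently open by this process (Linux: /proc/self/fd).
    open_fds = len(os.listdir("/proc/self/fd"))
    print(f"open fds: {open_fds} ({open_fds / soft:.1%} of soft limit)")

Note the limit that matters is the one on the broker's own process (see
/proc/<broker-pid>/limits), since systemd or an init script can override
whatever a login shell reports.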