Curious whether you tested with larger message sizes, around 20-30 KB (you
mentioned 2 KB).

Any numbers on that size?


On Thu, May 23, 2013 at 10:12 AM, Jason Weiss <jason_we...@rapid7.com> wrote:

> Bummer.
>
> Yes, but it will be several days. I'll post back to the forum with a URL
> once I'm done.
>
> Jason
>
>
>
> On 5/23/13 10:11 AM, "Jun Rao" <jun...@gmail.com> wrote:
>
> >Jason,
> >
> >Unfortunately, Apache mailing lists don't support attachments. Could you
> >document your experience (with the graphs) in a blog (or a wiki page in
> >Kafka)?
> >
> >Thanks,
> >
> >Jun
> >
> >
> >On Thu, May 23, 2013 at 2:00 AM, Jason Weiss <jason_we...@rapid7.com>
> >wrote:
> >
> >> Jun,
> >>
> >> Here is a screenshot from AWS's statistics (per-minute sampling is the
> >> finest granularity I believe they chart). I don't have a screenshot of
> >> the top output.
> >>
> >> This shows that when I added a 4th machine to the cluster with the same
> >> number of clients, my CPU utilization fell but then remained constant.
> >> The flatline is pretty obvious in the extended 4-minute test: it ramps
> >> up, flatlines, then ramps down.
> >>
> >> Jason
> >>
> >> ________________________________________
> >> From: Jun Rao [jun...@gmail.com]
> >> Sent: Thursday, May 23, 2013 00:17
> >> To: users@kafka.apache.org
> >> Subject: Re: Apache Kafka in AWS
> >>
> >> Jason,
> >>
> >> Thanks for sharing. This is very interesting. Normally, Kafka brokers
> >> don't use too much CPU. Is most of the 750% CPU actually used by the
> >> Kafka brokers?
> >>
> >> Jun
> >>
> >>
> >> On Wed, May 22, 2013 at 6:11 PM, Jason Weiss <jason_we...@rapid7.com>
> >> wrote:
> >>
> >> > >>Did you check that you were using all cores?
> >> >
> >> > top was reporting over 750%
> >> >
> >> > Jason
> >> >
> >> > ________________________________________
> >> > From: Ken Krugler [kkrugler_li...@transpac.com]
> >> > Sent: Wednesday, May 22, 2013 20:59
> >> > To: users@kafka.apache.org
> >> > Subject: Re: Apache Kafka in AWS
> >> >
> >> > Hi Jason,
> >> >
> >> > On May 22, 2013, at 3:35pm, Jason Weiss wrote:
> >> >
> >> > > Ken,
> >> > >
> >> > > Great question! I should have indicated I was using EBS, 500GB with
> >> > > 2000 provisioned IOPS.
> >> >
> >> > OK, thanks. Sounds like you were pegged on CPU usage.
> >> >
> >> > But that does surprise me a bit. Did you check that you were using all
> >> > cores?
> >> >
> >> > Thanks,
> >> >
> >> > -- Ken
> >> >
> >> > PS - back in 2006 I spent a week of hell debugging an occasional job
> >> > failure on Hadoop (this is when it was still part of Nutch). Turns out
> >> > one of our 12 slaves was accidentally using OpenJDK, and this had a JIT
> >> > compiler bug that would occasionally rear its ugly head. Obviously the
> >> > Sun/Oracle JRE isn't bug-free, but it gets a lot more stress testing.
> >> > So one of my basic guidelines in the ops portion of the Hadoop class I
> >> > teach is that every server must have exactly the same version of
> >> > Oracle's JRE.
> >> >
> >> > > ________________________________________
> >> > > From: Ken Krugler [kkrugler_li...@transpac.com]
> >> > > Sent: Wednesday, May 22, 2013 17:23
> >> > > To: users@kafka.apache.org
> >> > > Subject: Re: Apache Kafka in AWS
> >> > >
> >> > > Hi Jason,
> >> > >
> >> > > Thanks for the notes.
> >> > >
> >> > > I'm curious whether you went with using local drives (ephemeral
> >> > > storage) or EBS, and if with EBS then what IOPS.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > -- Ken
> >> > >
> >> > > On May 22, 2013, at 1:42pm, Jason Weiss wrote:
> >> > >
> >> > >> All,
> >> > >>
> >> > >> I asked a number of questions of the group over the last week, and
> >> > >> I'm happy to report that I've had great success getting Kafka up and
> >> > >> running in AWS. I am using 3 EC2 instances, each of which is an M2
> >> > >> High-Memory Quadruple Extra Large with 8 cores and 68.4 GiB of memory
> >> > >> according to the AWS specs. I have co-located ZooKeeper instances
> >> > >> next to Kafka on each machine.
> >> > >>
> >> > >> I am able to publish, in a repeatable fashion, 273,000 events per
> >> > >> second, with each event payload consisting of a fixed size of 2048
> >> > >> bytes! This represents the maximum throughput possible on this
> >> > >> configuration, as the servers became CPU constrained, averaging 97%
> >> > >> utilization in a relatively flat line. This isn't a "burst" speed; it
> >> > >> represents a sustained throughput from 20 M1 Large EC2 Kafka
> >> > >> multi-threaded producers. Putting this into perspective, if my log
> >> > >> retention period was a month, I'd be aggregating 1.3 petabytes of
> >> > >> data on my disk drives. Suffice it to say, I don't see us retaining
> >> > >> data for more than a few hours!
> >> > >>
> >> > >> Here were the keys to tuning for future folks to consider:
> >> > >>
> >> > >> First and foremost, be sure to configure your Java heap size
> >> > >> accordingly when you launch Kafka. The default is something like
> >> > >> 512MB, which in my case left virtually all of my RAM inaccessible to
> >> > >> Kafka.
> >> > >>
> >> > >> Second, stay away from OpenJDK. No, seriously - this was a huge thorn
> >> > >> in my side, and I almost gave up on Kafka because of the problems I
> >> > >> encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> >> > >> crashing and burning in dramatic fashion. The moment I switched over
> >> > >> to Oracle's JDK for Linux, Kafka didn't puke once - I mean, like not
> >> > >> even a hiccup.
> >> > >>
> >> > >> Third, know your message size. In my opinion, the more you understand
> >> > >> about your event payload characteristics, the better you can tune the
> >> > >> system. The two knobs to really turn are the log.flush.interval and
> >> > >> log.default.flush.interval.ms. The values here are intrinsically
> >> > >> connected to the types of payloads you are putting through the
> >> > >> system.
> >> > >>
> >> > >> Fourth and finally, to maximize throughput you have to code against
> >> > >> the async paradigm, and be prepared to tweak the batch size, queue
> >> > >> properties, and compression codec (wait for it...) in a way that
> >> > >> matches the message payload you are putting through the system and
> >> > >> the capabilities of the producer system itself.
> >> > >>
> >> > >>
> >> > >> Jason
> >> > >>
> >> > >> This electronic message contains information which may be
> >> > >> confidential or privileged. The information is intended for the use
> >> > >> of the individual or entity named above. If you are not the intended
> >> > >> recipient, be aware that any disclosure, copying, distribution or use
> >> > >> of the contents of this information is prohibited. If you have
> >> > >> received this electronic transmission in error, please notify us by
> >> > >> e-mail at (postmas...@rapid7.com) immediately.
> >> > >
> >> > > --------------------------
> >> > > Ken Krugler
> >> > > +1 530-210-6378
> >> > > http://www.scaleunlimited.com
> >> > > custom big data solutions & training
> >> > > Hadoop, Cascading, Cassandra & Solr
> >> > >
> >> >
>
>
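A rough check of the retention figure in Jason's original post, as a minimal
Python sketch. The event rate and payload size are taken from the thread.
Treating "a month" as 30 days and counting only raw payload bytes (no
replication, compression, or on-disk overhead) are assumptions of this sketch,
not something the thread states.

```python
# Back-of-envelope check of the retention estimate quoted in the thread.
EVENTS_PER_SECOND = 273_000  # sustained publish rate reported in the thread
PAYLOAD_BYTES = 2048         # fixed event payload size reported in the thread
RETENTION_DAYS = 30          # assumption: "a month" taken as 30 days
SECONDS_PER_DAY = 86_400


def bytes_per_second(events_per_second, payload_bytes):
    """Raw payload bytes published per second."""
    return events_per_second * payload_bytes


def retained_bytes(events_per_second, payload_bytes, days):
    """Raw payload bytes accumulated over a retention window of `days` days."""
    return bytes_per_second(events_per_second, payload_bytes) * SECONDS_PER_DAY * days


rate = bytes_per_second(EVENTS_PER_SECOND, PAYLOAD_BYTES)
total = retained_bytes(EVENTS_PER_SECOND, PAYLOAD_BYTES, RETENTION_DAYS)

print(f"aggregate publish rate: {rate / 2**20:.1f} MiB/s")
print(f"retained over {RETENTION_DAYS} days: "
      f"{total / 2**50:.2f} PiB ({total / 10**15:.2f} PB decimal)")
```

Under these assumptions the total comes out near 1.3 when expressed in binary
petabytes (PiB) and about 1.45 in decimal petabytes, which is consistent with
the figure quoted above.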

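The point in the original post that the flush settings are tied to payload
size, and the question about 20-30 KB messages, can be illustrated with a
small Python sketch. It assumes that log.flush.interval is a message count and
log.default.flush.interval.ms is a time limit in milliseconds, both applied to
a single log. The function names and the specific values (10,000 messages,
1,000 ms, 5,000 messages per second) are hypothetical, chosen only for
illustration, and are not recommendations from the thread.

```python
# Sketch: how message size changes what a count-based flush interval means.
# Assumption: a flush happens after N messages or after T milliseconds,
# whichever comes first, for one log.


def unflushed_bytes(flush_interval_messages, message_bytes):
    """Upper bound on payload bytes accumulated before a count-based flush."""
    return flush_interval_messages * message_bytes


def first_trigger(messages_per_second, flush_interval_messages, flush_interval_ms):
    """Return which limit is reached first at a steady message rate."""
    ms_to_reach_count = flush_interval_messages / messages_per_second * 1000
    return "count" if ms_to_reach_count <= flush_interval_ms else "time"


FLUSH_INTERVAL_MESSAGES = 10_000  # hypothetical value for log.flush.interval
FLUSH_INTERVAL_MS = 1_000         # hypothetical value for log.default.flush.interval.ms

for size in (2048, 20 * 1024, 30 * 1024):
    pending = unflushed_bytes(FLUSH_INTERVAL_MESSAGES, size)
    print(f"{size:>6} byte messages: up to {pending / 2**20:.1f} MiB between flushes")

print("first limit hit at 5,000 msg/s:",
      first_trigger(5_000, FLUSH_INTERVAL_MESSAGES, FLUSH_INTERVAL_MS))
```

The same message-count setting covers roughly ten to fifteen times more data
per flush at 20-30 KB than at 2 KB, which is one reason the two knobs need to
be revisited whenever the payload size changes.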