Is the code you used to benchmark open source by any chance?

On Tue, May 28, 2013 at 4:29 PM, Jason Weiss <jason_we...@rapid7.com> wrote:

> Nope, sorry.
>
>
> ________________________________________
> From: S Ahmed [sahmed1...@gmail.com]
> Sent: Tuesday, May 28, 2013 15:47
> To: users@kafka.apache.org
> Subject: Re: Apache Kafka in AWS
>
> Curious if you tested with larger message sizes, like around 20-30kb (you
> mentioned 2kb).
>
> Any numbers on that size?
>
>
> On Thu, May 23, 2013 at 10:12 AM, Jason Weiss <jason_we...@rapid7.com> wrote:
>
> > Bummer.
> >
> > Yes, but it will be several days. I'll post back to the forum with a URL
> > once I'm done.
> >
> > Jason
> >
> >
> >
> > On 5/23/13 10:11 AM, "Jun Rao" <jun...@gmail.com> wrote:
> >
> > >Jason,
> > >
> > >Unfortunately, Apache mailing lists don't support attachments. Could you
> > >document your experience (with the graphs) in a blog (or a wiki page in
> > >Kafka)?
> > >
> > >Thanks,
> > >
> > >Jun
> > >
> > >
> > >On Thu, May 23, 2013 at 2:00 AM, Jason Weiss <jason_we...@rapid7.com>
> > >wrote:
> > >
> > >> Jun,
> > >>
> > >> Here is a screenshot from AWS's statistics (per-minute sampling is the
> > >> finest granularity I believe they chart). I don't have a screenshot of
> > >> the top output.
> > >>
> > >> This shows that when I added a 4th machine to the cluster with the same
> > >> number of clients, my CPU utilization fell but remained constant. The
> > >> flatline is pretty obvious in the extended 4-minute test: it ramps up,
> > >> flatlines, then ramps down.
> > >>
> > >> Jason
> > >>
> > >> ________________________________________
> > >> From: Jun Rao [jun...@gmail.com]
> > >> Sent: Thursday, May 23, 2013 00:17
> > >> To: users@kafka.apache.org
> > >> Subject: Re: Apache Kafka in AWS
> > >>
> > >> Jason,
> > >>
> > >> Thanks for sharing. This is very interesting. Normally, Kafka brokers
> > >> don't use too much CPU. Is most of the 750% CPU actually used by the
> > >> Kafka brokers?
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Wed, May 22, 2013 at 6:11 PM, Jason Weiss <jason_we...@rapid7.com>
> > >> wrote:
> > >>
> > >> > >>Did you check that you were using all cores?
> > >> >
> > >> > top was reporting over 750%
> > >> >
> > >> > Jason
> > >> >
> > >> > ________________________________________
> > >> > From: Ken Krugler [kkrugler_li...@transpac.com]
> > >> > Sent: Wednesday, May 22, 2013 20:59
> > >> > To: users@kafka.apache.org
> > >> > Subject: Re: Apache Kafka in AWS
> > >> >
> > >> > Hi Jason,
> > >> >
> > >> > On May 22, 2013, at 3:35pm, Jason Weiss wrote:
> > >> >
> > >> > > Ken,
> > >> > >
> > >> > > Great question! I should have indicated I was using EBS, 500GB with
> > >> > > 2000 provisioned IOPS.
> > >> >
> > >> > OK, thanks. Sounds like you were pegged on CPU usage.
> > >> >
> > >> > But that does surprise me a bit. Did you check that you were using all
> > >> > cores?
> > >> >
> > >> > Thanks,
> > >> >
> > >> > -- Ken
> > >> >
> > >> > PS - back in 2006 I spent a week of hell debugging an occasional job
> > >> > failure on Hadoop (this is when it was still part of Nutch). Turns out
> > >> > one of our 12 slaves was accidentally using OpenJDK, and this had a JIT
> > >> > compiler bug that would occasionally rear its ugly head. Obviously the
> > >> > Sun/Oracle JRE isn't bug-free, but it gets a lot more stress testing.
> > >> > So one of my basic guidelines in the ops portion of the Hadoop class I
> > >> > teach is that every server must have exactly the same version of
> > >> > Oracle's JRE.
> > >> >
> > >> > > ________________________________________
> > >> > > From: Ken Krugler [kkrugler_li...@transpac.com]
> > >> > > Sent: Wednesday, May 22, 2013 17:23
> > >> > > To: users@kafka.apache.org
> > >> > > Subject: Re: Apache Kafka in AWS
> > >> > >
> > >> > > Hi Jason,
> > >> > >
> > >> > > Thanks for the notes.
> > >> > >
> > >> > > I'm curious whether you went with using local drives (ephemeral
> > >> > > storage) or EBS, and if with EBS then what IOPS.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > -- Ken
> > >> > >
> > >> > > On May 22, 2013, at 1:42pm, Jason Weiss wrote:
> > >> > >
> > >> > >> All,
> > >> > >>
> > >> > >> I asked a number of questions of the group over the last week, and
> > >> > >> I'm happy to report that I've had great success getting Kafka up and
> > >> > >> running in AWS. I am using 3 EC2 instances, each of which is an M2
> > >> > >> High-Memory Quadruple Extra Large with 8 cores and 58.4 GiB of memory
> > >> > >> according to the AWS specs. I have co-located ZooKeeper instances
> > >> > >> next to Kafka on each machine.
> > >> > >>
> > >> > >> I am able to repeatably publish 273,000 events per second, with each
> > >> > >> event payload consisting of a fixed size of 2048 bytes! This
> > >> > >> represents the maximum throughput possible on this configuration, as
> > >> > >> the servers became CPU-constrained, averaging 97% utilization in a
> > >> > >> relatively flat line. This isn't a "burst" speed - it represents a
> > >> > >> sustained throughput from 20 M1 Large EC2 Kafka multi-threaded
> > >> > >> producers. Putting this into perspective, if my log retention period
> > >> > >> were a month, I'd be aggregating 1.3 petabytes of data on my disk
> > >> > >> drives. Suffice to say, I don't see us retaining data for more than a
> > >> > >> few hours!
> > >> > >>
> > >> > >> Here were the keys to tuning for future folks to consider:
> > >> > >>
> > >> > >> First and foremost, be sure to configure your Java heap size
> > >> > >> accordingly when you launch Kafka. The default is something like
> > >> > >> 512MB, which in my case left virtually all of my RAM inaccessible to
> > >> > >> Kafka.
> > >> > >>
> > >> > >> Second, stay away from OpenJDK. No, seriously - this was a huge thorn
> > >> > >> in my side, and I almost gave up on Kafka because of the problems I
> > >> > >> encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> > >> > >> crashing and burning in dramatic fashion. The moment I switched over
> > >> > >> to Oracle's JDK for Linux, Kafka didn't puke once - I mean, like not
> > >> > >> even a hiccup.
> > >> > >>
> > >> > >> Third, know your message size. In my opinion, the more you understand
> > >> > >> about your event payload characteristics, the better you can tune the
> > >> > >> system. The two knobs to really turn are log.flush.interval and
> > >> > >> log.default.flush.interval.ms. The values here are intrinsically
> > >> > >> connected to the types of payloads you are putting through the
> > >> > >> system.
> > >> > >>
> > >> > >> Fourth and finally, to maximize throughput you have to code against
> > >> > >> the async paradigm, and be prepared to tweak the batch size, queue
> > >> > >> properties, and compression codec (wait for it...) in a way that
> > >> > >> matches the message payload you are putting through the system and
> > >> > >> the capabilities of the producer system itself.
> > >> > >>
> > >> > >>
> > >> > >> Jason
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >>
> > >> > >> This electronic message contains information which may be
> > >>confidential
> > >> > or privileged. The information is intended for the use of the
> > >>individual
> > >> or
> > >> > entity named above. If you are not the intended recipient, be aware
> > >>that
> > >> > any disclosure, copying, distribution or use of the contents of this
> > >> > information is prohibited. If you have received this electronic
> > >> > transmission in error, please notify us by e-mail at (
> > >> > postmas...@rapid7.com) immediately.
> > >> > >
> > >> > > --------------------------
> > >> > > Ken Krugler
> > >> > > +1 530-210-6378
> > >> > > http://www.scaleunlimited.com
> > >> > > custom big data solutions & training
> > >> > > Hadoop, Cascading, Cassandra & Solr
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> >
> > >> > --------------------------
> > >> > Ken Krugler
> > >> > +1 530-210-6378
> > >> > http://www.scaleunlimited.com
> > >> > custom big data solutions & training
> > >> > Hadoop, Cascading, Cassandra & Solr
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >>
> >
> >
> >
>
>
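For anyone wanting to sanity-check the figures in Jason's original post, here is a minimal sketch of the arithmetic in Python. The only inputs taken from the thread are 273,000 events per second and 2048 bytes per event; the 30-day month and the function name are my own assumptions. It counts payload bytes only and ignores replication, per-message overhead, and compression.

```python
EVENTS_PER_SEC = 273_000      # sustained publish rate reported in the thread
PAYLOAD_BYTES = 2048          # fixed event payload size reported in the thread
SECONDS_PER_DAY = 86_400


def retained_bytes(days: int) -> int:
    """Payload bytes accumulated over `days` of sustained publishing."""
    return EVENTS_PER_SEC * PAYLOAD_BYTES * SECONDS_PER_DAY * days


bytes_per_sec = EVENTS_PER_SEC * PAYLOAD_BYTES
month = retained_bytes(30)    # assumption: a 30-day retention "month"

print(f"{bytes_per_sec / 1e6:.0f} MB/s aggregate across the cluster")
print(f"{month / 1e15:.2f} PB (decimal) per 30 days")
print(f"{month / 2**50:.2f} PiB (binary) per 30 days")
```

With these inputs the 30-day total is roughly 1.45 PB in decimal units, or about 1.29 PiB in binary units, which lines up with the 1.3 petabytes quoted.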

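On the third tuning point (know your message size), here is a rough way to see why the flush settings depend on payload size. This sketch assumes log.flush.interval is a count of messages accumulated before a flush, which is how I understand the broker setting of that era; the interval value and the helper name below are purely illustrative and are not recommendations from the thread.

```python
def bytes_per_flush(flush_interval_msgs: int, payload_bytes: int) -> int:
    """Approximate payload bytes written between two message-count flushes."""
    return flush_interval_msgs * payload_bytes


# Hypothetical flush interval (in messages), not a value from the thread.
FLUSH_INTERVAL_MSGS = 10_000

# 2048 B is the payload from the benchmark; 20 KiB and 30 KiB are the
# larger sizes asked about later in the thread.
for payload in (2048, 20 * 1024, 30 * 1024):
    mib = bytes_per_flush(FLUSH_INTERVAL_MSGS, payload) / 2**20
    print(f"{payload:>6} B payload -> ~{mib:.1f} MiB per flush")
```

The point is only that a message-count interval tuned for 2 KB events covers ten times as much data per flush at 20 KB, so the setting probably needs revisiting whenever the payload size changes.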