Thanks. FWIW this one has been fine so far

java version "1.7.0_13"
OpenJDK Runtime Environment (IcedTea7 2.3.6) (Ubuntu build 1.7.0_13-b20)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

though not running at the load in your tests.

On Wed, May 22, 2013 at 4:51 PM, Jason Weiss <jason_we...@rapid7.com> wrote:

> [ec2-user@ip-10-194-5-76 ~]$ java -version
> java version "1.6.0_24"
> OpenJDK Runtime Environment (IcedTea6 1.11.11)
> (amazon-61.1.11.11.53.amzn1-x86_64)
> OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
>
> Yes, as soon as I put it under heavy load, it would buckle almost
> consistently. I knew it was JDK related because I temporarily gave up on
> AWS, but I was able to run the same code on my MacBook Pro without issue.
> That's when I upgraded AWS to Oracle Java 7 64-bit and all my crashes
> disappeared under load.
>
> Jason
>
> ________________________________________
> From: Scott Clasen [sc...@heroku.com]
> Sent: Wednesday, May 22, 2013 19:27
> To: users
> Subject: Re: Apache Kafka in AWS
>
> Hey Jason,
>
> Question: what OpenJDK version did you have issues with? I'm running Kafka
> on it now and it has been OK. Was it a crash only at load?
>
> Thanks
> SC
>
> On Wed, May 22, 2013 at 1:42 PM, Jason Weiss <jason_we...@rapid7.com> wrote:
>
> > All,
> >
> > I asked a number of questions of the group over the last week, and I'm
> > happy to report that I've had great success getting Kafka up and running
> > in AWS. I am using 3 EC2 instances, each of which is a M2 High-Memory
> > Quadruple Extra Large with 8 cores and 58.4 GiB of memory according to
> > the AWS specs. I have co-located Zookeeper instances next to Kafka on
> > each machine.
> >
> > I am able to publish in a repeatable fashion 273,000 events per second,
> > with each event payload consisting of a fixed size of 2048 bytes! This
> > represents the maximum throughput possible on this configuration, as the
> > servers became CPU constrained, averaging 97% utilization in a relatively
> > flat line. This isn't a "burst" speed – it represents a sustained
> > throughput from 20 M1 Large EC2 Kafka multi-threaded producers.
> > Putting this into perspective, if my log retention period was a month,
> > I'd be aggregating 1.3 petabytes of data on my disk drives. Suffice to
> > say, I don't see us retaining data for more than a few hours!
> >
> > Here were the keys to tuning for future folks to consider:
> >
> > First and foremost, be sure to configure your Java heap size accordingly
> > when you launch Kafka. The default is like 512MB, which in my case left
> > virtually all of my RAM inaccessible to Kafka.
> >
> > Second, stay away from OpenJDK. No, seriously – this was a huge thorn in
> > my side, and I almost gave up on Kafka because of the problems I
> > encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> > crashing and burning in dramatic fashion. The moment I switched over to
> > Oracle's JDK for Linux, Kafka didn't puke once - I mean, like not even a
> > hiccup.
> >
> > Third, know your message size. In my opinion, the more you understand
> > about your event payload characteristics, the better you can tune the
> > system. The two knobs to really turn are the log.flush.interval and
> > log.default.flush.interval.ms. The values here are intrinsically
> > connected to the types of payloads you are putting through the system.
> >
> > Fourth and finally, to maximize throughput you have to code against the
> > async paradigm, and be prepared to tweak the batch size, queue
> > properties, and compression codec (wait for it…) in a way that matches
> > the message payload you are putting through the system and the
> > capabilities of the producer system itself.
> >
> > Jason
> >
> > This electronic message contains information which may be confidential or
> > privileged. The information is intended for the use of the individual or
> > entity named above. If you are not the intended recipient, be aware that
> > any disclosure, copying, distribution or use of the contents of this
> > information is prohibited.
> > If you have received this electronic transmission in error, please
> > notify us by e-mail at (postmas...@rapid7.com) immediately.
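
For anyone wanting to sanity-check the retention estimate in the original message, here is a minimal sketch of the arithmetic. It uses only the two figures quoted in the thread (273,000 events per second, 2048-byte payloads). The 30-day month is an assumption, as are the function names, and the calculation counts payload bytes only: it ignores message framing, replication, and compression, none of which are quantified in the thread.

```python
# Back-of-the-envelope check of the retention figure quoted in the thread.
# Figures taken from the original message:
EVENTS_PER_SECOND = 273_000
PAYLOAD_BYTES = 2048

# Assumption (not stated in the thread): a "month" is 30 days.
RETENTION_DAYS = 30
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(events_per_second: int, payload_bytes: int) -> int:
    """Aggregate payload ingest rate in bytes per second."""
    return events_per_second * payload_bytes


def retained_bytes(events_per_second: int, payload_bytes: int, days: int) -> int:
    """Payload bytes accumulated over `days` at a sustained rate."""
    return bytes_per_second(events_per_second, payload_bytes) * SECONDS_PER_DAY * days


if __name__ == "__main__":
    rate = bytes_per_second(EVENTS_PER_SECOND, PAYLOAD_BYTES)
    total = retained_bytes(EVENTS_PER_SECOND, PAYLOAD_BYTES, RETENTION_DAYS)
    print(f"ingest rate: {rate / 1e6:.1f} MB/s")
    print(f"{RETENTION_DAYS}-day retention: {total / 1e15:.2f} PB ({total / 2**50:.2f} PiB)")
```

Under those assumptions the sustained rate is roughly 559 MB/s, and 30 days of retention comes to about 1.45 PB in decimal units, or about 1.29 PiB in binary units, which is consistent with the "1.3 petabytes" quoted above.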