Piotr,

Thanks for your posts. After your comments about the need for "spooling", I just found this link:
http://grokbase.com/t/kafka/dev/133939nbvg/jira-commented-kafka-156-messages-should-not-be-dropped-when-brokers-are-unavailable

You are right that we need to build the spooling system ourselves until the above issue is fixed. I wonder if everybody is building this spooling system themselves? I just pushed Kafka to our production system for "non-critical" purposes, but eventually I want to make sure Kafka will not lose messages. Does anyone have ideas to share on how to build this spooling system?

Wing

On Thu, Apr 11, 2013 at 4:10 AM, Piotr Kozikowski <pi...@liveramp.com> wrote:

> Otis,
>
> That's actually a question we are trying to answer. In our current
> production system, Scribe does spooling to local disk, so each producer
> node becomes a local broker until the actual brokers are able to receive
> all messages again. It looks like unless a similar feature is added to
> Kafka we will have to come up with our own spooling system.
>
> -Piotr
>
> On Wed, Apr 10, 2013 at 12:04 PM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
>
> > Hi,
> >
> > Is there anything one can do to "defend" from:
> >
> > "Trying to push more data than the brokers can handle for any sustained
> > period of time has catastrophic consequences, regardless of what timeout
> > settings are used. In our use case this means that we need to either ensure
> > we have spare capacity for spikes, or use something on top of Kafka to
> > absorb spikes."
> >
> > ?
> > Thanks,
> > Otis
> > ----
> > Performance Monitoring for Solr / ElasticSearch / HBase -
> > http://sematext.com/spm
> >
> > >________________________________
> > > From: Piotr Kozikowski <pi...@liveramp.com>
> > >To: users@kafka.apache.org
> > >Sent: Tuesday, April 9, 2013 1:23 PM
> > >Subject: Re: Analysis of producer performance
> > >
> > >Jun,
> > >
> > >Thank you for your comments. I'll reply point by point for clarity.
> > >
> > >1. We were aware of the migration tool but since we haven't used Kafka for
> > >production yet we just started using the 0.8 version directly.
> > >
> > >2. I hadn't seen those particular slides, very interesting. I'm not sure
> > >we're testing the same thing though. In our case we vary the number of
> > >physical machines, but each one has 10 threads accessing a pool of Kafka
> > >producer objects and in theory a single machine is enough to saturate the
> > >brokers (which our test mostly confirms). Also, assuming that the slides
> > >are based on the built-in producer performance tool, I know that we started
> > >getting very different numbers once we switched to use "real" (actual
> > >production log) messages. Compression may also be a factor in case it
> > >wasn't configured the same way in those tests.
> > >
> > >3. In the latency section, there are two tests, one for average and another
> > >for maximum latency. Each one has two graphs presenting the exact same data
> > >but at different levels of zoom. The first one is to observe small
> > >variations of latency when target throughput <= actual throughput. The
> > >second is to observe the overall shape of the graph once latency starts
> > >growing when target throughput > actual throughput. I hope that makes sense.
> > >
> > >4. That sounds great, looking forward to it.
> > >
> > >Piotr
> > >
> > >On Mon, Apr 8, 2013 at 9:48 PM, Jun Rao <jun...@gmail.com> wrote:
> > >
> > >> Piotr,
> > >>
> > >> Thanks for sharing this. Very interesting and useful study. A few comments:
> > >>
> > >> 1. For existing 0.7 users, we have a migration tool that mirrors data from
> > >> an 0.7 cluster to an 0.8 cluster. Applications can upgrade to 0.8 by
> > >> upgrading consumers first, followed by producers.
> > >>
> > >> 2. Have you looked at the Kafka ApacheCon slides (
> > >> http://www.slideshare.net/junrao/kafka-replication-apachecon2013)? Towards
> > >> the end, there are some performance numbers too. The figure for throughput
> > >> vs #producer is different from what you have. Not sure if this is because
> > >> you have turned on compression.
> > >>
> > >> 3. Not sure that I understand the difference btw the first 2 graphs in the
> > >> latency section. What's different btw the 2 tests?
> > >>
> > >> 4. Post 0.8, we plan to improve the producer side throughput by
> > >> implementing non-blocking socket on the client side.
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Mon, Apr 8, 2013 at 4:42 PM, Piotr Kozikowski <pi...@liveramp.com>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > At LiveRamp we are considering replacing Scribe with Kafka, and as a first
> > >> > step we ran some tests to evaluate producer performance. You can find our
> > >> > preliminary results here:
> > >> > https://blog.liveramp.com/2013/04/08/kafka-0-8-producer-performance-2/. We
> > >> > hope this will be useful for some folks, and if anyone has comments or
> > >> > suggestions about what to do differently to obtain better results your
> > >> > feedback will be very welcome.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Piotr
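
To make the spooling idea in this thread concrete (buffer messages on local disk while the brokers are unavailable, then replay them once the brokers recover), here is a minimal sketch in Python. It is not how Scribe or Kafka implement it. `SpoolingProducer`, `send_fn`, and the spool file layout are all hypothetical names chosen for illustration, and `send_fn` stands in for whatever real producer call you use, assumed to raise an exception when a send fails.

```python
import json
import os


class SpoolingProducer:
    """Hypothetical wrapper that spools messages to local disk on send failure."""

    def __init__(self, send_fn, spool_path):
        # send_fn is an assumed callable that raises when the brokers cannot
        # accept a message; it is a placeholder, not a real Kafka API.
        self.send_fn = send_fn
        self.spool_path = spool_path

    def _has_backlog(self):
        return (os.path.exists(self.spool_path)
                and os.path.getsize(self.spool_path) > 0)

    def _spool(self, message):
        # Append one JSON document per line and force it to disk.
        with open(self.spool_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(message) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def send(self, message):
        """Return True if sent directly, False if the message was spooled."""
        if self._has_backlog():
            # Keep ordering: never overtake messages that are already spooled.
            self._spool(message)
            return False
        try:
            self.send_fn(message)
            return True
        except Exception:
            self._spool(message)
            return False

    def replay(self):
        """Resend spooled messages in order; return how many were sent."""
        if not self._has_backlog():
            return 0
        with open(self.spool_path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]
        sent = 0
        for message in pending:
            try:
                self.send_fn(message)
            except Exception:
                break  # brokers still unavailable; keep the rest spooled
            sent += 1
        # Rewrite the spool with only the unsent messages, then swap it in.
        tmp_path = self.spool_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            for message in pending[sent:]:
                f.write(json.dumps(message) + "\n")
        os.replace(tmp_path, self.spool_path)
        return sent
```

A caller would invoke `replay()` periodically (for example from a timer) and use `send()` everywhere else. Note that this sketch gives at-least-once delivery at best: if the process dies after a message is resent but before the spool file is rewritten, that message is sent again on the next replay. It also does nothing about disk space limits or concurrent writers.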