I think currently we do a little over 200 billion events per day at LinkedIn, though we are not actually the largest Kafka user any more.
On the whole, scaling message volume is actually not that hard in Kafka. Data is partitioned, and partitions don't really communicate with each other, so adding more machines adds more capacity; there really aren't a ton of gotchas. The operations section of the wiki has some tips on performance tuning.

I recommend running the performance test commands described in this post on your own gear to get a feeling for how much hardware you need (a rough sketch of those commands is at the bottom of this message, after the quote):
http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

-Jay

On Thu, Jun 26, 2014 at 1:29 PM, Zack Payton <zpay...@gmail.com> wrote:
> Hi there,
>
> There have been some internal debates here about how far we can scale
> Kafka. Ideally, we'd be able to make it scale to 90 billion events a day.
> I've seen somewhere that LinkedIn scaled it up to 40 billion events a day.
> Has anyone seen a hard plateau in terms of scalability? Does anyone have
> any advice for tweaking configs to achieve ultra-high performance?
>
> Thanks,
> Z
>
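P.S. For reference, something along these lines is what a benchmark run with the bundled perf tools looks like. This is only a sketch: the topic name, broker host, partition count, and record counts are placeholders, and the exact flag names differ between Kafka releases, so check the scripts in bin/ for your version.

    # Create a topic with enough partitions to spread load across brokers
    bin/kafka-topics.sh --create --topic perf-test \
      --partitions 24 --replication-factor 3 \
      --bootstrap-server broker1:9092

    # Drive write load with the bundled producer perf tool
    bin/kafka-producer-perf-test.sh --topic perf-test \
      --num-records 50000000 --record-size 100 --throughput -1 \
      --producer-props bootstrap.servers=broker1:9092 acks=1

    # Measure consumption throughput
    bin/kafka-consumer-perf-test.sh --topic perf-test \
      --messages 50000000 --bootstrap-server broker1:9092

Keep increasing the record count, partition count, and number of producer machines until you see where your hardware tops out; since partitions are independent, throughput should scale roughly with the number of brokers you add.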