We also went through the same decision-making process, and our arguments for Kafka were along the same lines as those Jonathan mentioned. The fact that we have heterogeneous consumers was really the deciding factor. Our requirements were to avoid losing messages at all costs while having multiple consumers read the same data at different paces. On one side, we have a few consumers fed with data from most, if not all, topics. On the other, we have a good number of consumers each reading from a single topic. The big consumers can take their time, while the smaller ones mostly serve near-real-time events, so they need to keep up with the pace of incoming messages.
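For what it's worth, the consumer side of this is tiny in Kafka. Here is a minimal sketch with the kafka-python client (the topic, group, and broker names are invented for illustration, this isn't our production code): each consumer group tracks its own offset in the shared log, so the slow readers and the fast ones never get in each other's way.

# Minimal sketch: each consumer group keeps its own position in the
# shared log, so a slow reader never blocks a fast one.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'events',                          # hypothetical topic name
    group_id='realtime-dashboard',     # each group tracks its own offset
    bootstrap_servers=['broker1:9092'],
    auto_offset_reset='earliest',      # start from the oldest retained data
)

for message in consumer:
    # The big batch reader runs the same code in another process with a
    # different group_id; it can lag for hours with zero data duplication
    # on the brokers, since only its stored offset differs.
    print(message.topic, message.offset, message.value)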
RabbitMQ stores data on disk only if you tell it to, while Kafka persists by design. From the beginning, we decided we would try to use the queues the same way in both systems: pub/sub with a routing key (an exchange in RabbitMQ) or a topic, persisted to disk and replicated.

One of our scenarios was to see how the system would cope with the largest consumer down for a while, forcing the brokers to keep the data for a long period. In the case of RabbitMQ, this consumer has its own queue and the data grows on disk, which is not really a problem if you plan for it. But since RabbitMQ has to keep track of all messages read, the Mnesia database it uses as the message index also grows pretty big. At that point, the amount of RAM necessary to keep the level of performance we need becomes very large. In our tests, we found that this had an adverse effect on ALL the brokers, thus affecting all consumers. You can always say that you'll monitor the consumers to make sure it won't happen, and that's fine if you can; I wasn't ready to make that bet.

Another point: since we wanted to use pub/sub with an exchange in RabbitMQ, we would have ended up with a lot of data duplication, because a message read by multiple consumers gets duplicated into the queue of each of those consumers. Kafka wins on that side too, since every consumer reads from the same source (see the pika sketch at the bottom of this mail).

The downsides of Kafka were the language issue (we are mostly using Python and C#, and 0.8 is so new that few drivers are available at this point) and the fact that we will have to work to get as close as possible to a once-and-only-once guarantee. Those are the two areas where RabbitMQ would have given us less work out of the box. RabbitMQ also provides a bunch of tools that make it rather attractive. In the end, throughput numbers are a pretty nifty thing to look at, but being sure that I'll be able to manage the beast as it grows will let me sleep far more easily.

On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges <hodg...@gmail.com> wrote:

> We just went through a similar exercise with RabbitMQ at our company with
> streaming activity data from our various web properties. Our use case
> requires consumption of this stream by many heterogeneous consumers,
> including batch (Hadoop) and real-time (Storm). We pointed out that Kafka
> acts as a configurable rolling window of time on the activity stream. The
> window default is 7 days, which allows clients of different latencies,
> like Hadoop and Storm, to read from the same stream.
>
> We pointed out that the Kafka brokers don't need to maintain consumer
> state in the stream and only have to maintain one copy of the stream to
> support N consumers. Rabbit brokers, on the other hand, have to maintain
> the state of each consumer as well as create a copy of the stream for
> each consumer. In our scenario we have 10-20 consumers, and with the
> scale and throughput of the activity stream we were able to show that
> Rabbit quickly becomes the bottleneck under load.
>
> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <dragos.manole...@servicenow.com> wrote:
>
> > Hi --
> >
> > I am preparing to make a case for using Kafka instead of Rabbit MQ as a
> > broker-based messaging provider. The context is similar to that of the
> > Kafka papers and user stories: the producers publish monitoring data and
> > logs, and a suite of subscribers consume this data (some store it,
> > others perform computations on the event stream). The requirements are
> > typical of this context: low latency, high throughput, the ability to
> > deal with bursts and to operate in/across multiple data centers, etc.
> >
> > I am familiar with the performance comparison between Kafka, Rabbit MQ
> > and Active MQ from the NetDB 2011 paper
> > <http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf>.
> > However, in the two years that have passed since then the number of
> > production Kafka installations has increased, and people are using it
> > in different ways than those imagined by Kafka's designers. In light of
> > these experiences one can use more data points and color when
> > contrasting it with Rabbit MQ (which by the way has also evolved since
> > 2011). (And FWIW I know I am not the first one to walk this path; see
> > for example last year's OSCON session on the State of MQ
> > <http://lanyrd.com/2012/oscon/swrcz/>.)
> >
> > I would appreciate it if you could share measurements, results, or even
> > anecdotal evidence along these lines. How have you avoided the "let's
> > use Rabbit MQ because everybody else does it" route when solving
> > problems for which Kafka is a better fit?
> >
> > Thanks,
> >
> > -Dragos
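P.S. To put the duplication point above in concrete terms, here is a rough sketch with the pika client; the exchange, queue, and routing-key names are invented for illustration. Every queue bound to the exchange receives its own persistent copy of each matching message, so N consumers means N copies on disk (plus N sets of index entries for Mnesia to track), where Kafka would keep one shared log.

# Rough sketch: pub/sub through an exchange means one queue per consumer,
# and every matching message is copied into every bound queue.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
ch = conn.channel()

ch.exchange_declare(exchange='events', exchange_type='topic', durable=True)

# One durable queue per consumer, all bound to the same routing pattern.
for queue in ('big-batch-reader', 'realtime-a', 'realtime-b'):
    ch.queue_declare(queue=queue, durable=True)
    ch.queue_bind(queue=queue, exchange='events', routing_key='metrics.#')

# A single persistent publish now lands in all three queues: three copies
# on disk instead of Kafka's one log entry shared by all readers.
ch.basic_publish(
    exchange='events',
    routing_key='metrics.cpu',
    body=b'one message',
    properties=pika.BasicProperties(delivery_mode=2),  # persist to disk
)
conn.close()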