Hi, Alex,

Thanks for sharing your thoughts here. Your understanding of Kafka is correct and your analysis is very helpful. Just a couple of follow-up questions on RabbitMQ.
1. How do people typically scale out RabbitMQ? In Kafka, we have this notion of a cluster that can include multiple brokers. Brokers in a cluster are managed through Zookeeper.

2. How does RabbitMQ support HA? In the upcoming Kafka 0.8 release, we are adding the capability of storing a message redundantly in multiple brokers in a cluster for achieving both higher durability and availability. I am wondering if RabbitMQ is doing something similar already.

Jun

On Fri, Jun 7, 2013 at 5:54 AM, Alexis Richardson <ale...@rabbitmq.com> wrote:

> Hi
>
> Alexis from Rabbit here. I hope I am not intruding!
>
> It would be super helpful if people with questions, observations or moans posted them to the rabbitmq list too :-)
>
> A few comments:
>
> * Along with ZeroMQ, I consider Kafka to be one of the interesting and useful messaging projects out there. In a world of cruft, Kafka is cool!
>
> * This is because both projects come at messaging from a specific point of view that is *different* from Rabbit. OTOH, many other projects exist that replicate Rabbit features for fun, or NIH, or due to misunderstanding the semantics (yes, our docs could be better)
>
> * It is striking how few people describe those differences. In a nutshell they are as follows:
>
> *** Kafka writes all incoming data to disk immediately, and then figures out who sees what. So it is much more like a database than Rabbit, in that new consumers can appear well after the disk write and still subscribe to past messages. Rabbit, by contrast, tries to deliver to consumers and buffers otherwise. Persistence is optional but robust and a feature of the buffer ("queue") not the upstream machinery. Rabbit is able to cache-on-arrival via a plugin, but this is a design overlay and not particularly optimal.
>
> *** Kafka is a client server system with end to end semantics. It defines order to include processing order, and keeps state on the client to do this.
> Group management is via a 3rd party service (Zookeeper? I forget which). Rabbit is a server-only protocol based system which maintains order on the server and through completely language neutral protocol semantics. This makes Rabbit perhaps more natural as a 'messaging service', e.g. for integration and other inter-app data transfer.
>
> *** Rabbit is a general purpose messaging system with extras like federation. It speaks many protocols, and has core features like HA, transactions, management, etc. Everything can be switched on or off. Getting all this to work while keeping the install light and fast is quite fiddly. Kafka by contrast comes from a specific set of use cases, which are interesting certainly. I am not sure if Kafka wants to be a general purpose messaging system, but it will become a bit more like Rabbit if that is the goal.
>
> *** Both approaches have costs. In the case of Rabbit the cost is that more metadata is stored on the broker. Kafka can get performance gains by storing less such data. But we are talking about some N thousands of MPS versus some M thousands. At those speeds the clients are usually the bottleneck anyway.
>
> * Let me also clarify some things:
>
> *** Rabbit does NOT store multiple copies of the same message across queues, unless they are very small (<60b, iirc). A message delivered to >1 queue on 1 machine is stored once. Metadata about that message may be stored more than once, but, at scale, the big cost is the payload.
>
> *** Rabbit's vanilla install does store some index data in memory when messages flow to disk. You can change this by using a plugin, but this is a secret-menu undocumented feature. Very very few people need any such thing.
>
> *** A Rabbit queue is lightweight. It's just an ordered consumption buffer that can persist and ack. Don't assume things about Rabbit queues based on what you know about IBM MQ, JMS, and so forth.
> Queues in Rabbit and Kafka are not the same.
>
> *** Rabbit does not use mnesia for message storage. It has its own DB, optimised for messaging. You can use other DBs but this is Complicated.
>
> *** Rabbit does all kinds of batching and bulk processing, and can batch end to end. If you see claims about batching, buffering, etc., find out ALL the details before drawing conclusions.
>
> I hope this is helpful.
>
> Keen to get feedback / questions / corrections.
>
> alexis
>
> On Fri, Jun 7, 2013 at 2:09 AM, Marc Labbe <mrla...@gmail.com> wrote:
> > We also went through the same decision making and our arguments for Kafka were along the same lines as those Jonathan mentioned. The fact that we have heterogeneous consumers is really a deciding factor. Our requirements were to avoid losing messages at all costs while having multiple consumers reading the same data at a different pace. On one side, we have a few consumers being fed with data coming in from most, if not all, topics. On the other side, we have a good bunch of consumers reading only from a single topic. The big guys can take their time to read while the smaller ones are mostly for near real-time events so they need to keep up with the pace of incoming messages.
> >
> > RabbitMQ stores data on disk only if you tell it to while Kafka persists by design. From the beginning, we decided we would try to use the queues the same way, pub/sub with a routing key (an exchange in RabbitMQ) or topic, persisted to disk and replicated.
> >
> > One of our scenarios was to see how the system would cope with the largest consumer down for a while, therefore forcing the brokers to keep the data for a long period. In the case of RabbitMQ, this consumer has its own queue and data grows on disk, which is not really a problem if you plan accordingly.
> > But, since it has to keep track of all messages read, the Mnesia database used by RabbitMQ as the message index also grows pretty big. At that point, the amount of RAM necessary becomes very large to keep the level of performance we need. In our tests, we found that this had an adverse effect on ALL the brokers, thus affecting all consumers. You can always say that you'll monitor the consumers to make sure it won't happen. That's a good thing if you can. I wasn't ready to make that bet.
> >
> > Another point is the fact that, since we wanted to use pub/sub with an exchange in RabbitMQ, we would have ended up with a lot of data duplication because if a message is read by multiple consumers, it will get duplicated in the queue of each of those consumers. Kafka wins on that side too since every consumer reads from the same source.
> >
> > The downsides of Kafka were the language issues (we are using mostly Python and C#). 0.8 is very new and few drivers are available at this point. Also, we will have to try getting as close as possible to a once-and-only-once guarantee. These are two things where RabbitMQ would have given us less work out of the box as opposed to Kafka. RabbitMQ also provides a bunch of tools that make it rather attractive too.
> >
> > In the end, looking at throughput is a pretty nifty thing but being sure that I'll be able to manage the beast as it grows will allow me to get to sleep way more easily.
> >
> > On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges <hodg...@gmail.com> wrote:
> >
> >> We just went through a similar exercise with RabbitMQ at our company with streaming activity data from our various web properties. Our use case requires consumption of this stream by many heterogeneous consumers including batch (Hadoop) and real-time (Storm). We pointed out that Kafka acts as a configurable rolling window of time on the activity stream.
> >> The window default is 7 days, which allows clients of different latencies, like Hadoop and Storm, to read from the same stream.
> >>
> >> We pointed out that the Kafka brokers don't need to maintain consumer state in the stream and only have to maintain one copy of the stream to support N consumers. Rabbit brokers on the other hand have to maintain the state of each consumer as well as create a copy of the stream for each consumer. In our scenario we have 10-20 consumers and with the scale and throughput of the activity stream we were able to show Rabbit quickly becomes the bottleneck under load.
> >>
> >> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <dragos.manole...@servicenow.com> wrote:
> >>
> >> > Hi --
> >> >
> >> > I am preparing to make a case for using Kafka instead of Rabbit MQ as a broker-based messaging provider. The context is similar to that of the Kafka papers and user stories: the producers publish monitoring data and logs, and a suite of subscribers consume this data (some store it, others perform computations on the event stream). The requirements are typical of this context: low-latency, high-throughput, ability to deal with bursts and operate in/across multiple data centers, etc.
> >> >
> >> > I am familiar with the performance comparison between Kafka, Rabbit MQ and Active MQ from the NetDB 2011 paper <http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf>. However, in the two years that have passed since then, the number of production Kafka installations has increased, and people are using it in different ways than those imagined by Kafka's designers.
> >> > In light of these experiences one can use more data points and color when contrasting it with Rabbit MQ (which, by the way, has also evolved since 2011). (And FWIW I know I am not the first one to walk this path; see for example last year's OSCON session on the State of MQ <http://lanyrd.com/2012/oscon/swrcz/>.)
> >> >
> >> > I would appreciate it if you could share measurements, results, or even anecdotal evidence along these lines. How have you avoided the "let's use Rabbit MQ because everybody else does it" route when solving problems for which Kafka is a better fit?
> >> >
> >> > Thanks,
> >> >
> >> > -Dragos