Jonathan,
On Fri, Jun 7, 2013 at 7:03 PM, Jonathan Hodges <hodg...@gmail.com> wrote:
> Hi Alexis,
>
> I appreciate your reply and clarifications to my misconception about
> Rabbit, particularly on the copying of the message payloads per
> consumer.

Thank you!

> It sounds like it only copies metadata like the consumer state, i.e.
> position in the topic messages.

Basically yes. Of course when a message is delivered to N>1 *machines*,
then there will be N copies, one per machine. Also, for various reasons,
very tiny (<60b) messages do get copied, as you'd assumed.

> I don’t have experience with Rabbit and was basing this assumption on
> Google searches like the following -
> http://ilearnstack.com/2013/04/16/introduction-to-amqp-messaging-with-rabbitmq/.
> It seems to indicate that with topic exchanges the messages get copied
> to a queue per consumer, but I am glad you confirmed it is just the
> metadata.

Yup. That's a fairly decent article, but even the good stuff uses words
like "copy" without a fixed denotation. Don't believe the internets!

> While you are correct that the payload is a much bigger concern,
> managing the metadata and acks centrally on the broker across multiple
> clients at scale is also a concern. This would seem to be exacerbated
> if you have consumers at different speeds, i.e. Storm and Hadoop
> consuming the same topic.
>
> In that scenario, say Storm consumes the topic messages in real time
> and Hadoop consumes once a day. Let’s assume the topic sustains 100k+
> messages/sec of throughput, so that in a given day you might have 100s
> of GBs of data flowing through the topic.
>
> To allow Hadoop to consume once a day, Rabbit obviously can’t keep
> 100s of GBs in memory and will need to persist this data to its
> internal DB to be retrieved later.

I am not sure why you think this is a problem? For a fixed number of
producers and consumers, the pubsub and delivery semantics of Rabbit and
Kafka are quite similar.
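The metadata-vs-payload point above can be sketched with a toy model:
the broker writes one copy of each payload, while each queue holds only
lightweight per-message metadata (position, ack state). Purely
illustrative Python, with made-up names; this is not RabbitMQ's actual
message store or API.

```python
# Toy model: one stored payload per machine, per-queue metadata only.
# Illustrative only -- not RabbitMQ's real internals.

class Broker:
    def __init__(self):
        self.store = {}   # message_id -> payload, written once
        self.queues = {}  # queue_name -> list of metadata records

    def declare_queue(self, name):
        self.queues.setdefault(name, [])

    def publish(self, message_id, payload, queue_names):
        self.store[message_id] = payload  # single copy of the payload
        for name in queue_names:
            # each queue records only a reference plus consumer state
            self.queues[name].append({"id": message_id, "acked": False})

broker = Broker()
broker.declare_queue("storm")
broker.declare_queue("hadoop")
broker.publish("m1", b"x" * 1024, ["storm", "hadoop"])

# Two queues see the message, but the broker holds one 1 KiB payload.
assert len(broker.store) == 1
assert [q[0]["id"] for q in broker.queues.values()] == ["m1", "m1"]
```

The same shape scales to the 10-20 consumer scenario: queue metadata
grows per consumer, but the payload bytes do not.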
Think of Rabbit as adding an in-memory cache that is used to (a) speed
up read consumption, and (b) obviate disk writes when possible, i.e.
when all client consumers are available and consuming.

> I believe the scenario where large amounts of data need to be
> persisted is the one described in the earlier posted Kafka paper (
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf)
> where Rabbit’s performance really starts to bog down as compared to
> Kafka.

Not sure which parts of the paper you mean? I read that paper when it
came out. I found it strongest when describing Kafka's design
philosophy. I found the performance statements made about Rabbit pretty
hard to understand. This is not meant as a criticism of the authors! I
have seen very few performance papers about messaging that I would base
decisions on.

> This Kafka paper looks to be a few years old

Um.... Lots can change in technology very quickly :-) E.g., at the time
this paper was published, Instagram had 5m users. Six months earlier, in
Dec 2010, it had 1m. Since then it grew huge and got acquired.

> so has something changed within the Rabbit architecture to alleviate
> this issue when large amounts of data are persisted to the internal
> DB?

Rabbit introduced a new internal flow control system which impacted
performance under steady load. This may be relevant? I couldn't say
from reading the paper. I don't have a good reference for this to hand,
but here is a post about external flow control that you may find
amusing:

http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/

> Do the producer and consumer numbers look correct? If not, maybe you
> can share some Rabbit benchmarks under this scenario, because I
> believe it is the main area where Kafka appears to be the superior
> solution.
This is from about one year ago:

http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/

Obviously none of this uses batching, which is an easy trick for
increasing throughput. YMMV.

Is this helping?

alexis

> Thanks for educating me on these matters.
>
> -Jonathan
>
>
> On Fri, Jun 7, 2013 at 6:54 AM, Alexis Richardson
> <ale...@rabbitmq.com> wrote:
>
>> Hi
>>
>> Alexis from Rabbit here. I hope I am not intruding!
>>
>> It would be super helpful if people with questions, observations or
>> moans posted them to the rabbitmq list too :-)
>>
>> A few comments:
>>
>> * Along with ZeroMQ, I consider Kafka to be one of the most
>> interesting and useful messaging projects out there. In a world of
>> cruft, Kafka is cool!
>>
>> * This is because both projects come at messaging from a specific
>> point of view that is *different* from Rabbit's. OTOH, many other
>> projects exist that replicate Rabbit features for fun, or NIH, or due
>> to misunderstanding the semantics (yes, our docs could be better).
>>
>> * It is striking how few people describe those differences. In a
>> nutshell they are as follows:
>>
>> *** Kafka writes all incoming data to disk immediately, and then
>> figures out who sees what. So it is much more like a database than
>> Rabbit, in that new consumers can appear well after the disk write
>> and still subscribe to past messages. Rabbit, instead, tries to
>> deliver to consumers and buffers otherwise. Persistence is optional
>> but robust, and a feature of the buffer ("queue"), not the upstream
>> machinery. Rabbit is able to cache-on-arrival via a plugin, but this
>> is a design overlay and not particularly optimal.
>>
>> *** Kafka is a client-server system with end-to-end semantics. It
>> defines order to include processing order, and keeps state on the
>> client to do this. Group management is via a 3rd party service
>> (Zookeeper? I forget which).
>> Rabbit is a server-only, protocol-based system which maintains order
>> on the server and through completely language-neutral protocol
>> semantics. This makes Rabbit perhaps more natural as a 'messaging
>> service', e.g. for integration and other inter-app data transfer.
>>
>> *** Rabbit is a general purpose messaging system with extras like
>> federation. It speaks many protocols, and has core features like HA,
>> transactions, management, etc. Everything can be switched on or off.
>> Getting all this to work while keeping the install light and fast is
>> quite fiddly. Kafka by contrast comes from a specific set of use
>> cases, which are certainly interesting. I am not sure if Kafka wants
>> to be a general purpose messaging system, but it will become a bit
>> more like Rabbit if that is the goal.
>>
>> *** Both approaches have costs. In the case of Rabbit the cost is
>> that more metadata is stored on the broker. Kafka can get performance
>> gains by storing less such data. But we are talking about some N
>> thousands of messages per second versus some M thousands. At those
>> speeds the clients are usually the bottleneck anyway.
>>
>> * Let me also clarify some things:
>>
>> *** Rabbit does NOT store multiple copies of the same message across
>> queues, unless the messages are very small (<60b, iirc). A message
>> delivered to >1 queue on 1 machine is stored once. Metadata about
>> that message may be stored more than once, but, at scale, the big
>> cost is the payload.
>>
>> *** Rabbit's vanilla install does store some index data in memory
>> when messages flow to disk. You can change this by using a plugin,
>> but this is a secret-menu, undocumented feature. Very, very few
>> people need any such thing.
>>
>> *** A Rabbit queue is lightweight. It's just an ordered consumption
>> buffer that can persist and ack. Don't assume things about Rabbit
>> queues based on what you know about IBM MQ, JMS, and so forth. Queues
>> in Rabbit and Kafka are not the same.
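The client-side vs broker-side state distinction described above can be
made concrete with a toy sketch: in the Kafka-style model, the broker
holds only a shared append-only log, and each consumer remembers its own
offset, so readers at very different speeds share one copy of the data.
Conceptual Python only, with hypothetical names; this is neither
system's real API.

```python
# Toy sketch of offset-based consumption from a shared log.
# Conceptual only -- not Kafka's or RabbitMQ's actual client API.

log = []  # shared, ordered, append-only "topic" held by the broker

class OffsetConsumer:
    def __init__(self):
        self.offset = 0  # consumption state lives on the client

    def poll(self, max_records):
        records = log[self.offset:self.offset + max_records]
        self.offset += len(records)
        return records

for i in range(6):
    log.append(f"event-{i}")

storm = OffsetConsumer()   # fast, near real-time reader
hadoop = OffsetConsumer()  # slow, batch reader

assert storm.poll(6) == [f"event-{i}" for i in range(6)]
assert hadoop.poll(2) == ["event-0", "event-1"]
assert len(log) == 6  # still one copy, regardless of reader count
```

The broker never tracks who has read what; the cost of a lagging
consumer is just retained log segments, not per-consumer broker state.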
>>
>> *** Rabbit does not use Mnesia for message storage. It has its own
>> DB, optimised for messaging. You can use other DBs, but this is
>> Complicated.
>>
>> *** Rabbit does all kinds of batching and bulk processing, and can
>> batch end to end. If you see claims about batching, buffering, etc.,
>> find out ALL the details before drawing conclusions.
>>
>> I hope this is helpful.
>>
>> Keen to get feedback / questions / corrections.
>>
>> alexis
>>
>>
>> On Fri, Jun 7, 2013 at 2:09 AM, Marc Labbe <mrla...@gmail.com> wrote:
>> > We also went through the same decision making, and our arguments
>> > for Kafka were along the same lines as those Jonathan mentioned.
>> > The fact that we have heterogeneous consumers is really a deciding
>> > factor. Our requirements were to avoid losing messages at all cost
>> > while having multiple consumers reading the same data at a
>> > different pace. On one side, we have a few consumers being fed with
>> > data coming in from most, if not all, topics. On the other side, we
>> > have a good bunch of consumers reading only from a single topic.
>> > The big guys can take their time to read, while the smaller ones
>> > are mostly for near real-time events, so they need to keep up with
>> > the pace of incoming messages.
>> >
>> > RabbitMQ stores data on disk only if you tell it to, while Kafka
>> > persists by design. From the beginning, we decided we would try to
>> > use the queues the same way: pub/sub with a routing key (an
>> > exchange in RabbitMQ) or topic, persisted to disk and replicated.
>> >
>> > One of our scenarios was to see how the system would cope with the
>> > largest consumer down for a while, therefore forcing the brokers to
>> > keep the data for a long period. In the case of RabbitMQ, this
>> > consumer has its own queue and data grows on disk, which is not
>> > really a problem if you plan accordingly.
>> > But, since it has to keep track of all messages read, the Mnesia
>> > database used by RabbitMQ as the message index also grows pretty
>> > big. At that point, the amount of RAM necessary to keep the level
>> > of performance we need becomes very large. In our tests, we found
>> > that this has an adverse effect on ALL the brokers, thus affecting
>> > all consumers. You can always say that you'll monitor the consumers
>> > to make sure it won't happen. That's a good thing if you can. I
>> > wasn't ready to make that bet.
>> >
>> > Another point is the fact that, since we wanted to use pub/sub with
>> > an exchange in RabbitMQ, we would have ended up with a lot of data
>> > duplication, because if a message is read by multiple consumers, it
>> > will get duplicated in the queue of each of those consumers. Kafka
>> > wins on that side too, since every consumer reads from the same
>> > source.
>> >
>> > The downsides of Kafka were the language issues (we are using
>> > mostly Python and C#). 0.8 is very new and few drivers are
>> > available at this point. Also, we will have to try getting as close
>> > as possible to a once-and-only-once guarantee. These are two things
>> > where RabbitMQ would have given us less work out of the box as
>> > opposed to Kafka. RabbitMQ also provides a bunch of tools that make
>> > it rather attractive too.
>> >
>> > In the end, looking at throughput is a pretty nifty thing, but
>> > being sure that I'll be able to manage the beast as it grows will
>> > allow me to sleep way more easily.
>> >
>> >
>> > On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges
>> > <hodg...@gmail.com> wrote:
>> >
>> >> We just went through a similar exercise with RabbitMQ at our
>> >> company with streaming activity data from our various web
>> >> properties. Our use case requires consumption of this stream by
>> >> many heterogeneous consumers, including batch (Hadoop) and
>> >> real-time (Storm).
>> >> We pointed out that Kafka acts as a configurable rolling window of
>> >> time on the activity stream. The window default is 7 days, which
>> >> allows clients of different latencies, like Hadoop and Storm, to
>> >> read from the same stream.
>> >>
>> >> We pointed out that the Kafka brokers don't need to maintain
>> >> consumer state in the stream and only have to maintain one copy of
>> >> the stream to support N consumers. Rabbit brokers on the other
>> >> hand have to maintain the state of each consumer as well as create
>> >> a copy of the stream for each consumer. In our scenario we have
>> >> 10-20 consumers, and with the scale and throughput of the activity
>> >> stream we were able to show Rabbit quickly becomes the bottleneck
>> >> under load.
>> >>
>> >>
>> >> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <
>> >> dragos.manole...@servicenow.com> wrote:
>> >>
>> >> > Hi --
>> >> >
>> >> > I am preparing to make a case for using Kafka instead of
>> >> > RabbitMQ as a broker-based messaging provider. The context is
>> >> > similar to that of the Kafka papers and user stories: the
>> >> > producers publish monitoring data and logs, and a suite of
>> >> > subscribers consume this data (some store it, others perform
>> >> > computations on the event stream). The requirements are typical
>> >> > of this context: low latency, high throughput, the ability to
>> >> > deal with bursts and operate in/across multiple data centers,
>> >> > etc.
>> >> >
>> >> > I am familiar with the performance comparison between Kafka,
>> >> > RabbitMQ and ActiveMQ from the NetDB 2011 paper <
>> >> > http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
>> >> > >.
>> >> > However, in the two years that have passed since then, the
>> >> > number of production Kafka installations has increased, and
>> >> > people are using it in different ways than those imagined by
>> >> > Kafka's designers. In light of these experiences one can use
>> >> > more data points and color when contrasting it with RabbitMQ
>> >> > (which by the way has also evolved since 2011). (And FWIW I know
>> >> > I am not the first one to walk this path; see for example last
>> >> > year's OSCON session on the State of MQ
>> >> > <http://lanyrd.com/2012/oscon/swrcz/>.)
>> >> >
>> >> > I would appreciate it if you could share measurements, results,
>> >> > or even anecdotal evidence along these lines. How have you
>> >> > avoided the "let's use RabbitMQ because everybody else does it"
>> >> > route when solving problems for which Kafka is a better fit?
>> >> >
>> >> > Thanks,
>> >> >
>> >> > -Dragos
>> >> >
>> >>
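For reference, the 7-day rolling window discussed in this thread
corresponds to the Kafka broker's log retention settings. A sketch of
the relevant `server.properties` entries, assuming a 0.8-era broker;
the values shown are the defaults, and the exact property names should
be checked against the documentation for your Kafka version:

```properties
# Rolling retention window for topic data: log segments older than this
# become eligible for deletion. 168 hours = the 7-day default.
log.retention.hours=168

# Retention can also be bounded by size per partition; -1 disables the
# size-based limit so only the time-based window applies.
log.retention.bytes=-1
```

Consumers such as Hadoop only need to come back within this window to
read everything they missed; no per-consumer state is kept on the
broker.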