A few more details for those following this: On Sat, Jun 8, 2013 at 9:09 PM, Alexis Richardson <alexis.richard...@gmail.com> wrote: > Jonathan > > I am aware of the difference between sequential writes and other kinds > of writes ;p) > > AFAIK the Kafka docs describe a sort of platonic alternative system, > eg "normally people do this.. Kafka does that..". This is a good way > to explain design decisions. However, I think you may be assuming > that Rabbit is a lot like the generalised other system. But it is not > - eg Rabbit does not do lots of random IO. I'm led to understand that > Rabbit's msg store is closer to log structured storage (a la > Log-Structured Merge Trees) in some ways. ... >> >> That would be awesome if you can confirm what Rabbit is using as a >> persistent data structure.
See extensive comments in here: http://hg.rabbitmq.com/rabbitmq-server/file/bc2fda987fe8/src/rabbit_msg_store.erl >> More importantly, whether it is BTree or >> something else, is the disk i/o random or linear? .. >> This is only speaking of the use case of high throughput with persisting >> large amounts of data to disk, where the difference is four orders of >> magnitude, not just 10x. It all comes down to random vs sequential >> writes/reads to disk as I mentioned above. It's not a btree with random writes, hence my puzzlement earlier.
* there are mostly linear writes in a file
* multiple files are involved, moved around, garbage collected, compacted, etc, which is obviously not all linear.
This will behave better than a btree for the purpose it was built for. This is just for writes. Reads may be a different story - and I don't fully understand how reads work in Kafka. A memory-mapped circular buffer will definitely outperform this... mmap support for erlang would be nice ;p) >> >> On Sat, Jun 8, 2013 at 2:07 AM, Alexis Richardson < >> alexis.richard...@gmail.com> wrote: >> >>> Jonathan >>> >>> On Sat, Jun 8, 2013 at 2:09 AM, Jonathan Hodges <hodg...@gmail.com> wrote: >>> > Thanks so much for your replies. This has been a great help in >>> understanding >>> > Rabbit better, as I have very little experience with it. I have a few >>> > follow-up comments below. >>> >>> Happy to help! >>> >>> I'm afraid I don't follow your arguments below. Rabbit contains many >>> optimisations too. I'm told that it is possible to saturate the disk >>> i/o, and you saw the message rates I quoted in the previous email. >>> YES of course there are differences, mostly an accumulation of things. >>> For example Rabbit spends more time doing work before it writes to >>> disk. >>> >>> You said: >>> >>> "Since Rabbit must maintain the state of the >>> consumers I imagine it’s subjected to random data access patterns on disk >>> as opposed to sequential."
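For readers who don't want to wade through the Erlang, here is a toy sketch of the segmented, mostly-append layout described above: sequential writes to the tail of the current segment, rollover to new files, and an in-memory index for reads. The class and file names (SegmentedStore, `%08d.seg`) are invented for illustration; the real rabbit_msg_store is far more involved, and compaction/GC of old segments is omitted entirely.

```python
import os
import tempfile

class SegmentedStore:
    """Toy append-only store: writes go to the tail of the current
    segment; full segments are rolled over and become candidates for
    GC/compaction, so the write path stays mostly sequential."""

    def __init__(self, directory, segment_bytes=1024):
        self.dir, self.limit, self.seg_no = directory, segment_bytes, 0
        self.current = open(self._path(0), "ab")
        self.index = {}  # msg_id -> (segment, offset, length), kept in memory

    def _path(self, n):
        return os.path.join(self.dir, "%08d.seg" % n)

    def append(self, msg_id, payload):
        if self.current.tell() >= self.limit:      # roll to a fresh segment
            self.current.close()
            self.seg_no += 1
            self.current = open(self._path(self.seg_no), "ab")
        offset = self.current.tell()
        self.current.write(payload)                # sequential write
        self.current.flush()
        self.index[msg_id] = (self.seg_no, offset, len(payload))

    def read(self, msg_id):
        seg, off, length = self.index[msg_id]      # reads may seek: not linear
        with open(self._path(seg), "rb") as f:
            f.seek(off)
            return f.read(length)

store = SegmentedStore(tempfile.mkdtemp(), segment_bytes=10)
store.append("m1", b"hello")
store.append("m2", b"world!")   # still fits in segment 0
store.append("m3", b"again")    # segment 0 is now full, rolls to segment 1
print(store.read("m1"), store.read("m3"))
```

Note how this matches the "not all linear" caveat above: appends are sequential per segment, while reads and any later compaction are where the seeks show up.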
>>> >>> I don't follow the logic here, sorry. >>> >>> A couple of side comments: >>> >>> * In your Hadoop vs RT example, Rabbit would deliver the RT messages >>> immediately and write the rest to disk. It can do this at high rates >>> - I shall try to get you some useful data here. >>> >>> * Bear in mind that write speed should be orthogonal to read speed. >>> Ask yourself - how would Kafka provide a read cache, and when might >>> that be useful? >>> >>> * I'll find out what data structure Rabbit uses for long-term persistence. >>> >>> >>> "Quoting the Kafka design page ( >>> http://kafka.apache.org/07/design.html) performance of sequential writes >>> on >>> a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of >>> random writes is only about 50 KB/sec—a difference of nearly 10000X." >>> >>> Depending on your use case, I'd expect 2x-10x overall throughput >>> differences, and will try to find out more info. As I said, Rabbit >>> can saturate disk i/o. >>> >>> alexis >>> >>> >>> >>> >>> > >>> >> While you are correct that the payload is a much bigger concern, managing the >>> >> metadata and acks centrally on the broker across multiple clients at >>> scale >>> >> is also a concern. This would seem to be exacerbated if you have >>> > consumers >>> >> at different speeds, e.g. Storm and Hadoop consuming the same topic. >>> >> >>> >> In that scenario, say Storm consumes the topic messages in real-time and >>> >> Hadoop consumes once a day. Let’s assume the topic carries 100k+ >>> >> messages/sec so that in a given day you might have 100s of GBs >>> of >>> >> data flowing through the topic. >>> >> >>> >> To allow Hadoop to consume once a day, Rabbit obviously can’t keep 100s >>> > of GBs >>> >> in memory and will need to persist this data to its internal DB to be >>> >> retrieved later. >>> > >>> > I am not sure why you think this is a problem?
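Back-of-envelope arithmetic for the 100k+ messages/sec scenario above (the 50-byte average payload is an assumed figure, not something stated in the thread):

```python
msgs_per_sec = 100_000
avg_payload_bytes = 50            # assumption; small activity-stream events
seconds_per_day = 24 * 60 * 60

msgs_per_day = msgs_per_sec * seconds_per_day
bytes_per_day = msgs_per_day * avg_payload_bytes

print(msgs_per_day)               # 8.64 billion messages per day
print(bytes_per_day / 10**9)      # 432.0 GB/day, before any broker overhead
```

So even with tiny payloads, a day of backlog for one slow consumer is hundreds of gigabytes, which is why the disk-access pattern dominates this discussion.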
>>> > >>> > For a fixed number of producers and consumers, the pubsub and delivery >>> > semantics of Rabbit and Kafka are quite similar. Think of Rabbit as >>> > adding an in-memory cache that is used to (a) speed up read >>> > consumption, (b) obviate disk writes when possible due to all client >>> > consumers being available and consuming. >>> > >>> > >>> > Actually I think this is the main use case that sets Kafka apart from >>> > Rabbit and speaks to the poster’s ‘Arguments for Kafka over RabbitMQ’ >>> > question. As you mentioned, Rabbit is a general-purpose messaging system >>> > and along with that has a lot of features not found in Kafka. There are >>> > plenty of times when Rabbit makes more sense than Kafka, but not when you >>> > are maintaining large message stores and require high throughput to disk. >>> > >>> > Persisting 100s of GBs of messages to disk is a very different problem from >>> > managing messages in memory. Since Rabbit must maintain the state of the >>> > consumers, I imagine it’s subjected to random data access patterns on disk >>> > as opposed to sequential. Quoting the Kafka design page ( >>> > http://kafka.apache.org/07/design.html) performance of sequential >>> writes on >>> > a 6 7200rpm SATA RAID-5 array is about 300MB/sec but the performance of >>> > random writes is only about 50 KB/sec—a difference of nearly 10000X. >>> > >>> > They go on to say that the persistent data structure used for metadata in >>> > messaging systems is often a BTree. BTrees are the most versatile data structure >>> > available, and make it possible to support a wide variety of >>> transactional >>> > and non-transactional semantics in the messaging system. They do come >>> with >>> > a fairly high cost, though: BTree operations are O(log N). Normally O(log >>> > N) is considered essentially equivalent to constant time, but this is not >>> > true for disk operations.
Disk seeks come at 10 ms a pop, and each disk >>> can >>> > do only one seek at a time, so parallelism is limited. Hence even a >>> handful >>> > of disk seeks leads to very high overhead. Since storage systems mix very >>> > fast cached operations with actual physical disk operations, the observed >>> > performance of tree structures is often superlinear. Furthermore BTrees >>> > require a very sophisticated page or row locking implementation to avoid >>> > locking the entire tree on each operation. The implementation must pay a >>> > fairly high price for row-locking or else effectively serialize all >>> reads. >>> > Because of the heavy reliance on disk seeks it is not possible to >>> > effectively take advantage of the improvements in drive density, and one >>> is >>> > forced to use small (< 100GB) high-RPM SAS drives to maintain a sane >>> ratio >>> > of data to seek capacity. >>> > >>> > Intuitively, a persistent queue could be built on simple reads and appends >>> > to files, as is commonly the case with logging solutions. Though this >>> > structure would not support the rich semantics of a BTree implementation, >>> > it has the advantage that all operations are O(1) and reads do not >>> > block writes or each other. This has obvious performance advantages since >>> > the performance is completely decoupled from the data size--one server >>> can >>> > now take full advantage of a number of cheap, low-rotational-speed 1+TB >>> > SATA drives. Though they have poor seek performance, these drives often >>> > have comparable performance for large reads and writes at 1/3 the price >>> and >>> > 3x the capacity. >>> > >>> > Having access to virtually unlimited disk space without penalty means >>> that >>> > we can provide some features not usually found in a messaging system. For >>> > example, in Kafka, instead of deleting a message immediately after >>> > consumption, we can retain messages for a relatively long period (say a >>> week).
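The "retain for a week" point is cheap precisely because of the log structure: retention can be enforced by unlinking whole expired segment files rather than deleting individual consumed messages. A minimal sketch (the names and the map of file name to creation time are assumptions for illustration, not Kafka's actual bookkeeping):

```python
import time

RETENTION_SECONDS = 7 * 24 * 3600   # the "say a week" window from above

def expired_segments(segments, now=None):
    """Return the segment files older than the retention window.
    segments maps file name -> creation time (unix seconds)."""
    now = time.time() if now is None else now
    return [name for name, created in segments.items()
            if now - created > RETENTION_SECONDS]

segments = {"00000000.seg": 0,             # created at the epoch: long expired
            "00000001.seg": time.time()}   # created just now: retained
print(expired_segments(segments))          # -> ['00000000.seg']
```

Dropping a whole file is one metadata operation regardless of how many messages it held, which is why retention does not fight with throughput here.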
>>> > >>> > Our assumption is that the volume of messages is extremely high, indeed >>> it >>> > is some multiple of the total number of page views for the site (since a >>> > page view is one of the activities we process). Furthermore we assume >>> each >>> > message published is read at least once (and often multiple times), hence >>> > we optimize for consumption rather than production. >>> > >>> > There are two common causes of inefficiency: too many network requests, >>> and >>> > excessive byte copying. >>> > >>> > To encourage efficiency, the APIs are built around a "message set" >>> > abstraction that naturally groups messages. This allows network requests >>> to >>> > group messages together and amortize the overhead of the network >>> roundtrip >>> > rather than sending a single message at a time. >>> > >>> > The MessageSet implementation is itself a very thin API that wraps a byte >>> > array or file. Hence there is no separate serialization or >>> deserialization >>> > step required for message processing; message fields are lazily >>> > deserialized as needed (or not deserialized if not needed). >>> > >>> > The message log maintained by the broker is itself just a directory of >>> > message sets that have been written to disk. This abstraction allows a >>> > single byte format to be shared by both the broker and the consumer (and >>> to >>> > some degree the producer, though producer messages are checksummed and >>> > validated before being added to the log). >>> > >>> > Maintaining this common format allows optimization of the most important >>> > operation: network transfer of persistent log chunks. Modern Unix >>> operating >>> > systems offer a highly optimized code path for transferring data out of >>> > pagecache to a socket; in Linux this is done with the sendfile system >>> call. >>> > Java provides access to this system call with the FileChannel.transferTo >>> > API.
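The framing idea behind the "message set" abstraction above can be sketched in a few lines. Kafka's real on-disk/wire format also carries offsets and checksums; this shows only the length-prefixed batching that lets a whole batch be stored and shipped as one blob with no per-message (de)serialization step:

```python
import struct

def pack_message_set(payloads):
    # Each message framed as [4-byte big-endian length][payload]; the
    # whole batch is one contiguous blob that can be written, stored,
    # and shipped without touching the individual messages.
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)

def iter_message_set(buf):
    # Walk the frames lazily; a payload is only sliced out when the
    # consumer actually asks for it.
    pos = 0
    while pos < len(buf):
        (length,) = struct.unpack_from(">I", buf, pos)
        yield buf[pos + 4 : pos + 4 + length]
        pos += 4 + length

batch = pack_message_set([b"click", b"view", b"purchase"])
print(list(iter_message_set(batch)))   # -> [b'click', b'view', b'purchase']
```

Because the same bytes serve as the storage format and the wire format, the broker can hand a log chunk straight to the network, which is what the sendfile discussion below exploits.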
>>> > >>> > To understand the impact of sendfile, it is important to understand the >>> > common data path for transfer of data from file to socket: >>> >
>>> > 1. The operating system reads data from the disk into pagecache in kernel space
>>> > 2. The application reads the data from kernel space into a user-space buffer
>>> > 3. The application writes the data back into kernel space into a socket buffer
>>> > 4. The operating system copies the data from the socket buffer to the NIC buffer, where it is sent over the network
>>> > >>> > This is clearly inefficient: there are four copies and two system calls. Using >>> > sendfile, this re-copying is avoided by allowing the OS to send the data >>> > from pagecache to the network directly. So in this optimized path, only >>> the >>> > final copy to the NIC buffer is needed. >>> > >>> > We expect a common use case to be multiple consumers on a topic. Using >>> the >>> > zero-copy optimization above, data is copied into pagecache exactly once >>> > and reused on each consumption instead of being stored in memory and >>> copied >>> > out to kernel space every time it is read. This allows messages to be >>> > consumed at a rate that approaches the limit of the network connection. >>> > >>> > >>> > So in the end it would seem Kafka’s specialized write-first design >>> > really shines over Rabbit when your use case requires a very >>> > high-throughput, non-blocking firehose with large data persistence to disk. >>> Since >>> > this is only one use case, this is by no means saying Kafka is better than >>> > Rabbit or vice versa. I think it is awesome there are more options to >>> > choose from so you can pick the right tool for the job. Thanks, open >>> source! >>> > >>> > As always, YMMV.
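The zero-copy path described above is reachable from Python as well: socket.sendfile() uses the same sendfile(2) syscall where the platform supports it (FileChannel.transferTo is the Java route mentioned earlier). A minimal broker-to-consumer sketch, with a socketpair standing in for a real network connection:

```python
import os
import socket
import tempfile

# Write a small "log segment" to disk.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"msg-1\nmsg-2\nmsg-3\n")
    path = f.name

# A connected socket pair stands in for broker -> consumer.
server, client = socket.socketpair()

with open(path, "rb") as segment:
    # socket.sendfile() delegates to os.sendfile (the sendfile(2)
    # syscall) where available, so the bytes move from pagecache to
    # the socket without a round trip through a user-space buffer.
    sent = server.sendfile(segment)

server.close()
data = client.recv(1024)
client.close()
os.unlink(path)
print(sent, data)
```

On platforms without sendfile(2), socket.sendfile() silently falls back to a plain read/send loop, so the code is portable even though the zero-copy benefit is not.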
>>> > >>> > >>> > On Fri, Jun 7, 2013 at 4:40 PM, Alexis Richardson < >>> > alexis.richard...@gmail.com> wrote: >>> > >>> >> Jonathan, >>> >> >>> >> >>> >> On Fri, Jun 7, 2013 at 7:03 PM, Jonathan Hodges <hodg...@gmail.com> >>> wrote: >>> >> > Hi Alexis, >>> >> > >>> >> > I appreciate your reply and clarifications to my misconception about >>> >> > Rabbit, particularly on the copying of the message payloads per >>> consumer. >>> >> >>> >> Thank-you! >>> >> >>> >> >>> >> > It sounds like it only copies metadata like the consumer state, i.e. >>> >> > position in the topic messages. >>> >> >>> >> Basically yes. Of course when a message is delivered to N>1 >>> >> *machines*, then there will be N copies, one per machine. >>> >> >>> >> Also, for various reasons, very tiny (<60b) messages do get copied as >>> >> you'd assumed. >>> >> >>> >> >>> >> > I don’t have experience with Rabbit and >>> >> > was basing this assumption on Google searches like the >>> following - >>> >> > >>> >> >>> http://ilearnstack.com/2013/04/16/introduction-to-amqp-messaging-with-rabbitmq/ >>> >> . >>> >> > It seems to indicate that with topic exchanges the messages get >>> copied >>> >> to >>> >> > a queue per consumer, but I am glad you confirmed it is just the >>> >> metadata. >>> >> >>> >> Yup. >>> >> >>> >> That's a fairly decent article but even the good stuff uses words like >>> >> "copy" without a fixed denotation. Don't believe the internets! >>> >> >>> >> >>> >> > While you are correct that the payload is a much bigger concern, managing >>> the >>> >> > metadata and acks centrally on the broker across multiple clients at >>> >> scale >>> >> > is also a concern. This would seem to be exacerbated if you have >>> >> consumers >>> >> > at different speeds, e.g. Storm and Hadoop consuming the same topic. >>> >> > >>> >> > In that scenario, say Storm consumes the topic messages in real-time >>> and >>> >> > Hadoop consumes once a day.
Let’s assume the topic carries 100k+ >>> >> > messages/sec so that in a given day you might have 100s >>> of GBs of >>> >> > data flowing through the topic. >>> >> > >>> >> > To allow Hadoop to consume once a day, Rabbit obviously can’t keep >>> 100s >>> >> of GBs >>> >> > in memory and will need to persist this data to its internal DB to be >>> >> > retrieved later. >>> >> >>> >> I am not sure why you think this is a problem? >>> >> >>> >> For a fixed number of producers and consumers, the pubsub and delivery >>> >> semantics of Rabbit and Kafka are quite similar. Think of Rabbit as >>> >> adding an in-memory cache that is used to (a) speed up read >>> >> consumption, (b) obviate disk writes when possible due to all client >>> >> consumers being available and consuming. >>> >> >>> >> >>> >> > I believe large amounts of data needing to be persisted >>> >> > is the scenario described in the earlier-posted Kafka paper ( >>> >> > >>> >> >>> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf >>> >> ) >>> >> > where Rabbit’s performance really starts to bog down as compared to >>> >> Kafka. >>> >> >>> >> Not sure what parts of the paper you mean? >>> >> >>> >> I read that paper when it came out. I found it strongest when >>> >> describing Kafka's design philosophy. I found the performance >>> >> statements made about Rabbit pretty hard to understand. This is not >>> >> meant to be a criticism of the authors! I have seen very few >>> >> performance papers about messaging that I would base decisions on. >>> >> >>> >> >>> >> > This Kafka paper looks to be a few years old >>> >> >>> >> Um.... Lots can change in technology very quickly :-) >>> >> >>> >> E.g.: At the time this paper was published, Instagram had 5m users. >>> >> Six months earlier, in Dec 2010, it had 1m. Since then it grew huge >>> >> and got acquired.
>>> >> >>> >> >>> >> >>> >> > so has something changed >>> >> > within the Rabbit architecture to alleviate this issue when large >>> amounts >>> >> > of data are persisted to the internal DB? >>> >> >>> >> Rabbit introduced a new internal flow control system which impacted >>> >> performance under steady load. This may be relevant? I couldn't say >>> >> from reading the paper. >>> >> >>> >> I don't have a good reference for this to hand, but here is a post >>> >> about external flow control that you may find amusing: >>> >> >>> >> >>> http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/ >>> >> >>> >> >>> >> > Do the producer and consumer >>> >> > numbers look correct? If no, maybe you can share some Rabbit >>> benchmarks >>> >> > under this scenario, because I believe it is the main area where Kafka >>> >> > appears to be the superior solution. >>> >> >>> >> This is from about one year ago: >>> >> >>> >> >>> http://www.rabbitmq.com/blog/2012/04/25/rabbitmq-performance-measurements-part-2/ >>> >> >>> >> Obviously none of this uses batching, which is an easy trick for >>> >> increasing throughput. >>> >> >>> >> YMMV. >>> >> >>> >> Is this helping? >>> >> >>> >> alexis >>> >> >>> >> >>> >> >>> >> > Thanks for educating me on these matters. >>> >> > >>> >> > -Jonathan >>> >> > >>> >> > >>> >> > >>> >> > On Fri, Jun 7, 2013 at 6:54 AM, Alexis Richardson < >>> ale...@rabbitmq.com >>> >> >wrote: >>> >> > >>> >> >> Hi >>> >> >> >>> >> >> Alexis from Rabbit here. I hope I am not intruding! >>> >> >> >>> >> >> It would be super helpful if people with questions, observations or >>> >> >> moans posted them to the rabbitmq list too :-) >>> >> >> >>> >> >> A few comments: >>> >> >> >>> >> >> * Along with ZeroMQ, I consider Kafka to be one of the interesting >>> and >>> >> >> useful messaging projects out there. In a world of cruft, Kafka is >>> >> >> cool! 
>>> >> >>> >> * This is because both projects come at messaging from a specific >>> >> point of view that is *different* from Rabbit. OTOH, many other >>> >> projects exist that replicate Rabbit features for fun, or NIH, or due >>> >> to misunderstanding the semantics (yes, our docs could be better) >>> >> >>> >> * It is striking how few people describe those differences. In a >>> >> nutshell they are as follows: >>> >> >>> >> *** Kafka writes all incoming data to disk immediately, and then >>> >> figures out who sees what. So it is much more like a database than >>> >> Rabbit, in that new consumers can appear well after the disk write >>> and >>> >> still subscribe to past messages. Rabbit, instead, tries to >>> >> deliver to consumers immediately and buffers otherwise. Persistence is optional >>> >> but robust and a feature of the buffer ("queue"), not the upstream >>> >> machinery. Rabbit is able to cache-on-arrival via a plugin, but this >>> >> is a design overlay and not particularly optimal. >>> >> >>> >> *** Kafka is a client-server system with end-to-end semantics. It >>> >> defines order to include processing order, and keeps state on the >>> >> client to do this. Group management is via a 3rd-party service >>> >> (Zookeeper? I forget which). Rabbit is a server-only, protocol-based >>> >> system which maintains order on the server and through completely >>> >> language-neutral protocol semantics. This makes Rabbit perhaps more >>> >> natural as a 'messaging service', eg for integration and other >>> >> inter-app data transfer. >>> >> >>> >> *** Rabbit is a general-purpose messaging system with extras like >>> >> federation. It speaks many protocols, and has core features like HA, >>> >> transactions, management, etc. Everything can be switched on or off. >>> >> Getting all this to work while keeping the install light and fast is >>> >> quite fiddly.
Kafka by contrast comes from a specific set of use >>> >> >> cases, which are interesting certainly. I am not sure if Kafka wants >>> >> >> to be a general purpose messaging system, but it will become a bit >>> >> >> more like Rabbit if that is the goal. >>> >> >> >>> >> >> *** Both approaches have costs. In the case of Rabbit the cost is >>> >> >> that more metadata is stored on the broker. Kafka can get >>> performance >>> >> >> gains by storing less such data. But we are talking about some N >>> >> >> thousands of MPS versus some M thousands. At those speeds the >>> clients >>> >> >> are usually the bottleneck anyway. >>> >> >> >>> >> >> * Let me also clarify some things: >>> >> >> >>> >> >> *** Rabbit does NOT store multiple copies of the same message across >>> >> >> queues, unless they are very small (<60b, iirc). A message delivered >>> >> >> to >1 queue on 1 machine is stored once. Metadata about that message >>> >> >> may be stored more than once, but, at scale, the big cost is the >>> >> >> payload. >>> >> >> >>> >> >> *** Rabbit's vanilla install does store some index data in memory >>> when >>> >> >> messages flow to disk. You can change this by using a plugin, but >>> >> >> this is a secret-menu undocumented feature. Very very few people >>> need >>> >> >> any such thing. >>> >> >> >>> >> >> *** A Rabbit queue is lightweight. It's just an ordered consumption >>> >> >> buffer that can persist and ack. Don't assume things about Rabbit >>> >> >> queues based on what you know about IBM MQ, JMS, and so forth. >>> Queues >>> >> >> in Rabbit and Kafka are not the same. >>> >> >> >>> >> >> *** Rabbit does not use mnesia for message storage. It has its own >>> >> >> DB, optimised for messaging. You can use other DBs but this is >>> >> >> Complicated. >>> >> >> >>> >> >> *** Rabbit does all kinds of batching and bulk processing, and can >>> >> >> batch end to end. 
If you see claims about batching, buffering, etc., >>> >> >> find out ALL the details before drawing conclusions. >>> >> >> >>> >> >> I hope this is helpful. >>> >> >> >>> >> >> Keen to get feedback / questions / corrections. >>> >> >> >>> >> >> alexis >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> On Fri, Jun 7, 2013 at 2:09 AM, Marc Labbe <mrla...@gmail.com> >>> wrote: >>> >> >> > We also went through the same decision making and our arguments for >>> Kafka >>> >> >> > were along the same lines as those Jonathan mentioned. The fact that >>> we >>> >> >> have >>> >> >> > heterogeneous consumers is really a deciding factor. Our >>> requirements >>> >> >> were >>> >> >> > to avoid losing messages at all costs while having multiple >>> consumers >>> >> >> > reading the same data at a different pace. On one side, we have a >>> few >>> >> >> > consumers being fed with data coming in from most, if not all, >>> >> topics. On >>> >> >> > the other side, we have a good bunch of consumers reading only >>> from a >>> >> >> > single topic. The big guys can take their time to read while the >>> >> smaller >>> >> >> > ones are mostly for near real-time events, so they need to keep up with the >>> >> >> pace >>> >> >> > of incoming messages. >>> >> >> > >>> >> >> > RabbitMQ stores data on disk only if you tell it to, while Kafka >>> >> persists >>> >> >> by >>> >> >> > design. From the beginning, we decided we would try to use the >>> queues >>> >> the >>> >> >> > same way, pub/sub with a routing key (an exchange in RabbitMQ) or >>> >> topic, >>> >> >> > persisted to disk and replicated. >>> >> >> > >>> >> >> > One of our scenarios was to see how the system would cope with the >>> >> largest >>> >> >> > consumer down for a while, therefore forcing the brokers to keep >>> the >>> >> data >>> >> >> > for a long period. In the case of RabbitMQ, this consumer has its own >>> >> >> queue >>> >> >> > and data grows on disk, which is not really a problem if you plan >>> >> >> > accordingly. But, since it has to keep track of all messages read, >>> >> the >>> >> >> > Mnesia database used by RabbitMQ as the message index also grows >>> >> pretty >>> >> >> > big. At that point, the amount of RAM necessary becomes very large >>> to >>> >> >> keep >>> >> >> > the level of performance we need. In our tests, we found that this had an >>> >> >> > adverse effect on ALL the brokers, thus affecting all consumers. >>> You >>> >> can >>> >> >> > always say that you'll monitor the consumers to make sure it won't >>> >> >> happen. >>> >> >> > That's a good thing if you can. I wasn't ready to make that bet. >>> >> >> > >>> >> >> > Another point is the fact that, since we wanted to use pub/sub >>> with an >>> >> >> > exchange in RabbitMQ, we would have ended up with a lot of data >>> >> duplication >>> >> >> > because if a message is read by multiple consumers, it will get >>> >> >> duplicated >>> >> >> > in the queue of each of those consumers. Kafka wins on that side too, >>> >> since >>> >> >> > every consumer reads from the same source. >>> >> >> > >>> >> >> > The downsides of Kafka were the language issues (we are using >>> mostly >>> >> >> Python >>> >> >> > and C#). 0.8 is very new and few drivers are available at this >>> point. >>> >> >> Also, >>> >> >> > we will have to try getting as close as possible to a >>> once-and-only-once >>> >> >> > guarantee. Those are two things where RabbitMQ would have given us >>> >> less >>> >> >> > work out of the box than Kafka. RabbitMQ also provides a >>> >> bunch >>> >> >> of >>> >> >> > tools that make it rather attractive too.
>>> >> >> > >>> >> >> > In the end, looking at throughput is a pretty nifty thing but being >>> >> sure >>> >> >> > that I'll be able to manage the beast as it grows will allow me to >>> >> get to >>> >> >> > sleep way more easily. >>> >> >> > >>> >> >> > >>> >> >> > On Thu, Jun 6, 2013 at 3:28 PM, Jonathan Hodges <hodg...@gmail.com >>> > >>> >> >> wrote: >>> >> >> > >>> >> >> >> We just went through a similar exercise with RabbitMQ at our >>> company >>> >> >> with >>> >> >> >> streaming activity data from our various web properties. Our use >>> >> case >>> >> >> >> requires consumption of this stream by many heterogeneous >>> consumers >>> >> >> >> including batch (Hadoop) and real-time (Storm). We pointed out >>> that >>> >> >> Kafka >>> >> >> >> acts as a configurable rolling window of time on the activity >>> stream. >>> >> >> The >>> >> >> >> window default is 7 days which allows for supporting clients of >>> >> >> different >>> >> >> >> latencies like Hadoop and Storm to read from the same stream. >>> >> >> >> >>> >> >> >> We pointed out that the Kafka brokers don't need to maintain >>> consumer >>> >> >> state >>> >> >> >> in the stream and only have to maintain one copy of the stream to >>> >> >> support N >>> >> >> >> number of consumers. Rabbit brokers on the other hand have to >>> >> maintain >>> >> >> the >>> >> >> >> state of each consumer as well as create a copy of the stream for >>> >> each >>> >> >> >> consumer. In our scenario we have 10-20 consumers and with the >>> scale >>> >> >> and >>> >> >> >> throughput of the activity stream we were able to show Rabbit >>> quickly >>> >> >> >> becomes the bottleneck under load. 
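The broker-state difference described above can be sketched as one shared log, with each consumer holding nothing but its own integer offset, so a slow Hadoop-style reader and a fast Storm-style reader cost the broker the same single copy of the data. The class names are invented for illustration; real Kafka consumers also commit their offsets (to Zookeeper in that era) rather than keeping them purely in memory:

```python
class SharedLog:
    """One copy of the stream on the broker; no per-consumer queues."""
    def __init__(self):
        self.messages = []
    def append(self, msg):
        self.messages.append(msg)

class Consumer:
    """All per-consumer state is a single integer offset into the log."""
    def __init__(self, log):
        self.log, self.offset = log, 0
    def poll(self, max_msgs=10):
        batch = self.log.messages[self.offset : self.offset + max_msgs]
        self.offset += len(batch)
        return batch

log = SharedLog()
storm, hadoop = Consumer(log), Consumer(log)   # fast and slow readers
for i in range(5):
    log.append("event-%d" % i)

print(storm.poll(2))    # near-real-time reader takes small batches
print(hadoop.poll(10))  # batch reader drains the same data later
```

Adding a 21st consumer here adds one integer of broker-visible state, not another copy of the stream, which is the scaling argument made in the paragraph above.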
>>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu < >>> >> >> >> dragos.manole...@servicenow.com> wrote: >>> >> >> >> >>> >> >> >> > Hi -- >>> >> >> >> > >>> >> >> >> > I am preparing to make a case for using Kafka instead of Rabbit >>> MQ >>> >> as >>> >> >> a >>> >> >> >> > broker-based messaging provider. The context is similar to that >>> of >>> >> the >>> >> >> >> > Kafka papers and user stories: the producers publish monitoring >>> >> data >>> >> >> and >>> >> >> >> > logs, and a suite of subscribers consume this data (some store >>> it, >>> >> >> others >>> >> >> >> > perform computations on the event stream). The requirements are >>> >> >> typical >>> >> >> >> of >>> >> >> >> > this context: low-latency, high-throughput, ability to deal with >>> >> >> bursts >>> >> >> >> and >>> >> >> >> > operate in/across multiple data centers, etc. >>> >> >> >> > >>> >> >> >> > I am familiar with the performance comparison between Kafka, >>> >> Rabbit MQ >>> >> >> >> and >>> >> >> >> > Active MQ from the NetDB 2011 paper< >>> >> >> >> > >>> >> >> >> >>> >> >> >>> >> >>> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf >>> >> >> >> >. >>> >> >> >> > However in the two years that passed since then the number of >>> >> >> production >>> >> >> >> > Kafka installations increased, and people are using it in >>> different >>> >> >> ways >>> >> >> >> > than those imagined by Kafka's designers. In light of these >>> >> >> experiences >>> >> >> >> one >>> >> >> >> > can use more data points and color when contrasting to Rabbit MQ >>> >> >> (which >>> >> >> >> by >>> >> >> >> > the way also evolved since 2011). (And FWIW I know I am not the >>> >> first >>> >> >> one >>> >> >> >> > to walk this path; see for example last year's OSCON session on >>> the >>> >> >> State >>> >> >> >> > of MQ<http://lanyrd.com/2012/oscon/swrcz/>.) 
>>> >> >> >> > >>> >> >> >> > I would appreciate it if you could share measurements, results, >>> or >>> >> >> even >>> >> >> >> > anecdotal evidence along these lines. How have you avoided the >>> >> "let's >>> >> >> use >>> >> >> >> > Rabbit MQ because everybody else does it" route when solving >>> >> problems >>> >> >> for >>> >> >> >> > which Kafka is a better fit? >>> >> >> >> > >>> >> >> >> > Thanks, >>> >> >> >> > >>> >> >> >> > -Dragos >>> >> >> >> > >>> >> >> >> >>> >> >> >>> >> >>>