mps, an eventing framework for kafka

2013-06-14 Thread Milind Parikh
There is a lot of useful discussion on the user group about data messages and eventing, so I thought that I would introduce mps, an eventing framework for Kafka. It is written in Erlang and available on GitHub at github.com/milindparikh/mps. The README on GitHub explains the philosophy with which

Re: Using Kafka for "data" messages

2013-06-14 Thread Josh Foure
Hi Mahendra, thanks for your reply. I was planning on using the Atmosphere Framework (http://async-io.org/) to handle the web push stuff (I've never used it before but we use PrimeFaces a little and that's what they use for their components). I thought that I would have the JVM that the user

Re: Stall high-level 0.72 ConsumerConnector until all balanced? Avoid message dupes?

2013-06-14 Thread Philip O'Toole
On Thu, Jun 13, 2013 at 9:15 PM, Jun Rao wrote: > Are your messages compressed in batches? If so, some dups are expected > during rebalance. In 0.8, such dups are eliminated. Other than that, > rebalance shouldn't cause dups since we commit consumed offsets to ZK > before doing a rebalance. Jun --

Re: Kafka 0.8 Maven and IntelliJ

2013-06-14 Thread Dragos Manolescu
I use 12.1.4 Ultimate on OS X. -Dragos On 6/13/13 9:07 PM, "Jun Rao" wrote: >Thanks. Which version of Intellij are you using? > >Jun > > >On Thu, Jun 13, 2013 at 10:20 AM, Dragos Manolescu < >dragos.manole...@servicenow.com> wrote: > >> Hmm, I've just pulled 0.8.0-beta1-candidate1, removed .ide

0.8 backup strategy anyone?

2013-06-14 Thread Scott Clasen
So, despite 0.8 being a release that will give much higher availability, do people do anything at all to back up the data? For instance, if any of you are running on EC2 and using ephemeral disk for perf reasons, what do you do about messages that you absolutely can't afford to lose? Basically look

Kafka stats description

2013-06-14 Thread Hanish Bansal
Hi, I'm trying to fetch Kafka stats using JMX and am able to fetch the parameters mentioned below. Can somebody please elaborate on what each of these parameters signifies? Although the names are pretty indicative, I still want to be sure. As per my understanding: ProduceRequestsPerSecond - number of

Re: Amazon SNS and Kafka comparison

2013-06-14 Thread Philip O'Toole
Depends on how important being able to access every single bit of the messages is, right down to looking at what is on the disk. It's very important to us; we need that control. Ability to scale throughput as needed is also important - too important to do anything but run it ourselves. All these

Re: message order, guarenteed?

2013-06-14 Thread Philip O'Toole
Another idea. If a set of messages arrives over a single TCP connection, route them to a partition depending on the TCP connection. To be honest, these approaches, while they work, may not scale when the message rate is high. If at all possible, try to think of a way to remove this requirement from your

Amazon SNS and Kafka comparison

2013-06-14 Thread James Newhaven
Hi, I have a system that needs to process tens of thousands of user events per second. I've looked at both Kafka and Amazon SNS. Using SNS would mean I can avoid the operational overhead of maintaining Kafka and Zookeeper installations and monitoring. I also wouldn't need to worry about storage

Re: message order, guarenteed?

2013-06-14 Thread David Arthur
Simple example of how to take advantage of this behavior: Suppose you're sending document updates through Kafka. If you use the document ID as the message key and the default hash partitioner, the updates for a given document will exist on the same partition and come into the consumer in order
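
A minimal sketch of the keyed-partitioning idea described above, not Kafka's actual partitioner. It assumes a stable hash of the key taken modulo the partition count; CRC32 is a stand-in hash, and `partition_for`, `route`, and the sample document IDs are hypothetical names used only for illustration.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash (CRC32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(updates, num_partitions):
    """Append each (doc_id, payload) to the partition chosen by its key, in arrival order."""
    partitions = defaultdict(list)
    for doc_id, payload in updates:
        partitions[partition_for(doc_id, num_partitions)].append((doc_id, payload))
    return partitions


# Hypothetical document updates, interleaved across two documents.
updates = [("doc-1", "v1"), ("doc-2", "v1"), ("doc-1", "v2"), ("doc-1", "v3")]
partitions = route(updates, 4)

# Every update for doc-1 sits in one partition, in the order it was sent.
doc1_partition = partitions[partition_for("doc-1", 4)]
doc1_versions = [payload for doc_id, payload in doc1_partition if doc_id == "doc-1"]
```

Because the key alone determines the partition, ordering holds per document but not across documents that land on different partitions.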

Re: Versioning Schema's

2013-06-14 Thread David Arthur
I've done this in the past, and it worked out well. Stored Avro schema in ZooKeeper with an integer id and prefixed each message with the id. You have to make sure when you register a new schema that it resolves with the current version (ResolvingDecoder helps with this). -David On 6/13/13 4:
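
A rough sketch of the id-prefix framing described above, under stated assumptions: the message does not say how wide the id is, so a 4-byte big-endian integer is assumed here. A plain dict stands in for the schema store in ZooKeeper, and `frame`, `unframe`, and `SCHEMA_REGISTRY` are hypothetical names. Avro encoding and schema resolution are left out.

```python
import struct

# Stand-in for the schema store (ZooKeeper in the message above); contents are hypothetical.
SCHEMA_REGISTRY = {1: '{"type": "string"}'}

# Assumption: the schema id is written as a 4-byte big-endian unsigned integer.
HEADER = struct.Struct(">I")


def frame(schema_id: int, payload: bytes) -> bytes:
    """Prefix an already-encoded payload with its schema id."""
    return HEADER.pack(schema_id) + payload


def unframe(message: bytes):
    """Split a framed message into (schema_id, payload)."""
    (schema_id,) = HEADER.unpack_from(message)
    return schema_id, message[HEADER.size:]


# A consumer reads the id first, then looks up the writer's schema to decode the payload.
schema_id, payload = unframe(frame(1, b"hello"))
writer_schema = SCHEMA_REGISTRY[schema_id]
```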

Re: Using Kafka for "data" messages

2013-06-14 Thread Archie Cowan
Hi there, I'm very new to Kafka but am also keen on this use case. Is the number of topics just limited to the underlying filesystem's constraints on the number of files in 1 directory? There are other filesystems out there that have practical limits in the range of millions (though programs like '

Re: Using Kafka for "data" messages

2013-06-14 Thread Mahendra M
Hi Josh, Thanks for clarifying the use case. The idea is good, but I see the following three issues: 1. Creating a queue for each user. There could be limits on this. 2. Removing old queues. 3. If the same user logs in from multiple browsers, things get a bit more complex. Can I suggest

Re: Kafka 0.8 Maven and IntelliJ

2013-06-14 Thread Jeff Liu
My Idea 12.1.4 with sbt-idea plugin version works. On Fri, Jun 14, 2013 at 12:07 PM, Jun Rao wrote: > Thanks. Which version of Intellij are you using? > > Jun > > > On Thu, Jun 13, 2013 at 10:20 AM, Dragos Manolescu < > dragos.manole...@servicenow.com> wrote: > > > Hmm, I've just pulled 0.8.