How to achieve distributed processing and high availability simultaneously in Kafka?

sumit jain Tue, 05 May 2015 22:58:41 -0700

I have a topic consisting of n partitions. To distribute processing, I
run two processes on different machines. Both subscribe to the topic with
the same group id, and each allocates n/2 threads, each of which processes
a single stream (n/2 partitions per process).


With this I have achieved load distribution, but if process 1 crashes,
then process 2 cannot consume messages from the partitions allocated to
process 1, as it listened on only n/2 streams at the start.

Alternatively, if I configure for HA and start n threads/streams on both
processes, then when one node fails, the other node will process all
partitions. But this compromises distribution, since all partitions are
processed by a single node at any given time.
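
To make the two setups concrete, here is a minimal sketch of how n partitions might be spread over the consumer threads of one group. It is a simplified, range-style model written in plain Python, not the Kafka client API. The names `assign`, `thread_ids`, `partitions_per_process`, `procA` and `procB` are hypothetical, and n = 8 is only an example value. It models the steady state only, not what happens after a crash.

```python
def assign(num_partitions, threads):
    """Range-style split (an assumed model): sort the thread ids, then give
    each thread a contiguous block of partitions, earlier threads first."""
    threads = sorted(threads)
    base, extra = divmod(num_partitions, len(threads))
    result, start = {}, 0
    for i, thread in enumerate(threads):
        count = base + (1 if i < extra else 0)
        result[thread] = list(range(start, start + count))
        start += count
    return result


def thread_ids(process, count):
    """Hypothetical thread ids of the form '<process>-<index>'."""
    return [f"{process}-{i:02d}" for i in range(count)]


def partitions_per_process(assignment):
    """Count how many partitions each process ends up owning."""
    totals = {}
    for thread, parts in assignment.items():
        process = thread.split("-")[0]
        totals[process] = totals.get(process, 0) + len(parts)
    return totals


n = 8  # example partition count

# Setup 1: n/2 threads on each process.
split = assign(n, thread_ids("procA", n // 2) + thread_ids("procB", n // 2))

# Setup 2: n threads on each process.
full = assign(n, thread_ids("procA", n) + thread_ids("procB", n))

print(partitions_per_process(split))
print(partitions_per_process(full))
```

Under this model, the first setup gives each process n/2 partitions, while the second leaves every partition on whichever process sorts first and the other process idle.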

Is there a way to achieve both simultaneously, and if so, how?
Note: I have already asked this on Stack Overflow:
http://stackoverflow.com/questions/30060261/how-to-achieve-distributed-processing-and-high-availability-simultaneously-in-ka
-- 
Thanks & Regards,
Sumit Jain
