A consumer thread can consume multiple partitions. This is not unusual in practice.
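As a rough illustration of one thread owning several partitions, here is a small simulation of how the partitions of a topic might be divided among the consumer threads of a group. It is not Kafka code and calls no Kafka API. The function name `assign_partitions`, the thread ids, and n = 8 are made up for the example, and the contiguous-range split is a simplified assumption about how a range-style assignment behaves.

```python
def assign_partitions(num_partitions, thread_ids):
    """Split partitions 0..num_partitions-1 into contiguous ranges, one per thread.

    Simplified model (an assumption, not the Kafka implementation): threads are
    sorted by id, and any remainder partitions go to the first threads.
    """
    threads = sorted(thread_ids)
    base, extra = divmod(num_partitions, len(threads))
    assignment = {}
    start = 0
    for i, thread in enumerate(threads):
        count = base + (1 if i < extra else 0)
        assignment[thread] = list(range(start, start + count))
        start += count
    return assignment


n = 8  # hypothetical partition count

# Two processes in the same group, n/2 threads each: one partition per thread.
both_alive = assign_partitions(
    n, [f"p{p}-t{t}" for p in (1, 2) for t in range(n // 2)]
)

# Process 1 is gone; the n/2 threads of process 2 now share all n partitions.
after_crash = assign_partitions(n, [f"p2-t{t}" for t in range(n // 2)])

for thread, partitions in after_crash.items():
    print(thread, partitions)
```

In this model every thread owns one partition while both processes are alive, and each surviving thread owns two once a single process is left, with no change to the thread count.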
In the example you gave, if multiple high-level consumers are using the same group id, they will automatically rebalance the partition assignment between them as consumers dynamically join and leave the group. So, in your example, if process 1 dies, then process 2 will assume ownership of all n partitions (and if it has n/2 threads, each thread will own 2 of the partitions).

In my experience though, it's generally fine to have fewer threads than partitions. It depends on the volume of data coming in to each partition, of course, and on how long the consumer takes to process each message.

Jason

On Wed, May 6, 2015 at 1:57 AM, sumit jain <sumitjai...@gmail.com> wrote:

> I have a topic consisting of n partitions. To have distributed processing I
> create two processes running on different machines. They subscribe to the
> topic with the same group id and allocate n/2 threads, each of which processes
> a single stream (n/2 partitions per process).
>
> With this I will have achieved load distribution, but now if process 1
> crashes, then process 2 cannot consume messages from partitions allocated
> to process 1, as it listened only on n/2 streams at the start.
>
> Or else, if I configure for HA and start n threads/streams on both
> processes, then when one node fails, all partitions will be processed by
> the other node. But here, we have compromised distribution, as all partitions
> will be processed by a single node at a time.
>
> Is there a way to achieve both simultaneously and how?
> Note: Already asked this on stackoverflow
>
> http://stackoverflow.com/questions/30060261/how-to-achieve-distributed-processing-and-high-availability-simultaneously-in-ka
> .
> --
> Thanks & Regards,
> Sumit Jain