Thanks Martin.
I got it. The design was chosen for performance. Would there be any harm in 
having several consumers consume from the same partition, if I can tolerate 
the slowness/performance degradation?

Regards
Bala

-----Original Message-----
From: Martin Kleppmann [mailto:mkleppm...@linkedin.com] 
Sent: Wednesday, March 05, 2014 7:52 PM
To: <users@kafka.apache.org>
Subject: Re: Reg Partition

Hi Bala,

The way Kafka works, each partition is a sequence of messages in the order that 
they were produced, and each message has a position (offset) in this sequence. 
Kafka brokers don't keep track of which consumer has seen which messages. 
Instead, each consumer keeps track of the latest offset it has seen: because 
they are consumed in sequential order, all messages with a smaller offset have 
been consumed, and all messages with a greater offset have not yet been 
consumed. This is explained in detail here: 
http://kafka.apache.org/documentation.html#theconsumer
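
To make the offset idea concrete, here is a minimal sketch in plain Python. It 
is a toy model, not Kafka's client API: the Partition and Consumer classes and 
their methods are hypothetical names chosen for illustration. The point it 
shows is that the partition holds no per-consumer state, and the consumer only 
remembers the next offset to read.

```python
class Partition:
    """An append-only sequence of messages; the list index is the offset."""

    def __init__(self):
        self.messages = []

    def append(self, message):
        self.messages.append(message)
        return len(self.messages) - 1  # offset of the message just written

    def read(self, offset):
        # Return the message at this offset, or None if nothing is there yet.
        return self.messages[offset] if offset < len(self.messages) else None


class Consumer:
    """Tracks its own position; the partition knows nothing about it."""

    def __init__(self, partition):
        self.partition = partition
        self.next_offset = 0  # everything below this offset has been consumed

    def poll(self):
        message = self.partition.read(self.next_offset)
        if message is not None:
            self.next_offset += 1
        return message


partition = Partition()
for text in ["a", "b", "c"]:
    partition.append(text)

consumer = Consumer(partition)
seen = [consumer.poll(), consumer.poll()]
```

After two polls the consumer's next_offset is 2, which is all it needs to 
store in order to resume from the right place.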

If you wanted to have several consumers consume from the same partition, they 
would have to keep communicating in order to know which one has processed which 
messages (otherwise they'd end up processing the same message twice). This 
would be extremely inefficient.

It's much easier and much more performant to assign each partition to only one 
consumer within a consumer group, so each consumer only needs to keep track of 
its own partition offsets. A consequence of that design is that a group cannot 
have more active consumers than partitions; any extra consumers would sit idle.
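
The consequence can be sketched with a toy assignment function. This is an 
illustration only: assign_partitions is a hypothetical helper, and the 
round-robin rule is an assumption made for the example, not necessarily the 
strategy Kafka itself uses. What matters is the constraint that each partition 
goes to exactly one consumer.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer (round-robin, assumed)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Three partitions, four consumers: one consumer is left with nothing to read.
result = assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"])
idle = [consumer for consumer, owned in result.items() if not owned]
```

With three partitions and four consumers, the fourth consumer ends up with an 
empty list, which is why adding consumers beyond the partition count does not 
add parallelism.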

Martin

On 5 Mar 2014, at 10:13, Balasubramanian Jayaraman (Contingent) 
<balasubramanian.jayara...@autodesk.com> wrote:

> Hi
> 
> I have a question about parallelism. Why is the number of parallel consumers 
> consuming messages from a topic limited by the number of partitions 
> configured for that topic?
> Why should this be the case? Why should the partition count affect the 
> number of parallel consumers?
> 
> Thanks
> Bala
