Hello,

I am trying to come up with a design for consuming from Kafka. I am using
Kafka version 0.8.1.1. I am thinking of a design where a consumer is
created every few seconds, consumes data from Kafka, processes it, and then
quits after committing its offsets. At any point in time I expect 250-300
consumers to be active (running as thread pools on different machines).
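
Roughly, each run would look like the following (a minimal sketch against
the 0.8 high-level consumer API; the ZooKeeper address, topic name, group
id, and the process() step are placeholders):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class ShortLivedConsumer {

    public static void runOnce() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");   // placeholder ZooKeeper address
        props.put("group.id", "my-shared-group");     // all short-lived consumers share this group
        props.put("auto.commit.enable", "false");     // commit explicitly before quitting
        props.put("consumer.timeout.ms", "2000");     // stop blocking when no more messages arrive

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        try {
            // One stream for this thread; creating the streams triggers a group rebalance.
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
            ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();

            try {
                while (it.hasNext()) {
                    MessageAndMetadata<byte[], byte[]> record = it.next();
                    process(record.message());        // placeholder processing step
                }
            } catch (ConsumerTimeoutException e) {
                // No message within consumer.timeout.ms -- finish this run.
            }

            connector.commitOffsets();                // commit offsets for everything consumed in this run
        } finally {
            connector.shutdown();                     // leave the group, which triggers another rebalance
        }
    }

    private static void process(byte[] payload) {
        // real processing logic goes here
    }
}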

1. How and when does a rebalance of partitions happen?

2. How costly is the rebalancing of partitions among the consumers? I
expect a consumer to finish or join the same consumer group every few
seconds, so I want to understand the overhead and latency of a rebalancing
operation.

3. Say consumer C1 has partitions P1, P2, and P3 assigned to it and is
processing a message M1 from partition P1. Now consumer C2 joins the
group. How are the partitions divided between C1 and C2? Is there a
possibility that C1's commit for M1 (which might take some time to reach
Kafka) will be rejected, so that M1 is treated as a fresh message and
delivered to another consumer? (I know Kafka is an at-least-once delivery
model, but I wanted to confirm whether a repartition could by any chance
cause redelivery of the same message.)


Thanks,
Dinesh
