Hi Bill,

You don't need to match the number of threads to the number of partitions in
a topic. For example, if topic1 has 3 partitions and you set only 2 threads,
then ideally one thread will receive 2 partitions and the other thread will
receive the remaining one. The exact split depends on how Kafka itself
assigns partitions, but in either case the data will not be lost.

However, there is no benefit in setting the thread count higher than the
partition count: within a consumer group, each partition can be consumed by
only one consumer thread, so the extra threads will sit idle.
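
To make the two cases concrete, here is a small Python sketch that splits
partitions across threads in contiguous ranges. It is only a simplified model
for illustration, not Kafka's actual rebalancing code, and the names
range_assign and idle_threads are made up for this example.

```python
def range_assign(num_partitions, num_threads):
    """Split partitions 0..num_partitions-1 into contiguous blocks, one per thread.

    Simplified model: the first (num_partitions % num_threads) threads
    each take one extra partition.
    """
    base, extra = divmod(num_partitions, num_threads)
    assignment = {}
    start = 0
    for thread in range(num_threads):
        count = base + (1 if thread < extra else 0)
        assignment[thread] = list(range(start, start + count))
        start += count
    return assignment


def idle_threads(assignment):
    """Return the threads that ended up with no partitions."""
    return [thread for thread, parts in assignment.items() if not parts]


# topic1 from the question: 3 partitions, 2 threads.
fewer_threads = range_assign(3, 2)
print(fewer_threads)                # every partition is still covered

# Same topic with 4 threads: more threads than partitions.
more_threads = range_assign(3, 4)
print(idle_threads(more_threads))   # threads left with nothing to read
```

In this model every partition always has exactly one owner, which is why
using fewer threads costs parallelism rather than data, while any thread
beyond the partition count has nothing to consume.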


2015-05-19 7:46 GMT+08:00 Bill Jay <bill.jaypeter...@gmail.com>:

> Hi all,
>
> I am reading the docs for the receiver-based Kafka consumer. The last
> parameter of KafkaUtils.createStream is the per-topic number of Kafka
> partitions to consume. My question is: does the number of partitions given
> for a topic in this parameter need to match the number of partitions in Kafka?
>
> For example, I have two topics, topic1 with 3 partitions and topic2 with 4
> partitions.
>
> If I specify 2 for topic1 and 3 for topic2 and feed them to the
> createStream function, will there be data loss? Or will it just be
> inefficient?
>
> Thanks!
>
> Bill
>
