Hi Stefan,

Have you looked at the following output for message distribution
across the topic-partitions and which topic-partition is consumed by
which consumer thread?

kafaka-server/bin>./kafka-run-class.sh
kafka.tools.ConsumerOffsetChecker --zkconnect localhost:2181 --group
<consumer_group_name>

Jagbir

On Wed, Jul 15, 2015 at 12:50 PM, Stefan Miklosovic
<mikloso...@gmail.com> wrote:
> I have following problem, I tried almost everything I could but without any 
> luck
>
> All I want to do is to have 1 producer, 1 topic, 10 partitions and 10 
> consumers.
>
> All I want is to send 1M of messages via producer to these 10 consumers.
>
> I am using built Kafka 0.8.3 from current upstream so I have bleeding
> edge stuff. It does not work on 0.8.1.1 nor 0.8.2 stream.
>
> The problem I have is that I expect that when I send 1 milion of
> messages via that producer, I will have all consumers busy. In other
> words, if a message to be sent via producer is sent to partition
> randomly (roundrobin / range), I expect that all 10 consumers will
> process about 100k of messages each because producer sends it to
> random partition of these 10.
>
> But I have never achieved such outcome.
>
> I was trying these combinations:
>
> 1) old scala producer vs old scala consumer
>
> Consumer was created by Consumers.createJavaConsumer() ten times.
> Every consumer is running in the separate thread.
>
> 2) old scala producer vs new java consumer
>
> new consumer was used like I have 10 consumers listening for a topic
> and 10 consumers subscribed to 1 partition. (consumer 1 - partition 1,
> consumer 2 - paritition 2 and so on)
>
> 3) old scala producer with custom partitioner
>
> I even tried to use my own partitioner, I just generated a random
> number from 0 to 9 so I expected that the messages will be sent
> randomly to the partition of that number.
>
> All I see is that there are only couple of consumers from these 10
> utilized, even I am sending 1M of messages, all I got from the
> debugging output is some preselected set of consumers which appear to
> be selected randomly.
>
> Do you have ANY hint why all consumers are not utilized even
> partitions are selected randomly?
>
> My initial suspicion was that rebalancing was done badly. The think
> was I was generating old consumers in a loop quicky one after another
> and I can imaging that rebalancing algorithm got mad.
>
> So I abandon this solution and I was thinking that let's just
> subscribe these consumers one by one to some partition so I will have
> 1 consumer subscribed just to 1 partition and there will not be any
> rebalancing at all.
>
> Oh my how wrong was I ... nothing changed.
>
> So I was thinking that if I have 10 consumers, each one subscribed to
> 1 paritition, maybe producer is just sending messages to some set of
> partitions and that's it. I  was not sure how this can be possible so
> to be super sure about the even spreading of message to partitions, I
> used custom partitioner class in old consumer so I will be sure that
> the partition the message will be sent to is super random.
>
> But that does not seems to work either.
>
> Please people, help me.
>
> --
> Stefan Miklosovic

Reply via email to