Hi Guozhang,

Thank you!

Could I say that the consumers 'taking turns to consume' is the result of
their corresponding partitions receiving the message writes?

The problem I am facing is that my 'enrichment' consumer (it fetches more data
based on the raw data) takes too much time to consume a single message. To get
more parallelism, could I say my only choice is to decouple consumer
consumption from enrichment?
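
By decoupling I mean roughly the following. This is only a sketch in plain
Java with no Kafka classes: the list of strings stands in for one consumer
stream, and enrich, consumeAndEnrich, the 10 ms delay and the pool sizes are
all made-up names and values for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class DecoupledEnrichment {

    // Hypothetical stand-in for the slow enrichment lookup.
    static String enrich(String raw) {
        try {
            Thread.sleep(10); // pretend the lookup is slow
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return raw + ":enriched";
    }

    // Reads messages on the calling (consumer) thread and hands each one
    // to a fixed-size worker pool, so enrichment no longer blocks consumption.
    static List<String> consumeAndEnrich(Iterable<String> messages, int workers) {
        // Bounded queue plus CallerRunsPolicy: when the workers fall behind,
        // the consumer thread runs the task itself, which slows consumption
        // instead of letting the backlog grow without limit.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(workers * 2),
                new ThreadPoolExecutor.CallerRunsPolicy());
        ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<String>();

        for (String raw : messages) {
            pool.execute(() -> results.add(enrich(raw)));
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("enrichment did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new ArrayList<String>(results);
    }

    public static void main(String[] args) {
        List<String> out = consumeAndEnrich(Arrays.asList("m1", "m2", "m3", "m4"), 2);
        System.out.println(out.size() + " messages enriched");
    }
}
```

As far as I understand, the trade-off is that results may complete out of
order within a partition, and with auto-commit an offset could be committed
before its message has finished enrichment.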

Mingtao

> On Aug 12, 2014, at 1:10 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> 
> Hello Mingtao,
> 
> A partition will not be re-assigned to other consumers unless the current
> consumer fails, so the behavior you described is not expected.
> 
> Guozhang
> 
> 
> On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang <mail2ming...@gmail.com>
> wrote:
> 
>> Hi Guozhang,
>> 
>> I do have another email thread about partitions per topic. I have pasted it
>> into this email.
>> 
>> I am expecting those consumers to work concurrently. The behavior I
>> observe is that consumer thread-1 works for a while, then thread-3
>> works, then thread-0 ... Is this normal?
>> 
>> version is 2.2.0.
>> 
>> Best Regards,
>> Mingtao
>> 
>>> On Wed, Jul 23, 2014 at 7:57 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>> 
>>> num.partitions is only used as a default value when the createTopic command
>>> does not specify the number of partitions, or when the topic is created
>>> automatically. In your case, since you always use its value in
>>> createTopic, you will always get one partition. Try changing your code to
>>> something like:
>>> 
>>>        String[] args = new String[]{
>>>            "--zookeeper", config.getString("zookeeper"),
>>>            "--topic", config.getString("topic"),
>>>            "--replica", config.getString("replicas"),
>>>            "--partition", "8"
>>>        };
>>> 
>>>        CreateTopicCommand.main(args);
>>> 
>>> 
>>> 
>>> On Wed, Jul 23, 2014 at 4:38 PM, Mingtao Zhang <mail2ming...@gmail.com>
>>> wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> In kafka.properties, I put (forgot to change):
>>>> 
>>>> num.partitions=1
>>>> 
>>>> While I create topics programmatically:
>>>> 
>>>>        String[] args = new String[]{
>>>>            "--zookeeper", config.getString("zookeeper"),
>>>>            "--topic", config.getString("topic"),
>>>>            "--replica", config.getString("replicas"),
>>>>            "--partition", config.getString("partitions")
>>>>        };
>>>> 
>>>>        CreateTopicCommand.main(args);
>>>> 
>>>> The performance engineer told me only one consumer thread is actively
>>>> working even though I have 4 consumer threads started (visible when
>>>> debugging or in a thread dump) and 4 partitions configured from the args.
>>>> 
>>>> It seems that num.partitions is still controlling the parallelism. Do I
>>>> need to change num.partitions accordingly? Could I remove it? What if
>>>> I have different parallelism requirements for different topics?
>>>> 
>>>> Thank you in advance!
>>>> 
>>>> Best Regards,
>>>> Mingtao
>> 
>> 
>>> On Mon, Aug 11, 2014 at 7:37 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>> 
>>> Mingtao,
>>> 
>>> How many partitions does the consumed topic have? Basically the data is
>>> distributed per-partition, and hence if the number of consumers is larger
>>> than the number of partitions, some consumers will not get any data.
>>> 
>>> Guozhang
>>> 
>>> 
>>> On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang <mail2ming...@gmail.com>
>>> wrote:
>>> 
>>>> Is this in any way related to the issue?
>>>> 
>>>> WARN No previously checkpointed highwatermark value found for topic RAW
>>>> partition 0. Returning 0 as the highwatermark
>>>> (kafka.server.HighwaterMarkCheckpoint)
>>>> 
>>>> Mingtao
>>> 
>>> 
>>> 
>>> --
>>> -- Guozhang
> 
> 
> 
> -- 
> -- Guozhang
