Hi Guozhang, Thank you!
Could I say the consumer 'take turns to consume' is resulted by the correspond partition got the 'message write'? The problem I am facing is my 'enrichment' (getting more data based on raw data) consumer took too much time to complete one message consumption. To explore more parallel, could I say my only choice is 'decouple consumer consumption with enrichment'? Mingtao Sent from iPhone > On Aug 12, 2014, at 1:10 AM, Guozhang Wang <wangg...@gmail.com> wrote: > > Hello Mingtao, > > The partition will not be re-assigned to other consumers unless the current > consumer fails, so the behavior you described will not be expected. > > Guozhang > > > On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang <mail2ming...@gmail.com> > wrote: > >> Hi Guozhang, >> >> I do have another Email talking about Partitions per topic. I paste it >> within this Email. >> >> I am expecting those consumers will work concurrently. The behavior I >> observed here is consumer thread-1 will work a while, then thread-3 will >> work, then thread-0 ..., is it normal? >> >> version is 2.2.0. >> >> Best Regards, >> Mingtao >> >>> On Wed, Jul 23, 2014 at 7:57 PM, Guozhang Wang <wangg...@gmail.com> wrote: >>> >>> num.partitions is only used as a default value when the createTopic >> command >>> does not specify the num.partitions or it is automatically created. In >> your >>> case since you always use its value in the createTopic you will always >> can >>> one partition. Try change your code to sth. like: >>> >>> String[] args = new String[]{ >>> "--zookeeper", config.getString("zookeeper"), >>> "--topic", config.getString("topic"), >>> "--replica", config.getString("replicas"), >>> "--partition", "8" >>> }; >>> >>> CreateTopicCommand.main(args); >>> >>> >>> >>> On Wed, Jul 23, 2014 at 4:38 PM, Mingtao Zhang <mail2ming...@gmail.com> >>> wrote: >>> >>>> Hi All, >>>> >>>> In kafka.properties, I put (forgot to change): >>>> >>>> num.partitions=1 >>>> >>>> While I create topics programatically: >>>> >>>> String[] args = new String[]{ >>>> "--zookeeper", config.getString("zookeeper"), >>>> "--topic", config.getString("topic"), >>>> "--replica", config.getString("replicas"), >>>> "--partition", config.getString("partitions") >>>> }; >>>> >>>> CreateTopicCommand.main(args); >>>> >>>> The performance engineer told me only one consumer thread is actively >>>> working even I have 4 consumer threads started (could see when >> debugging >>> or >>>> in thread dump); and 4 partitions configured from the args. >>>> >>>> It seems that num.partitions is still controlling the parallelism. Do I >>>> need to change this num.partitions accordingly? Could I remove it? What >>> is >>>> I have different parallel requirement for different topic? >>>> >>>> Thank you in advance! >>>> >>>> Best Regards, >>>> Mingtao >> >> >>> On Mon, Aug 11, 2014 at 7:37 PM, Guozhang Wang <wangg...@gmail.com> wrote: >>> >>> Mingtao, >>> >>> How many partitions of the consumed topic has? Basically the data is >>> distributed per-partition, and hence if the number of consumers is larger >>> than the number of partitions, some consumers will not get any data. >>> >>> Guozhang >>> >>> >>> On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang <mail2ming...@gmail.com> >>> wrote: >>> >>>> Is it anyhow related to the issue? >>>> >>>> WARN No previously checkpointed highwatermark value found for topic RAW >>>> partition 0. Returning 0 as the highwatermark >>>> (kafka.server.HighwaterMarkCheckpoint) >>>> >>>> Mingtao >>> >>> >>> >>> -- >>> -- Guozhang > > > > -- > -- Guozhang