Hi Guozhang,

I think what I am looking for is the real 'randomness' when producer write
to the partitions. Based on my log, through a long time period, only one
partition got the write, while the other side, only one consumer is active.
In my case the consumer is slow, so when it comes back for the next
message, the whole pipeline is slowed down.

The 'round robin' works for my case. Is it Email a good thread to follow?
http://mail-archives.apache.org/mod_mbox/kafka-users/201312.mbox/%3CCANZY1Phu8_QZqHCw-ivGp=mxmzpze9+2vjxrlewzzebjor0...@mail.gmail.com%3E

Best Regards,
Mingtao


On Tue, Aug 12, 2014 at 7:55 AM, Mingtao Zhang <mail2ming...@gmail.com>
wrote:

> Hi Guozhang,
>
> Thank you!
>
> Could I say the consumer 'take turns to consume' is resulted by the
> correspond partition got the 'message write'?
>
> The problem I am facing is my 'enrichment' (getting more data based on raw
> data) consumer took too much time to complete one message consumption. To
> explore more parallel, could I say my only choice is 'decouple consumer
> consumption with enrichment'?
>
> Mingtao Sent from iPhone
>
> > On Aug 12, 2014, at 1:10 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > Hello Mingtao,
> >
> > The partition will not be re-assigned to other consumers unless the
> current
> > consumer fails, so the behavior you described will not be expected.
> >
> > Guozhang
> >
> >
> > On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang <mail2ming...@gmail.com>
> > wrote:
> >
> >> Hi Guozhang,
> >>
> >> I do have another Email talking about Partitions per topic. I paste it
> >> within this Email.
> >>
> >> I am expecting those consumers will work concurrently. The behavior I
> >> observed here is consumer thread-1 will work a while, then thread-3 will
> >> work, then thread-0 ..., is it normal?
> >>
> >> version is 2.2.0.
> >>
> >> Best Regards,
> >> Mingtao
> >>
> >>> On Wed, Jul 23, 2014 at 7:57 PM, Guozhang Wang <wangg...@gmail.com>
> wrote:
> >>>
> >>> num.partitions is only used as a default value when the createTopic
> >> command
> >>> does not specify the num.partitions or it is automatically created. In
> >> your
> >>> case since you always use its value in the createTopic you will always
> >> can
> >>> one partition. Try change your code to sth. like:
> >>>
> >>>        String[] args = new String[]{
> >>>            "--zookeeper", config.getString("zookeeper"),
> >>>            "--topic", config.getString("topic"),
> >>>            "--replica", config.getString("replicas"),
> >>>            "--partition", "8"
> >>>        };
> >>>
> >>>        CreateTopicCommand.main(args);
> >>>
> >>>
> >>>
> >>> On Wed, Jul 23, 2014 at 4:38 PM, Mingtao Zhang <mail2ming...@gmail.com
> >
> >>> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> In kafka.properties, I put (forgot to change):
> >>>>
> >>>> num.partitions=1
> >>>>
> >>>> While I create topics programatically:
> >>>>
> >>>>        String[] args = new String[]{
> >>>>            "--zookeeper", config.getString("zookeeper"),
> >>>>            "--topic", config.getString("topic"),
> >>>>            "--replica", config.getString("replicas"),
> >>>>            "--partition", config.getString("partitions")
> >>>>        };
> >>>>
> >>>>        CreateTopicCommand.main(args);
> >>>>
> >>>> The performance engineer told me only one consumer thread is actively
> >>>> working even I have 4 consumer threads started (could see when
> >> debugging
> >>> or
> >>>> in thread dump); and 4 partitions configured from the args.
> >>>>
> >>>> It seems that num.partitions is still controlling the parallelism. Do
> I
> >>>> need to change this num.partitions accordingly? Could I remove it?
> What
> >>> is
> >>>> I have different parallel requirement for different topic?
> >>>>
> >>>> Thank you in advance!
> >>>>
> >>>> Best Regards,
> >>>> Mingtao
> >>
> >>
> >>> On Mon, Aug 11, 2014 at 7:37 PM, Guozhang Wang <wangg...@gmail.com>
> wrote:
> >>>
> >>> Mingtao,
> >>>
> >>> How many partitions of the consumed topic has? Basically the data is
> >>> distributed per-partition, and hence if the number of consumers is
> larger
> >>> than the number of partitions, some consumers will not get any data.
> >>>
> >>> Guozhang
> >>>
> >>>
> >>> On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang <mail2ming...@gmail.com
> >
> >>> wrote:
> >>>
> >>>> Is it anyhow related to the issue?
> >>>>
> >>>> WARN No previously checkpointed highwatermark value found for topic
> RAW
> >>>> partition 0. Returning 0 as the highwatermark
> >>>> (kafka.server.HighwaterMarkCheckpoint)
> >>>>
> >>>> Mingtao
> >>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >
> >
> >
> > --
> > -- Guozhang
>

Reply via email to