With option 1) I can't really use 8 streams in each consumer; if I do, only one consumer seems to be doing all the work. So I actually had to use 8 streams in total, with 4 for each consumer.
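
A rough Python model of the per-topic assignment Neha described further down the thread reproduces this behaviour. It is only a sketch: the function names and the thread-id format are hypothetical, and it assumes each topic's partitions are handed out in contiguous ranges to consumer threads in sorted-id order, which may not match the actual consumer code exactly.

```python
def thread_ids(consumers, num_streams):
    """Hypothetical thread ids, zero-padded so that they sort predictably."""
    return [f"{c}-{i:02d}" for c in consumers for i in range(num_streams)]


def assign_per_topic(partitions_per_topic, threads):
    """Assign partitions to threads one topic at a time (assumed range style)."""
    threads = sorted(threads)
    owners = {}
    for topic, n_parts in sorted(partitions_per_topic.items()):
        # Each thread gets `base` partitions; the first `extra` threads get one more.
        base, extra = divmod(n_parts, len(threads))
        for i, thread in enumerate(threads):
            start = base * i + min(i, extra)
            count = base + (1 if i < extra else 0)
            for partition in range(start, start + count):
                owners[(topic, partition)] = thread
    return owners


def active_per_consumer(owners):
    """Count how many threads of each consumer own at least one partition."""
    active = {}
    for thread in set(owners.values()):
        consumer = thread.rsplit("-", 1)[0]
        active[consumer] = active.get(consumer, 0) + 1
    return active


if __name__ == "__main__":
    four_topics_8 = {f"topic{t}": 8 for t in range(4)}
    four_topics_16 = {f"topic{t}": 16 for t in range(4)}
    consumers = ["consumer1", "consumer2"]

    # 8 streams per consumer, 8 partitions per topic
    print(active_per_consumer(assign_per_topic(four_topics_8, thread_ids(consumers, 8))))
    # 4 streams per consumer, 8 partitions per topic
    print(active_per_consumer(assign_per_topic(four_topics_8, thread_ids(consumers, 4))))
    # 8 streams per consumer, 16 partitions per topic
    print(active_per_consumer(assign_per_topic(four_topics_16, thread_ids(consumers, 8))))
```

Under these assumptions, 8 streams per consumer with 8-partition topics leaves every partition on consumer1's threads, 4 streams per consumer splits the work evenly, and 16-partition topics keep all 16 threads busy.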

On Fri, Aug 30, 2013 at 12:01 AM, Jun Rao <[email protected]> wrote:

> The drawback of 2), as you said, is no auto failover. I was suggesting that
> you use 16 partitions. Then you can use option 1) with 8 streams in each
> consumer.
>
> Thanks,
>
> Jun
>
> On Thu, Aug 29, 2013 at 8:51 PM, Rajasekar Elango <[email protected]> wrote:
>
> > Hi Jun,
> >
> > If you read my previous posts, based on the current rebalancing logic, if
> > we consume from a topic filter, the consumer cannot actively use all
> > streams. Can you provide your recommendation of option 1 vs option 2 in
> > my previous post?
> >
> > Thanks,
> > Raja.
> >
> > On Thu, Aug 29, 2013 at 11:42 PM, Jun Rao <[email protected]> wrote:
> >
> > > You can always use more partitions to get more parallelism in the
> > > consumers.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Aug 29, 2013 at 12:44 PM, Rajasekar Elango <[email protected]> wrote:
> > >
> > > > So what is the best way to load balance multiple consumers consuming
> > > > from a topic filter?
> > > >
> > > > Let's say we have 4 topics with 8 partitions each and 2 consumers.
> > > >
> > > > Option 1) To load balance consumers, we can set num.streams=4 so that
> > > > both consumers split 8 partitions, but can only use half of the
> > > > consumer streams.
> > > >
> > > > Option 2) Configure mutually exclusive topic filter regexes such that
> > > > 2 topics will match consumer1 and 2 topics will match consumer2. Now
> > > > we can set num.streams=8 and fully utilize consumer streams. I believe
> > > > this will improve performance, but if a consumer dies, we will not get
> > > > any data from the topics used by that consumer.
> > > >
> > > > What would be your recommendation?
> > > >
> > > > Thanks,
> > > > Raja.
> > > >
> > > > On Thu, Aug 29, 2013 at 12:42 PM, Neha Narkhede <[email protected]> wrote:
> > > >
> > > > > >> 2) When I started mirrormaker with num.streams=16, looks like 16
> > > > > >> consumer threads were created, but only 8 are showing up as active
> > > > > >> as owner in consumer offset tracker and all topics/partitions are
> > > > > >> distributed between 8 consumer threads.
> > > > >
> > > > > This is because currently the consumer rebalancing process of
> > > > > assigning partitions to consumer streams is at a per-topic level.
> > > > > Unless you have at least one topic with 16 partitions, the remaining
> > > > > 8 threads will not do any work. This is not ideal and we want to look
> > > > > into a better rebalancing algorithm. However, it is a big change and
> > > > > we prefer doing it as part of the consumer client rewrite.
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > >
> > > > > On Thu, Aug 29, 2013 at 8:03 AM, Rajasekar Elango <[email protected]> wrote:
> > > > >
> > > > > > So my understanding is that the number of active streams a consumer
> > > > > > can utilize is the number of partitions in the topic. This is fine
> > > > > > if we consume from a specific topic. But if we consume from a
> > > > > > TopicFilter, I thought the consumer should be able to utilize
> > > > > > (number of topics that match the filter * number of partitions in a
> > > > > > topic). But it looks like the number of streams a consumer can use
> > > > > > is limited by just the number of partitions in a topic, although
> > > > > > it's consuming from multiple topics.
> > > > > >
> > > > > > Here is what I observed with 1 mirrormaker consuming from whitelist
> > > > > > '.+'.
> > > > > >
> > > > > > The whitelist matches 5 topics and each topic has 8 partitions.
> > > > > > I used consumer offset checker to look at the owner of each
> > > > > > topic/partition.
> > > > > >
> > > > > > 1) When I started mirrormaker with num.streams=8, all
> > > > > > topics/partitions are distributed between 8 consumer threads.
> > > > > >
> > > > > > 2) When I started mirrormaker with num.streams=16, looks like 16
> > > > > > consumer threads were created, but only 8 are showing up as active
> > > > > > as owner in consumer offset tracker and all topics/partitions are
> > > > > > distributed between 8 consumer threads.
> > > > > >
> > > > > > So this could be a bottleneck for consumers: although we partitioned
> > > > > > the topic, if we are consuming from a topic filter it can't get much
> > > > > > parallelism out of the number of streams. Am I missing something? Is
> > > > > > there a way to make consumers/mirrormakers utilize more active
> > > > > > streams?
> > > > > >
> > > > > > --
> > > > > > Thanks,
> > > > > > Raja.
> > > >
> > > > --
> > > > Thanks,
> > > > Raja.
> >
> > --
> > Thanks,
> > Raja.

--
Thanks,
Raja.
