Yeah it will break the existing usage but personally I think it is worth it to be standard across all our tools.
-Jay

On Fri, Feb 27, 2015 at 9:53 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> Hi Jay,
>
> I just modified the KIP. The only concern I have about this change is that
> it will break existing deployments. And we need to change the command line
> argument format for other tools as well. It is definitely better that we
> conform to the unix standard; I am just not sure the change is worth it
> given we have been using this argument format for a while.
>
> Jiangjie (Becket) Qin

On 2/26/15, 8:40 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

> Can we change the command line arguments for mm to match the command line
> arguments elsewhere? This proposal seems to have two formats:
> *--consumer.rebalance.listener*
> and
> *--abortOnSendFail*
> The '.' separators for command line options predate this JIRA but I think
> the new camelCase option is a new invention. All the other command line
> tools, as well as pretty much all of unix, use dashes like this:
> *--consumer-rebalance-listener*
> I don't really know the history of this, but let's move it to normal unix
> dashes across the board as well as examine the options for any other
> inconsistencies.
>
> -Jay

On Thu, Feb 26, 2015 at 11:57 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> Hi Neha,
>
> Thanks for the comment. That's a really good point.
>
> Originally I was thinking about allowing the user to tweak some parameters
> as needed. For example, some users might want to have pipelining enabled
> and can tolerate reordering, some might want to use acks=1 or acks=0, and
> some might want to move forward when an error is encountered in the
> callback. So we don't want to enforce all the settings of no.data.loss.
> Meanwhile we want to make life easier for the users who want no data loss
> so they don't need to set the configs one by one, therefore we created
> this option.
>
> But as you suggested, we can probably make the no.data.loss settings the
> default and remove the --no.data.loss option, so if people want to tweak
> the settings they can just change them; otherwise they get the default
> no-data-loss settings.
>
> I'll modify the KIP.
>
> Thanks.
>
> Jiangjie (Becket) Qin

On 2/26/15, 8:58 AM, "Neha Narkhede" <n...@confluent.io> wrote:

> Hey Becket,
>
> The KIP proposes addition of a --no.data.loss command line option to the
> MirrorMaker. Though when would the user not want that option? I'm
> wondering what the benefit of providing that option is if every user would
> want that for correct mirroring behavior.
>
> Other than that, the KIP looks great!
>
> Thanks,
> Neha

On Wed, Feb 25, 2015 at 3:56 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> For 1), the current design allows you to do it. The customizable message
> handler takes in a ConsumerRecord and spits out a List<ProducerRecord>, so
> you can just give the ProducerRecord a topic different from the
> ConsumerRecord's.
>
> WRT performance, we did some tests in LinkedIn, and the performance looks
> good to us.
>
> Jiangjie (Becket) Qin

On 2/25/15, 3:41 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> wrote:

> Hi Jiangjie,
>
> It might be too late.
> But I wanted to bring up the following use case for adopting the new MM:
>
> 1) Ability to publish messages from a src topic to a different destination
> topic via --overidenTopics=srcTopic:newDestinationTopic
>
> In order to adopt the new MM enhancement, customers will compare the
> performance of the new MM and data quality while running the old MM
> against the same destination cluster in Prd.
>
> Let me know if you agree to that or not. Also, if yes, will you be able to
> provide this feature in the release version?
>
> Thanks,
>
> Bhavesh

On Tue, Feb 24, 2015 at 5:31 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> Sure! Just created the voting thread :)

On 2/24/15, 4:44 PM, "Jay Kreps" <j...@confluent.io> wrote:

> Hey Jiangjie,
>
> Let's do an official vote so that we know what we are voting on and we are
> crisp on what the outcome was. This thread is very long :-)
>
> -Jay

On Tue, Feb 24, 2015 at 2:53 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> I updated the KIP page based on the discussion we had.
>
> Should I launch another vote, or can we consider this mail thread to have
> already included a vote?
>
> Jiangjie (Becket) Qin

On 2/11/15, 5:15 PM, "Neha Narkhede" <n...@confluent.io> wrote:

> Thanks for the explanation, Joel! Would love to see the results of the
> throughput experiment, and I'm a +1 on everything else, including the
> rebalance callback and record handler.
>
> -Neha

On Wed, Feb 11, 2015 at 1:13 PM, Jay Kreps <jay.kreps@gmail.com> wrote:

> Cool, I agree with all that.
>
> I agree about the need for a rebalancing callback.
>
> Totally agree about the record handler.
>
> It would be great to see if a prototype of this is workable.
>
> Thanks guys!
>
> -Jay

On Wed, Feb 11, 2015 at 12:36 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Hey Jay,
>
> Guozhang, Becket and I got together to discuss this and we think:
>
> - It seems that your proposal based on the new consumer and flush call
>   should work.
> - We would likely need to call poll with a timeout that matches the
>   offset commit interval in order to deal with low volume mirroring
>   pipelines.
> - We will still need a rebalance callback to reduce duplicates - the
>   rebalance callback would need to flush and commit offsets.
> - The only remaining question is if the overall throughput is
>   sufficient.
>   I think someone at LinkedIn (I don't remember who) did some experiments
>   with data channel size == 1 and ran into issues. That was not
>   thoroughly investigated though.
> - The addition of flush may actually make this solution viable for the
>   current mirror-maker (with the old consumer). We can prototype that
>   offline and if it works out well we can redo KAFKA-1650 (i.e., refactor
>   the current mirror maker). The flush call and the new consumer didn't
>   exist at the time we did KAFKA-1650, so this did not occur to us.
> - We think the RecordHandler is still a useful small addition for the
>   use-cases mentioned earlier in this thread.
>
> Thanks,
>
> Joel

On Wed, Feb 11, 2015 at 09:05:39AM -0800, Jay Kreps wrote:

> Guozhang, I agree with 1-3, I do think what I was proposing was simpler
> but perhaps there are gaps in that?
>
> Hey Joel--Here was a sketch of what I was proposing. I do think this gets
> rid of manual offset tracking, especially doing so across threads with
> dedicated commit threads, which I think is pretty complex.
>
> while(true) {
>     val recs = consumer.poll(Long.MaxValue);
>     for (rec <- recs)
>         producer.send(rec, logErrorCallback)
>     if(System.currentTimeMillis - lastCommit > commitInterval) {
>         producer.flush()
>         consumer.commit()
>         lastCommit = System.currentTimeMillis
>     }
> }
>
> (See the previous email for details). I think the question is: is there
> any reason--performance, correctness, etc--that this won't work? Basically
> I think you guys have thought about this more so I may be missing
> something. If so let's flag it while we still have leeway on the consumer.
>
> If we think that will work, well I do think it is conceptually a lot
> simpler than the current code, though I suppose one could disagree on
> that.
>
> -Jay

On Wed, Feb 11, 2015 at 5:53 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Hi Jay,
>
>> The data channels are actually a big part of the complexity of the zero
>> data loss design, though, right?
>> Because then you need some reverse channel to flow the acks back to the
>> consumer based on where you are, versus just acking what you have read
>> and written (as in the code snippet I put up).
>
> I'm not sure if we are on the same page. Even if the data channel was not
> there the current handling for zero data loss would remain very similar -
> you would need to maintain lists of unacked source offsets. I'm wondering
> if the KIP needs more detail on how it is currently implemented; or are
> you suggesting a different approach (in which case I have not fully
> understood). I'm not sure what you mean by flowing acks back to the
> consumer - the MM commits offsets after the producer ack has been
> received. There is some additional complexity introduced in reducing
> duplicates on a rebalance - this is actually optional (since duplicates
> are currently a given). The reason that was done anyway is that with
> auto-commit turned off duplicates are almost guaranteed on a rebalance.
>
>> I think the point that Neha and I were trying to make was that the
>> motivation to embed stuff into MM kind of is related to how complex a
>> simple "consume and produce" with good throughput will be. If it is
>> simple to write such a thing in a few lines, the pain of embedding a
>> bunch of stuff won't be worth it; if it has to be as complex as the
>> current mm then of course we will need all kinds of plug-ins because no
>> one will be able to write such a thing. I don't have a huge concern with
>> a simple plug-in but I think if it turns into something more complex
>> with filtering and aggregation or whatever we really need to stop and
>> think a bit about the design.
>
> I agree - I don't think there is a use case for any complex plug-in.
> It is pretty much what Becket has described currently for the message
> handler - i.e., take an incoming record and return a list of outgoing
> records (which could be empty if you filter).
>
> So here is my take on the MM:
> - Bare bones: simple consumer - producer pairs (07 style). This is ideal,
>   but does not handle no data loss.
> - Above plus support for no data loss. This actually adds quite a bit of
>   complexity.
> - Above plus the message handler. This is a trivial addition I think that
>   makes the MM usable in a few other mirroring-like applications.
>
> Joel

On Tue, Feb 10, 2015 at 12:31 PM, Joel Koshy <jjkosh...@gmail.com> wrote:

> On Tue, Feb 10, 2015 at 12:13:46PM -0800, Neha Narkhede wrote:
>> I think all of us agree that we want to design MirrorMaker for 0 data
>> loss. With the absence of the data channel, 0 data loss will be much
>> simpler to implement.
>
> The data channel is irrelevant to the implementation of zero data loss.
> The complexity in the implementation of no data loss that you are seeing
> in mirror-maker affects all consume-then-produce patterns whether or not
> there is a data channel. You still need to maintain a list of unacked
> offsets. What I meant earlier is that we can brainstorm completely
> different approaches to supporting no data loss, but the current
> implementation is the only solution we are aware of.
>
>>> My arguments for adding a message handler are that:
>>> 1. It is more efficient to do something in common for all the clients
>>> in the pipeline than letting each client do the same thing many times.
>>> And there are concrete use cases for the message handler already.
>>
>> What are the concrete use cases?
> I think Becket already described a couple of use cases earlier in the
> thread.
>
> <quote>
>
> 1. Format conversion. We have a use case where clients of the source
> cluster use an internal schema and clients of the target cluster use a
> different public schema.
> 2. Message filtering: For the messages published to the source cluster,
> there are some messages private to source cluster clients that should not
> be exposed to target cluster clients. It would be difficult to publish
> those messages into different partitions because they need to be ordered.
> I agree that we can always filter/convert messages after they are copied
> to the target cluster, but that costs network bandwidth unnecessarily,
> especially if that is a cross colo mirror. With the handler, we can
> co-locate the mirror maker with the source cluster and save that cost.
> Also, imagine there are many downstream consumers consuming from the
> target cluster; filtering/reformatting the messages before they reach the
> target cluster is much more efficient than having each of the consumers
> do this individually on their own.
>
> </quote>
>
>> Also the KIP still refers to the data channel in a few places
>> (Motivation and "On consumer rebalance" sections). Can you update the
>> wiki so it is easier to review the new design, especially the data loss
>> part.
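To make the handler contract being discussed concrete, here is a minimal sketch of a plug-in covering the filtering and topic-override use cases quoted above. The trait name, its exact signature, and the use of the new-producer ProducerRecord are assumptions made for the sketch; they are not the interface defined in the KIP.

    import java.util.{Collections, List => JList}
    import org.apache.kafka.clients.consumer.ConsumerRecord
    import org.apache.kafka.clients.producer.ProducerRecord

    // Hypothetical handler contract: one inbound record in, zero or more outbound records out.
    trait MirrorMakerMessageHandler {
      def handle(record: ConsumerRecord[Array[Byte], Array[Byte]])
          : JList[ProducerRecord[Array[Byte], Array[Byte]]]
    }

    // Example covering the use cases above: drop records from private topics and
    // rewrite the topic name (srcTopic -> destTopic) for everything else.
    class FilterAndRenameHandler(privateTopics: Set[String], topicOverrides: Map[String, String])
        extends MirrorMakerMessageHandler {

      override def handle(record: ConsumerRecord[Array[Byte], Array[Byte]])
          : JList[ProducerRecord[Array[Byte], Array[Byte]]] = {
        if (privateTopics.contains(record.topic)) {
          // Filtered: nothing is mirrored for this record.
          Collections.emptyList[ProducerRecord[Array[Byte], Array[Byte]]]()
        } else {
          val destTopic = topicOverrides.getOrElse(record.topic, record.topic)
          Collections.singletonList(
            new ProducerRecord[Array[Byte], Array[Byte]](destTopic, record.key, record.value))
        }
      }
    }

Returning a list rather than a single record is what lets a handler drop a record entirely (empty list) or fan it out to more than one destination topic.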
On Tue, Feb 10, 2015 at 10:36 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> I think the message handler adds little to no complexity to the mirror
> maker. Jay/Neha, the MM became scary due to the rearchitecture we did for
> 0.8 due to performance issues compared with 0.7 - we should remove the
> data channel if it can match the current throughput. I agree it is worth
> prototyping and testing that so the MM architecture is simplified.
>
> The MM became a little scarier in KAFKA-1650 in order to support no data
> loss. I think the implementation for no data loss will remain about the
> same even in the new model (even without the data channel) - we can
> probably brainstorm more if there is a better/simpler way to do it (maybe
> there is in the absence of the data channel) but at the time it was the
> best we (i.e., Becket, myself, Jun and Guozhang, who participated on the
> review) could come up with.
>
> So I'm definitely +1 on whatever it takes to support no data loss. I
> think most people would want that out of the box.
>
> As for the message handler, as Becket wrote and I agree with, it is
> really a trivial addition that would benefit (perhaps not most, but at
> least some). So I'm personally +1 on that as well. That said, I'm also
> okay with it not being there. I think the MM is fairly stand-alone and
> simple enough that it is entirely reasonable and absolutely feasible for
> companies to fork/re-implement the mirror maker for their own needs.
>
> So in summary, I'm +1 on the KIP.
>
> Thanks,
>
> Joel

On Mon, Feb 09, 2015 at 09:19:57PM +0000, Jiangjie Qin wrote:

> I just updated the KIP page and incorporated Jay and Neha's suggestion.
> As a brief summary of where we are:
>
> Consensus reached:
> Have N independent mirror maker threads, each of which has its own
> consumers but shares a producer. The mirror maker threads will be
> responsible for decompression, compression and offset commit. No data
> channel or separate offset commit thread is needed. The consumer rebalance
> callback will be used to avoid duplicates on rebalance.
>
> Still under discussion:
> Whether a message handler is needed.
>
> My arguments for adding a message handler are that:
> 1. It is more efficient to do something in common for all the clients in
> the pipeline than letting each client do the same thing many times. And
> there are concrete use cases for the message handler already.
> 2. It is not a big complicated add-on to mirror maker.
> 3. Without a message handler, customers who need this have to re-implement
> all the logic of mirror maker by themselves just in order to add this
> handling in the pipeline.
>
> Any thoughts?
>
> Thanks.
>
> -Jiangjie (Becket) Qin

On 2/8/15, 6:35 PM, "Jiangjie Qin" <j...@linkedin.com> wrote:

> Hi Jay, thanks a lot for the comments.
> I think this solution is better. We probably don't need the data channel
> anymore. It can be replaced with a list of producers if we need more
> sender threads.
> I'll update the KIP page.
>
> The reasoning about the message handler is mainly for efficiency purposes.
> I'm thinking that if something can be done in the pipeline for all the
> clients, such as filtering/reformatting, it is probably better to do it
> in the pipeline than asking 100 clients to do the same thing 100 times.
>
> -Jiangjie (Becket) Qin

On 2/8/15, 4:59 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

> Yeah, I second Neha's comments. The current mm code has taken something
> pretty simple and made it pretty scary with callbacks and wait/notify
> stuff. Do we believe this works? I can't tell by looking at it, which is
> kind of bad for something important like this. I don't mean this as
> criticism, I know the history: we added in-memory queues to help with
> other performance problems without thinking about correctness, then we
> added stuff to work around the in-memory queues so they don't lose data,
> and so on.
>
> Can we instead do the opposite exercise and start with the basics of what
> mm should do and think about what deficiencies prevent this approach from
> working? Then let's make sure the currently in-flight work will remove
> these deficiencies. After all mm is kind of the prototypical kafka use
> case, so if we can't make our clients do this probably no one else can.
> I think mm should just be N independent threads each of which has its own
> consumer but shares a producer and each of which looks like this:
>
> while(true) {
>     val recs = consumer.poll(Long.MaxValue);
>     for (rec <- recs)
>         producer.send(rec, logErrorCallback)
>     if(System.currentTimeMillis - lastCommit > commitInterval) {
>         producer.flush()
>         consumer.commit()
>         lastCommit = System.currentTimeMillis
>     }
> }
>
> This will depend on setting the retry count in the producer to something
> high with a largish backoff so that a failed send attempt doesn't drop
> data.
>
> We will need to use the callback to force a flush and offset commit on
> rebalance.
>
> This approach may have a few more TCP connections due to using multiple
> consumers but I think it is a lot easier to reason about and the total
> number of mm instances is always going to be small.
>
> Let's talk about where this simple approach falls short, I think that
> will help us understand your motivations for additional elements.
>
> Another advantage of this is that it is so simple I don't think we really
> even need to bother making mm extensible, because writing your own code
> that does custom processing or transformation is just ten lines and no
> plug-in system is going to make it simpler.
>
> -Jay
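For reference, here is a fuller sketch of the loop above combined with the rebalance hook discussed earlier in the thread, written against the new consumer and producer clients roughly as they later shipped. The method names (subscribe with a ConsumerRebalanceListener, commitSync, flush), the broker addresses, topic name, and all settings shown are assumptions for the sketch rather than code from the KIP; the producer settings follow the "retry count high, largish backoff" guidance above, and the bounded poll timeout reflects Joel's point about low-volume pipelines.

    import java.util.{Arrays, Properties}
    import scala.collection.JavaConverters._
    import org.apache.kafka.clients.consumer.{ConsumerRebalanceListener, KafkaConsumer}
    import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord, RecordMetadata}
    import org.apache.kafka.common.TopicPartition

    object SimpleMirrorThread {
      def main(args: Array[String]): Unit = {
        val producerProps = new Properties()
        producerProps.put("bootstrap.servers", "target-broker:9092")  // placeholder address
        producerProps.put("acks", "all")                              // no-data-loss style default
        producerProps.put("retries", Int.MaxValue.toString)           // "retry count ... something high"
        producerProps.put("retry.backoff.ms", "1000")                 // "with a largish backoff"
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
        val producer = new KafkaProducer[Array[Byte], Array[Byte]](producerProps)

        val consumerProps = new Properties()
        consumerProps.put("bootstrap.servers", "source-broker:9092")  // placeholder address
        consumerProps.put("group.id", "mirror-maker-group")
        consumerProps.put("enable.auto.commit", "false")              // commit only after producer.flush()
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer")
        val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](consumerProps)

        // Equivalent of logErrorCallback in the snippet above.
        val logErrorCallback = new Callback {
          override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
            if (exception != null) System.err.println("send failed: " + exception)
        }

        // Rebalance hook: flush in-flight sends and commit before partitions move away,
        // which is what reduces duplicates on rebalance.
        val rebalanceListener = new ConsumerRebalanceListener {
          override def onPartitionsRevoked(partitions: java.util.Collection[TopicPartition]): Unit = {
            producer.flush()
            consumer.commitSync()
          }
          override def onPartitionsAssigned(partitions: java.util.Collection[TopicPartition]): Unit = ()
        }
        consumer.subscribe(Arrays.asList("mirrored-topic"), rebalanceListener)  // placeholder topic

        val commitInterval = 60 * 1000L
        var lastCommit = System.currentTimeMillis
        while (true) {
          // Bounded poll so low-volume pipelines still hit the commit interval.
          val recs = consumer.poll(commitInterval)
          for (rec <- recs.asScala)
            producer.send(new ProducerRecord(rec.topic, rec.key, rec.value), logErrorCallback)
          if (System.currentTimeMillis - lastCommit > commitInterval) {
            producer.flush()
            consumer.commitSync()
            lastCommit = System.currentTimeMillis
          }
        }
      }
    }

The key property is that offsets are only committed after producer.flush() succeeds, both on the commit interval and in onPartitionsRevoked, which is what bounds data loss and reduces duplicates on rebalance.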
On Sun, Feb 8, 2015 at 2:40 PM, Neha Narkhede <n...@confluent.io> wrote:

> Few comments -
>
> 1. Why do we need the message handler? Do you have concrete use cases in
> mind? If not, we should consider adding it in the future when/if we do
> have use cases for it. The purpose of the mirror maker is a simple tool
> for setting up Kafka cluster replicas. I don't see why we need to include
> a message handler for doing stream transformations or filtering. You can
> always write a simple process for doing that once the data is copied
> as-is in the target cluster.
> 2. Why keep both designs? We should prefer the simpler design unless it
> is not feasible due to the performance issue that we previously had. Did
> you get a chance to run some tests to see if that is really still a
> problem or not? It will be easier to think about the design and also make
> the KIP complete if we make a call on the design first.
> 3. Can you explain the need for keeping a list of unacked offsets per
> partition? Consider adding a section on retries and how you plan to
> handle the case when the producer runs out of all retries.
> >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> Thanks, > >> >> >> >> >> > > > > > > > > >>> Neha > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> On Sun, Feb 8, 2015 at 2:06 PM, Jiangjie > >> >>Qin > >> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid> > >> >> >> >> >> > > > > > > > > >>> wrote: > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > Hi Neha, > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > Yes, I’ve updated the KIP so the > >>entire > >> >>KIP > >> >> >>is > >> >> >> >> >>based > >> >> >> >> >> > on new > >> >> >> >> >> > > > > > > > consumer > >> >> >> >> >> > > > > > > > > >>>now. > >> >> >> >> >> > > > > > > > > >>> > I’ve put both designs with and without > >> >>data > >> >> >> >> >>channel > >> >> >> >> >> in > >> >> >> >> >> > the > >> >> >> >> >> > > > KIP > >> >> >> >> >> > > > > > as I > >> >> >> >> >> > > > > > > > > >>>still > >> >> >> >> >> > > > > > > > > >>> > feel we might need the data channel to > >> >> >>provide > >> >> >> >> >>more > >> >> >> >> >> > > > > > flexibility, > >> >> >> >> >> > > > > > > > > >>> > especially after message handler is > >> >> >>introduced. > >> >> >> >> >>I’ve > >> >> >> >> >> > put my > >> >> >> >> >> > > > > > > > thinking > >> >> >> >> >> > > > > > > > > >>>of > >> >> >> >> >> > > > > > > > > >>> > the pros and cons of the two designs > >>in > >> >>the > >> >> >> >>KIP as > >> >> >> >> >> > well. > >> >> >> >> >> > > > It’ll > >> >> >> >> >> > > > > > be > >> >> >> >> >> > > > > > > > > >>>great > >> >> >> >> >> > > > > > > > > >>> if > >> >> >> >> >> > > > > > > > > >>> > you can give a review and comment. > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > Thanks. > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > Jiangjie (Becket) Qin > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > On 2/6/15, 7:30 PM, "Neha Narkhede" < > >> >> >> >> >> n...@confluent.io > >> >> >> >> >> > > > >> >> >> >> >> > > > wrote: > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >Hey Becket, > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >What are the next steps on this KIP. > >>As > >> >>per > >> >> >> >>your > >> >> >> >> >> > comment > >> >> >> >> >> > > > > > earlier > >> >> >> >> >> > > > > > > > on > >> >> >> >> >> > > > > > > > > >>>the > >> >> >> >> >> > > > > > > > > >>> > >thread - > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >I do agree it makes more sense > >> >> >> >> >> > > > > > > > > >>> > >> to avoid duplicate effort and plan > >> >>based > >> >> >>on > >> >> >> >>new > >> >> >> >> >> > > > consumer. > >> >> >> >> >> > > > > > I’ll > >> >> >> >> >> > > > > > > > > >>>modify > >> >> >> >> >> > > > > > > > > >>> > >>the > >> >> >> >> >> > > > > > > > > >>> > >> KIP. > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >Did you get a chance to think about > >>the > >> >> >> >> >>simplified > >> >> >> >> >> > design > >> >> >> >> >> > > > > > that we > >> >> >> >> >> > > > > > > > > >>> proposed > >> >> >> >> >> > > > > > > > > >>> > >earlier? Do you plan to update the > >>KIP > >> >>with > >> >> >> >>that > >> >> >> >> >> > proposal? 
> >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >Thanks, > >> >> >> >> >> > > > > > > > > >>> > >Neha > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >On Wed, Feb 4, 2015 at 12:12 PM, > >> >>Jiangjie > >> >> >>Qin > >> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >wrote: > >> >> >> >> >> > > > > > > > > >>> > > > >> >> >> >> >> > > > > > > > > >>> > >> In mirror maker we do not do > >> >> >> >>de-serialization > >> >> >> >> >>on > >> >> >> >> >> the > >> >> >> >> >> > > > > > messages. > >> >> >> >> >> > > > > > > > > >>>Mirror > >> >> >> >> >> > > > > > > > > >>> > >> maker use source TopicPartition > >>hash > >> >>to > >> >> >> >>chose a > >> >> >> >> >> > > > producer to > >> >> >> >> >> > > > > > send > >> >> >> >> >> > > > > > > > > >>> > >>messages > >> >> >> >> >> > > > > > > > > >>> > >> from the same source partition. The > >> >> >> >>partition > >> >> >> >> >> those > >> >> >> >> >> > > > > > messages end > >> >> >> >> >> > > > > > > > > >>>up > >> >> >> >> >> > > > > > > > > >>> with > >> >> >> >> >> > > > > > > > > >>> > >> are decided by Partitioner class in > >> >> >> >> >>KafkaProducer > >> >> >> >> >> > > > (assuming > >> >> >> >> >> > > > > > you > >> >> >> >> >> > > > > > > > > >>>are > >> >> >> >> >> > > > > > > > > >>> > >>using > >> >> >> >> >> > > > > > > > > >>> > >> the new producer), which uses hash > >> >>code > >> >> >>of > >> >> >> >> >> bytes[]. > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> If deserialization is needed, it > >>has > >> >>to > >> >> >>be > >> >> >> >> >>done in > >> >> >> >> >> > > > message > >> >> >> >> >> > > > > > > > > >>>handler. > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> Thanks. > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> Jiangjie (Becket) Qin > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> On 2/4/15, 11:33 AM, "Bhavesh > >>Mistry" > >> >>< > >> >> >> >> >> > > > > > > > mistry.p.bhav...@gmail.com> > >> >> >> >> >> > > > > > > > > >>> > >>wrote: > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >Hi Jiangjie, > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> >Thanks for entertaining my > >>question > >> >>so > >> >> >>far. > >> >> >> >> >>Last > >> >> >> >> >> > > > > > question, I > >> >> >> >> >> > > > > > > > > >>>have is > >> >> >> >> >> > > > > > > > > >>> > >> >about > >> >> >> >> >> > > > > > > > > >>> > >> >serialization of message key. If > >>the > >> >> >>key > >> >> >> >> >> > > > de-serialization > >> >> >> >> >> > > > > > > > > >>>(Class) is > >> >> >> >> >> > > > > > > > > >>> > >>not > >> >> >> >> >> > > > > > > > > >>> > >> >present at the MM instance, then > >> >>does it > >> >> >> >>use > >> >> >> >> >>raw > >> >> >> >> >> > byte > >> >> >> >> >> > > > > > hashcode > >> >> >> >> >> > > > > > > > to > >> >> >> >> >> > > > > > > > > >>> > >> >determine > >> >> >> >> >> > > > > > > > > >>> > >> >the partition ? 
How are you > >>going to > >> >> >> >>address > >> >> >> >> >>the > >> >> >> >> >> > > > situation > >> >> >> >> >> > > > > > > > where > >> >> >> >> >> > > > > > > > > >>>key > >> >> >> >> >> > > > > > > > > >>> > >> >needs > >> >> >> >> >> > > > > > > > > >>> > >> >to be de-serialization and get > >>actual > >> >> >> >>hashcode > >> >> >> >> >> > needs > >> >> >> >> >> > > > to be > >> >> >> >> >> > > > > > > > > >>>computed > >> >> >> >> >> > > > > > > > > >>> ?. > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> >Thanks, > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> >Bhavesh > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> >On Fri, Jan 30, 2015 at 1:41 PM, > >> >> >>Jiangjie > >> >> >> >>Qin > >> >> >> >> >> > > > > > > > > >>> > >><j...@linkedin.com.invalid> > >> >> >> >> >> > > > > > > > > >>> > >> >wrote: > >> >> >> >> >> > > > > > > > > >>> > >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> Hi Bhavesh, > >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> Please see inline comments. > >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> Jiangjie (Becket) Qin > >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> On 1/29/15, 7:00 PM, "Bhavesh > >> >>Mistry" > >> >> >> >> >> > > > > > > > > >>><mistry.p.bhav...@gmail.com> > >> >> >> >> >> > > > > > > > > >>> > >> >>wrote: > >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >Hi Jiangjie, > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks for the input. > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >a) Is MM will producer ack > >>will > >> >>be > >> >> >> >>attach > >> >> >> >> >>to > >> >> >> >> >> > > > Producer > >> >> >> >> >> > > > > > > > > >>>Instance or > >> >> >> >> >> > > > > > > > > >>> > >>per > >> >> >> >> >> > > > > > > > > >>> > >> >> >topic. Use case is that one > >> >>instance > >> >> >> >>of MM > >> >> >> >> >> > > > > > > > > >>> > >> >> >needs to handle both strong ack > >> >>and > >> >> >>also > >> >> >> >> >>ack=0 > >> >> >> >> >> > for > >> >> >> >> >> > > > some > >> >> >> >> >> > > > > > > > topic. > >> >> >> >> >> > > > > > > > > >>> Or > >> >> >> >> >> > > > > > > > > >>> > >>it > >> >> >> >> >> > > > > > > > > >>> > >> >> >would > >> >> >> >> >> > > > > > > > > >>> > >> >> >be better to set-up another > >> >>instance > >> >> >>of > >> >> >> >>MM. > >> >> >> >> >> > > > > > > > > >>> > >> >> The acks setting is producer > >>level > >> >> >> >>setting > >> >> >> >> >> > instead of > >> >> >> >> >> > > > > > topic > >> >> >> >> >> > > > > > > > > >>>level > >> >> >> >> >> > > > > > > > > >>> > >> >>setting. > >> >> >> >> >> > > > > > > > > >>> > >> >> In this case you probably need > >>to > >> >>set > >> >> >>up > >> >> >> >> >> another > >> >> >> >> >> > > > > > instance. > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >b) Regarding TCP connections, > >>Why > >> >> >>does > >> >> >> >> >> #producer > >> >> >> >> >> > > > > > instance > >> >> >> >> >> > > > > > > > > >>>attach > >> >> >> >> >> > > > > > > > > >>> to > >> >> >> >> >> > > > > > > > > >>> > >>TCP > >> >> >> >> >> > > > > > > > > >>> > >> >> >connection. 
Is it possible to > >>use > >> >> >> >>Broker > >> >> >> >> >> > > > Connection TCP > >> >> >> >> >> > > > > > > > Pool, > >> >> >> >> >> > > > > > > > > >>> > >>producer > >> >> >> >> >> > > > > > > > > >>> > >> >> >will just checkout TCP > >>connection > >> >> to > >> >> >> >> >>Broker. > >> >> >> >> >> > So, > >> >> >> >> >> > > > # of > >> >> >> >> >> > > > > > > > > >>>Producer > >> >> >> >> >> > > > > > > > > >>> > >> >>Instance > >> >> >> >> >> > > > > > > > > >>> > >> >> >does not correlation to Brokers > >> >> >> >>Connection. > >> >> >> >> >> Is > >> >> >> >> >> > this > >> >> >> >> >> > > > > > > > possible > >> >> >> >> >> > > > > > > > > >>>? > >> >> >> >> >> > > > > > > > > >>> > >> >> In new producer, each producer > >> >> >>maintains > >> >> >> >>a > >> >> >> >> >> > > > connection to > >> >> >> >> >> > > > > > each > >> >> >> >> >> > > > > > > > > >>> broker > >> >> >> >> >> > > > > > > > > >>> > >> >> within the producer instance. > >> >>Making > >> >> >> >> >>producer > >> >> >> >> >> > > > instances > >> >> >> >> >> > > > > > to > >> >> >> >> >> > > > > > > > > >>>share > >> >> >> >> >> > > > > > > > > >>> the > >> >> >> >> >> > > > > > > > > >>> > >>TCP > >> >> >> >> >> > > > > > > > > >>> > >> >> connections is a very big > >>change to > >> >> >>the > >> >> >> >> >>current > >> >> >> >> >> > > > design, > >> >> >> >> >> > > > > > so I > >> >> >> >> >> > > > > > > > > >>> suppose > >> >> >> >> >> > > > > > > > > >>> > >>we > >> >> >> >> >> > > > > > > > > >>> > >> >> won’t be able to do that. > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks, > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >Bhavesh > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >On Thu, Jan 29, 2015 at 11:50 > >>AM, > >> >> >> >>Jiangjie > >> >> >> >> >>Qin > >> >> >> >> >> > > > > > > > > >>> > >> >><j...@linkedin.com.invalid > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >wrote: > >> >> >> >> >> > > > > > > > > >>> > >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Hi Bhavesh, > >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> I think it is the right > >> >>discussion > >> >> >>to > >> >> >> >> >>have > >> >> >> >> >> > when > >> >> >> >> >> > > > we are > >> >> >> >> >> > > > > > > > > >>>talking > >> >> >> >> >> > > > > > > > > >>> > >>about > >> >> >> >> >> > > > > > > > > >>> > >> >>the > >> >> >> >> >> > > > > > > > > >>> > >> >> >> new new design for MM. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Please see the inline > >>comments. 
> >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Jiangjie (Becket) Qin > >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> On 1/28/15, 10:48 PM, > >>"Bhavesh > >> >> >>Mistry" > >> >> >> >> >> > > > > > > > > >>> > >><mistry.p.bhav...@gmail.com> > >> >> >> >> >> > > > > > > > > >>> > >> >> >>wrote: > >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >Hi Jiangjie, > >> >> >> >> >> > > > > > > > > >>> > >> >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >I just wanted to let you > >>know > >> >> >>about > >> >> >> >>our > >> >> >> >> >>use > >> >> >> >> >> > case > >> >> >> >> >> > > > and > >> >> >> >> >> > > > > > > > stress > >> >> >> >> >> > > > > > > > > >>>the > >> >> >> >> >> > > > > > > > > >>> > >> >>point > >> >> >> >> >> > > > > > > > > >>> > >> >> >>that > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >local data center broker > >> >>cluster > >> >> >>have > >> >> >> >> >>fewer > >> >> >> >> >> > > > > > partitions > >> >> >> >> >> > > > > > > > than > >> >> >> >> >> > > > > > > > > >>>the > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >destination > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >offline broker cluster. Just > >> >> >>because > >> >> >> >>we > >> >> >> >> >>do > >> >> >> >> >> > the > >> >> >> >> >> > > > batch > >> >> >> >> >> > > > > > pull > >> >> >> >> >> > > > > > > > > >>>from > >> >> >> >> >> > > > > > > > > >>> > >>CAMUS > >> >> >> >> >> > > > > > > > > >>> > >> >> >>and > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >in > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >order to drain data faster > >>than > >> >> >>the > >> >> >> >> >> injection > >> >> >> >> >> > > > rate > >> >> >> >> >> > > > > > (from > >> >> >> >> >> > > > > > > > > >>>four > >> >> >> >> >> > > > > > > > > >>> DCs > >> >> >> >> >> > > > > > > > > >>> > >> >>for > >> >> >> >> >> > > > > > > > > >>> > >> >> >>same > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >topic). > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Keeping the same partition > >> >>number > >> >> >>in > >> >> >> >> >>source > >> >> >> >> >> > and > >> >> >> >> >> > > > target > >> >> >> >> >> > > > > > > > > >>>cluster > >> >> >> >> >> > > > > > > > > >>> > >>will > >> >> >> >> >> > > > > > > > > >>> > >> >>be > >> >> >> >> >> > > > > > > > > >>> > >> >> >>an > >> >> >> >> >> > > > > > > > > >>> > >> >> >> option but will not be > >>enforced > >> >>by > >> >> >> >> >>default. 
> >> >> >> >> >> > > > > > > > > >>> > >> >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >We are facing following > >>issues > >> >> >> >>(probably > >> >> >> >> >> due > >> >> >> >> >> > to > >> >> >> >> >> > > > > > > > > >>>configuration): > >> >> >> >> >> > > > > > > > > >>> > >> >> >> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >1) We occasionally > >>loose > >> >>data > >> >> >> >>due > >> >> >> >> >>to > >> >> >> >> >> > message > >> >> >> >> >> > > > > > batch > >> >> >> >> >> > > > > > > > > >>>size is > >> >> >> >> >> > > > > > > > > >>> > >>too > >> >> >> >> >> > > > > > > > > >>> > >> >> >>large > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >(2MB) on target data (we are > >> >>using > >> >> >> >>old > >> >> >> >> >> > producer > >> >> >> >> >> > > > but I > >> >> >> >> >> > > > > > > > think > >> >> >> >> >> > > > > > > > > >>>new > >> >> >> >> >> > > > > > > > > >>> > >> >> >>producer > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >will solve this problem to > >>some > >> >> >> >>extend). > >> >> >> >> >> > > > > > > > > >>> > >> >> >> We do see this issue in > >> >>LinkedIn as > >> >> >> >>well. > >> >> >> >> >> New > >> >> >> >> >> > > > producer > >> >> >> >> >> > > > > > > > also > >> >> >> >> >> > > > > > > > > >>> might > >> >> >> >> >> > > > > > > > > >>> > >> >>have > >> >> >> >> >> > > > > > > > > >>> > >> >> >> this issue. There are some > >> >> >>proposal of > >> >> >> >> >> > solutions, > >> >> >> >> >> > > > but > >> >> >> >> >> > > > > > no > >> >> >> >> >> > > > > > > > > >>>real > >> >> >> >> >> > > > > > > > > >>> work > >> >> >> >> >> > > > > > > > > >>> > >> >> >>started > >> >> >> >> >> > > > > > > > > >>> > >> >> >> yet. For now, as a > >>workaround, > >> >> >> >>setting a > >> >> >> >> >> more > >> >> >> >> >> > > > > > aggressive > >> >> >> >> >> > > > > > > > > >>>batch > >> >> >> >> >> > > > > > > > > >>> > >>size > >> >> >> >> >> > > > > > > > > >>> > >> >>on > >> >> >> >> >> > > > > > > > > >>> > >> >> >> producer side should work. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >2) Since only one > >> >>instance is > >> >> >> >>set > >> >> >> >> >>to > >> >> >> >> >> MM > >> >> >> >> >> > > > data, > >> >> >> >> >> > > > > > we > >> >> >> >> >> > > > > > > > are > >> >> >> >> >> > > > > > > > > >>>not > >> >> >> >> >> > > > > > > > > >>> > >>able > >> >> >> >> >> > > > > > > > > >>> > >> >>to > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >set-up ack per topic instead > >> >>ack > >> >> >>is > >> >> >> >> >> attached > >> >> >> >> >> > to > >> >> >> >> >> > > > > > producer > >> >> >> >> >> > > > > > > > > >>> > >>instance. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> I don’t quite get the > >>question > >> >> >>here. 
> >> >> >> >> >> > > > > > > > > >>> > >> >> >> >3) How are you going to > >> >> >>address > >> >> >> >>two > >> >> >> >> >> > phase > >> >> >> >> >> > > > commit > >> >> >> >> >> > > > > > > > > >>>problem > >> >> >> >> >> > > > > > > > > >>> if > >> >> >> >> >> > > > > > > > > >>> > >> >>ack is > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >set > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >to strongest, but auto > >>commit > >> >>is > >> >> >>on > >> >> >> >>for > >> >> >> >> >> > consumer > >> >> >> >> >> > > > > > (meaning > >> >> >> >> >> > > > > > > > > >>> > >>producer > >> >> >> >> >> > > > > > > > > >>> > >> >>does > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >not > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >get ack, but consumer auto > >> >> >>committed > >> >> >> >> >> offset > >> >> >> >> >> > that > >> >> >> >> >> > > > > > > > message). > >> >> >> >> >> > > > > > > > > >>> Is > >> >> >> >> >> > > > > > > > > >>> > >> >>there > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >transactional (Kafka > >> >>transaction > >> >> >>is > >> >> >> >>in > >> >> >> >> >> > process) > >> >> >> >> >> > > > > > based ack > >> >> >> >> >> > > > > > > > > >>>and > >> >> >> >> >> > > > > > > > > >>> > >>commit > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >offset > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >? > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Auto offset commit should be > >> >>turned > >> >> >> >>off > >> >> >> >> >>in > >> >> >> >> >> > this > >> >> >> >> >> > > > case. > >> >> >> >> >> > > > > > The > >> >> >> >> >> > > > > > > > > >>>offset > >> >> >> >> >> > > > > > > > > >>> > >>will > >> >> >> >> >> > > > > > > > > >>> > >> >> >>only > >> >> >> >> >> > > > > > > > > >>> > >> >> >> be committed once by the > >>offset > >> >> >>commit > >> >> >> >> >> > thread. So > >> >> >> >> >> > > > > > there is > >> >> >> >> >> > > > > > > > > >>>no > >> >> >> >> >> > > > > > > > > >>> two > >> >> >> >> >> > > > > > > > > >>> > >> >>phase > >> >> >> >> >> > > > > > > > > >>> > >> >> >> commit. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >4) How are you > >>planning to > >> >> >>avoid > >> >> >> >> >> > duplicated > >> >> >> >> >> > > > > > message? > >> >> >> >> >> > > > > > > > > >>>( Is > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >brokergoing > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >have moving window of > >>message > >> >> >> >>collected > >> >> >> >> >>and > >> >> >> >> >> > > > de-dupe > >> >> >> >> >> > > > > > ?) > >> >> >> >> >> > > > > > > > > >>> > >>Possibly, we > >> >> >> >> >> > > > > > > > > >>> > >> >> >>get > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >this from retry set to 5…? > >> >> >> >> >> > > > > > > > > >>> > >> >> >> We are not trying to > >>completely > >> >> >>avoid > >> >> >> >> >> > duplicates. > >> >> >> >> >> > > > The > >> >> >> >> >> > > > > > > > > >>>duplicates > >> >> >> >> >> > > > > > > > > >>> > >>will > >> >> >> >> >> > > > > > > > > >>> > >> >> >> still be there if: > >> >> >> >> >> > > > > > > > > >>> > >> >> >> 1. Producer retries on > >>failure. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> 2. Mirror maker is hard > >>killed. > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Currently, dedup is expected > >>to > >> >>be > >> >> >> >>done > >> >> >> >> >>by > >> >> >> >> >> > user if > >> >> >> >> >> > > > > > > > > >>>necessary. 
> 4) How are you planning to avoid duplicated messages? (Is the broker
> going to have a moving window of messages collected and de-dupe?)
> Possibly, we get this from retry set to 5…?

We are not trying to completely avoid duplicates. The duplicates will
still be there if:
1. The producer retries on failure.
2. The mirror maker is hard killed.
Currently, dedup is expected to be done by the user if necessary.

> 5) Last, is there any warning or anything you can provide insight on
> from the MM component about the data injection rate into destination
> partitions NOT being evenly distributed, regardless of keyed or
> non-keyed messages? (Hence there is a ripple effect such as data
> arriving late or out of order in terms of time stamp, and early some
> times, and CAMUS creates a huge number of files on HDFS due to the
> uneven injection rate. The Camus job is configured to run every 3
> minutes.)

I think uneven data distribution is typically caused by server-side
imbalance, instead of something the mirror maker could control. In the new
mirror maker, however, there is a customizable message handler that might
be able to help a little bit. In the message handler, you can explicitly
set a partition that you want to produce the message to. So if you know
the uneven data distribution in the target cluster, you may offset it
here. But that probably only works for non-keyed messages.
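As a rough illustration of that idea (the handler interface is still being
settled in this KIP, so the class and method names below are made up for
the example), a handler could pick an explicit target partition for
non-keyed messages, e.g. simple round-robin across the target topic:

    // Illustrative only: choose an explicit target partition for non-keyed
    // messages instead of relying on the producer's default partitioner.
    class RoundRobinPartitionAssigner(numTargetPartitions: Int) {
      private val next = new java.util.concurrent.atomic.AtomicLong(0)

      def targetPartition(key: Array[Byte]): Option[Int] =
        if (key != null) None // keyed messages keep their key-based placement
        else Some((next.getAndIncrement() % numTargetPartitions).toInt)
    }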
> I am not sure if this is the right discussion forum to bring these to
> your/Kafka dev team's attention. This might be off track.
>
> Thanks,
>
> Bhavesh

On Wed, Jan 28, 2015 at 11:07 AM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

I've updated the KIP page. Feedbacks are welcome.

Regarding the simple mirror maker design, I thought it over and have some
worries. There are two things that might be worth thinking about:

1. One of the enhancements to the mirror maker is adding a message handler
to do things like reformatting. I think we might potentially want to have
more threads processing the messages than the number of consumers.
If we follow the simple mirror maker solution, we lose this flexibility.

2. This might not matter too much, but creating more consumers means a
bigger footprint in terms of TCP connections / memory.

Any thoughts on this?

Thanks.

Jiangjie (Becket) Qin

On 1/26/15, 10:35 AM, "Jiangjie Qin" <j...@linkedin.com> wrote:

Hi Jay and Neha,

Thanks a lot for the reply and explanation. I do agree it makes more sense
to avoid duplicate effort and plan based on the new consumer. I'll modify
the KIP.

To Jay's question on message ordering - the data channel selection makes
sure that the messages from the same source partition will be sent by the
same producer.
So the order of the messages is guaranteed with proper producer settings
(MaxInFlightRequests=1, retries=Integer.MaxValue, etc.).

For keyed messages, because they come from the same source partition and
will end up in the same target partition, as long as they are sent by the
same producer, the order is guaranteed.

For non-keyed messages, the messages coming from the same source partition
might go to different target partitions. The order is only guaranteed
within each partition.

Anyway, I'll modify the KIP and the data channel will go away.
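For concreteness, those producer settings correspond to entries like the
following in the mirror maker's producer.properties (new-producer config
names; acks=-1 is the strongest setting, and block.on.buffer.full is a
related knob often set alongside these, not spelled out above):

    max.in.flight.requests.per.connection=1
    retries=2147483647
    acks=-1
    block.on.buffer.full=true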
Thanks.

Jiangjie (Becket) Qin

On 1/25/15, 4:34 PM, "Neha Narkhede" <n...@confluent.io> wrote:

I think there is some value in investigating if we can go back to the
simple mirror maker design, as Jay points out. Here you have N threads,
each of which has a consumer and a producer.

The reason why we had to move away from that was a combination of the
difference in throughput between the consumer and the old producer, and
the deficiency of the consumer rebalancing that limits the total number of
mirror maker threads. So the only option available was to increase the
throughput of the limited # of mirror maker threads that could be
deployed.

Now, that queuing design may not make sense if the new producer's
throughput is almost similar to the consumer's, AND given that the new
round-robin based consumer rebalancing can allow a very high number of
mirror maker instances to exist.
This is the end state that the mirror maker should be in once the new
consumer is complete, so it wouldn't hurt to see if we can just move to
that right now.

On Fri, Jan 23, 2015 at 8:40 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

QQ: If we ever use a different technique for the data channel selection
than for the producer partitioning, won't that break ordering? How can we
ensure these things stay in sync?

With respect to the new consumer--I really do want to encourage people to
think through how MM will work with the new consumer. I mean this isn't
very far off, maybe a few months if we hustle? I could imagine us getting
this mm fix done maybe sooner, maybe in a month?
So I guess this buys us an extra month before we rip it out and throw it
away? Maybe two? This bug has been there for a while, though, right? Is it
worth it? Probably it is, but it still kind of sucks to have the duplicate
effort.

So anyhow let's definitely think about how things will work with the new
consumer. I think we can probably just have N threads, each thread has a
producer and consumer and is internally single threaded. Any reason this
wouldn't work?
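A minimal sketch of that shape, assuming a new-consumer-style
subscribe/poll API; the class below is illustrative only and is not the
actual MirrorMaker code (error handling, the message handler hook and
offset commits are omitted):

    import java.util.Collections
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    // One of N identical, internally single-threaded mirror threads:
    // consume -> (optional message handler) -> produce.
    class MirrorMakerThread(sourceTopic: String,
                            consumer: KafkaConsumer[Array[Byte], Array[Byte]],
                            producer: KafkaProducer[Array[Byte], Array[Byte]]) extends Runnable {
      @volatile private var running = true

      override def run(): Unit = {
        consumer.subscribe(Collections.singletonList(sourceTopic))
        while (running) {
          val it = consumer.poll(100).iterator()
          while (it.hasNext) {
            val r = it.next()
            // A message handler could rewrite topic/partition/key/value here.
            producer.send(new ProducerRecord(r.topic(), r.key(), r.value()))
          }
        }
        producer.close()
        consumer.close()
      }

      def shutdown(): Unit = running = false
    }

Real code would also need periodic offset commits and a consumer.wakeup()
call on shutdown, which are left out of this sketch.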
-Jay

On Wed, Jan 21, 2015 at 5:29 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

Hi Jay,

Thanks for the comments. Please see inline responses.

Jiangjie (Becket) Qin

On 1/21/15, 1:33 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

> Hey guys,
>
> A couple of questions/comments:
>
> 1. The callback and user-controlled commit offset functionality is
> already in the new consumer which we are working on in parallel. If we
> accelerated that work it might help concentrate efforts. I admit this
> might take slightly longer in calendar time but could still probably get
> done this quarter. Have you guys considered that approach?

Yes, I totally agree that ideally we should put our efforts into the new
consumer.
The main reason for still working on the old consumer is that we expect it
would still be used at LinkedIn for quite a while before the new consumer
could be fully rolled out. And we have recently been suffering a lot from
the mirror maker data loss issue. So our current plan is to make the
necessary changes to get the current mirror maker stable in production.
Then we can test and roll out the new consumer gradually without getting
burnt.

> 2. I think partitioning on the hash of the topic partition is not a very
> good idea because that will make the case of going from a cluster with
> fewer partitions to one with more partitions not work.
> I think an intuitive way to do this would be the following:
> a. Default behavior: Just do what the producer does. I.e. if you specify
> a key, use it for partitioning; if not, just partition in a round-robin
> fashion.
> b. Add a --preserve-partition option that will explicitly inherit the
> partition from the source irrespective of whether there is a key or
> which partition that key would hash to.

Sorry that I did not explain this clearly enough. The hash of the topic
partition is only used to decide which mirror maker data channel queue the
consumer thread should put the message into. It only tries to make sure
that the messages from the same partition are sent by the same producer
thread, to guarantee the sending order. This is not at all related to
which partition in the target cluster the messages end up in. That is
still decided by the producer.
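In other words, the hash only picks an internal queue, along these lines
(a sketch of the idea only, not the exact code in the patch):

    // Route a consumed message to an internal data channel queue by its
    // *source* topic-partition, so one producer thread always sees a given
    // source partition in order.
    def channelFor(sourceTopic: String, sourcePartition: Int, numQueues: Int): Int =
      ((sourceTopic, sourcePartition).hashCode & 0x7fffffff) % numQueues

The target partition is still chosen afterwards by the producer (key hash
or round-robin), independently of this queue choice.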
> 3. You don't actually give the ConsumerRebalanceListener interface. What
> is that going to look like?

Good point! I should have put it in the wiki. I just added it.

> 4. What is MirrorMakerRecord? I think ideally the
> MirrorMakerMessageHandler interface would take a ConsumerRecord as input
> and return a ProducerRecord, right? That would allow you to transform
> the key, value, partition, or destination topic...
MirrorMakerRecord is introduced in KAFKA-1650 and is exactly the same as
ConsumerRecord in KAFKA-1760:

  private[kafka] class MirrorMakerRecord (val sourceTopic: String,
                                          val sourcePartition: Int,
                                          val sourceOffset: Long,
                                          val key: Array[Byte],
                                          val value: Array[Byte]) {
    def size = value.length + {if (key == null) 0 else key.length}
  }

However, because the source partition and offset are needed in the
producer thread for consumer offset bookkeeping, the record returned by
MirrorMakerMessageHandler needs to contain that information. Therefore
ProducerRecord does not work here. We could probably let the message
handler take ConsumerRecord for both input and output.
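Under that approach, the handler contract would look roughly like the
following (a sketch only; the exact signature in the KIP/KAFKA-1650 patch
may differ):

    private[kafka] trait MirrorMakerMessageHandler {
      // May filter (return an empty list), reformat, or fan out a record. The
      // returned records keep sourceTopic/sourcePartition/sourceOffset so the
      // producer thread can still do offset bookkeeping.
      def handle(record: MirrorMakerRecord): java.util.List[MirrorMakerRecord]
    }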
> 5. Have you guys thought about what the implementation will look like in
> terms of threading architecture etc. with the new consumer? That will be
> soon, so even if we aren't starting with that let's make sure we can get
> rid of a lot of the current mirror maker's accidental complexity in
> terms of threads and queues when we move to that.

I haven't thought about it thoroughly. The quick idea is that after
migrating to the new consumer, it is probably better to use a single
consumer thread. If multithreading is needed, decoupling consumption and
processing might be used. MirrorMaker definitely needs to be changed after
the new consumer gets checked in.
I'll document the changes and can submit follow-up patches after the new
consumer is available.

> -Jay

On Tue, Jan 20, 2015 at 4:31 PM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

Hi Kafka Devs,

We are working on a Kafka Mirror Maker enhancement. A KIP is posted to
document and discuss the following:
1. KAFKA-1650: No data loss mirror maker change
2. KAFKA-1839: To allow partition aware mirror.
3. KAFKA-1840: To allow message filtering/format conversion
Feedbacks are welcome. Please let us know if you have any questions or
concerns.

Thanks.
Jiangjie (Becket) Qin