KAFKA-1650 added this option, but we don’t have --no.data.loss in any official release.
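For reference, the discussion below is about whether settings like the following should sit behind a --no.data.loss flag or simply be the MirrorMaker defaults. This is only a minimal sketch of the producer side of "no data loss" as it is talked about in this thread; the class name is invented and the exact property choices and values are illustrative, not necessarily what the KIP ends up specifying.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    class NoDataLossProducerConfig {
        // Returns a producer configured so that send failures are retried rather
        // than dropped, and retries cannot reorder messages.
        static KafkaProducer<byte[], byte[]> newProducer(String bootstrapServers) {
            Properties props = new Properties();
            props.put("bootstrap.servers", bootstrapServers);
            props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            props.put("acks", "all");                                // wait for all in-sync replicas
            props.put("retries", String.valueOf(Integer.MAX_VALUE)); // retry instead of dropping data
            props.put("retry.backoff.ms", "500");                    // a "largish" backoff between retries
            props.put("max.in.flight.requests.per.connection", "1"); // no pipelining, so retries cannot reorder
            props.put("block.on.buffer.full", "true");               // block rather than throw when the buffer fills
            return new KafkaProducer<>(props);
        }
    }
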
On 2/26/15, 12:01 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote: >Did --no.data.loss exist in previous releases of MirrorMaker? >If it does, maybe we want to keep it around for backward compatibility? >(i.e. so existing deployments won't break)? > >Gwen > >On Thu, Feb 26, 2015 at 11:57 AM, Jiangjie Qin <j...@linkedin.com.invalid> >wrote: > >> Hi Neha, >> >> Thanks for the comment. That’s a really good point. >> >> Originally I’m thinking about allowing users to tweak some parameters as >> needed. >> For example, some user might want to have pipelining enabled and can >> tolerate reordering, some user might want to use acks=1 or acks=0, some >> might want to move forward when an error is encountered in the callback. >> So we don’t want to enforce all the settings of no.data.loss. Meanwhile we >> want to make the life easier for the users who want no data loss so they >> don’t need to set the configs one by one, therefore we created this option. >> >> But as you suggested, we can probably make the no.data.loss settings the >> default and remove the --no.data.loss option, so if people want to tweak >> the settings, they can just change them, otherwise they get the default >> no-data-loss settings. >> >> I’ll modify the KIP. >> >> Thanks. >> >> Jiangjie (Becket) Qin >> >> On 2/26/15, 8:58 AM, "Neha Narkhede" <n...@confluent.io> wrote: >> >> >Hey Becket, >> > >> >The KIP proposes addition of a --no.data.loss command line option to the >> >MirrorMaker. Though when would the user not want that option? I'm >> >wondering >> >what the benefit of providing that option is if every user would want that >> >for correct mirroring behavior. >> > >> >Other than that, the KIP looks great! >> > >> >Thanks, >> >Neha >> > >> >On Wed, Feb 25, 2015 at 3:56 PM, Jiangjie Qin >><j...@linkedin.com.invalid> >> >wrote: >> > >> >> For 1), the current design allows you to do it. The customizable message >> >> handler takes in a ConsumerRecord and spits out a List<ProducerRecord>, you >> >> can just put a topic for the ProducerRecord different from the ConsumerRecord. >> >> >> >> WRT performance, we did some tests in LinkedIn, the performance looks good >> >> to us. >> >> >> >> Jiangjie (Becket) Qin >> >> >> >> On 2/25/15, 3:41 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> >> >>wrote: >> >> >> >> >Hi Jiangjie, >> >> > >> >> >It might be too late. But, I wanted to bring up the following use case for >> >> >adopting the new MM: >> >> > >> >> >1) Ability to publish messages from a src topic to a different destination >> >> >topic >> >> >via --overidenTopics=srcTopic:newDestinationTopic >> >> > >> >> >In order to adopt the new MM enhancement, customers will compare >> >>performance of >> >> >the new MM and data quality while running the old MM against the same destination >> >> >cluster in Prd. >> >> > >> >> >Let me know if you agree to that or not. Also, if yes, will you be able to >> >> >provide this feature in a release version. >> >> > >> >> >Thanks, >> >> > >> >> >Bhavesh >> >> > >> >> > >> >> >On Tue, Feb 24, 2015 at 5:31 PM, Jiangjie Qin >> >><j...@linkedin.com.invalid> >> >> >wrote: >> >> > >> >> >> Sure! Just created the voting thread :) >> >> >> >> >> >> On 2/24/15, 4:44 PM, "Jay Kreps" <j...@confluent.io> wrote: >> >> >> >> >> >> >Hey Jiangjie, >> >> >> > >> >> >> >Let's do an official vote so that we know what we are voting on and we are >> >> >> >crisp on what the outcome was.
This thread is very long :-) >> >> >> > >> >> >> >-Jay >> >> >> > >> >> >> >On Tue, Feb 24, 2015 at 2:53 PM, Jiangjie Qin >> >> >><j...@linkedin.com.invalid> >> >> >> >wrote: >> >> >> > >> >> >> >> I updated the KIP page based on the discussion we had. >> >> >> >> >> >> >> >> Should I launch another vote, or can we think of this mail thread as >> >> >> >> having already included a vote? >> >> >> >> >> >> >> >> Jiangjie (Becket) Qin >> >> >> >> >> >> >> >> On 2/11/15, 5:15 PM, "Neha Narkhede" <n...@confluent.io> wrote: > >> >> >> >> >> >> >> >Thanks for the explanation, Joel! Would love to see the results of the >> >> >> >> >throughput experiment and I'm a +1 on everything else, including the >> >> >> >> >rebalance callback and record handler. >> >> >> >> > >> >> >> >> >-Neha >> >> >> >> > >> >> >> >> >On Wed, Feb 11, 2015 at 1:13 PM Jay Kreps <jay.kr...@gmail.com> >> >> >>wrote: >> >> >> >> > >> >> >> >> >> Cool, I agree with all that. >> >> >> >> >> >> >> >> >> >> I agree about the need for a rebalancing callback. >> >> >> >> >> >> >> >> >> >> Totally agree about the record handler. >> >> >> >> >> >> >> It would be great to see if a prototype of this is workable. >> >> >> >> >> >> >> >> >> >> Thanks guys! >> >> >> >> >> >> >> >> >> >> -Jay >> >> >> >> >> >> >> >> >> >> On Wed, Feb 11, 2015 at 12:36 PM, Joel Koshy >> >><jjkosh...@gmail.com> >> >> >> >> >>wrote: >> >> >> >> >> >> >> >> >> >> > Hey Jay, >> >> >> >> >> > >> >> >> >> >> > Guozhang, Becket and I got together to discuss this and we >> >> >>think: >> >> >> >> >> > >> >> >> >> >> > - It seems that your proposal based on the new consumer and flush >> >> >> >>call >> >> >> >> >> > should work. >> >> >> >> >> > - We would likely need to call poll with a timeout that >> >> >>matches >> >> >> >> >> > the offset commit interval in order to deal with low volume >> >> >> >> >> > mirroring pipelines. >> >> >> >> >> > - We will still need a rebalance callback to reduce >> >>duplicates - >> >> >> >>the >> >> >> >> >> > rebalance callback would need to flush and commit offsets. >> >> >> >> >> > - The only remaining question is if the overall throughput is >> >> >> >> >> > sufficient. I think someone at LinkedIn (I don't remember >> >>who) >> >> >> >>did >> >> >> >> >> > some experiments with data channel size == 1 and ran into >> >> >>issues. >> >> >> >> >> > That was not thoroughly investigated though. >> >> >> >> >> > - The addition of flush may actually make this solution >> >>viable >> >>for >> >> >> >>the >> >> >> >> >> > current mirror-maker (with the old consumer). We can >> >>prototype >> >> >> >>that >> >> >> >> >> > offline and if it works out well we can redo KAFKA-1650 >> >>(i.e., >> >> >> >> >> > refactor the current mirror maker). The flush call and the >> >>new >> >> >> >> >> > consumer didn't exist at the time we did KAFKA-1650 so this >> >> >>did >> >> >> >>not >> >> >> >> >> > occur to us. >> >> >> >> >> > - We think the RecordHandler is still a useful small addition >> >> >>for >> >> >> >>the >> >> >> >> >> > use-cases mentioned earlier in this thread. >> >> >> >> >> > >> >> >> >> >> > Thanks, >> >> >> >> >> > >> >> >> >> >> > Joel >> >> >> >> >> > >> >> >> >> >> > On Wed, Feb 11, 2015 at 09:05:39AM -0800, Jay Kreps wrote: >> >> >> >> >> > > Guozhang, I agree with 1-3, I do think what I was proposing >> >> >>was >> >> >> >> >>simpler >> >> >> >> >> > but >> >> >> >> >> > > perhaps there are gaps in that?
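Joel's second bullet above (flush and commit offsets in the rebalance callback) corresponds roughly to something like the sketch below. At the time of this thread the new consumer's callback API was still being settled, so this uses the ConsumerRebalanceListener shape the new consumer later shipped with; the class name is made up and this is illustrative only.

    import java.util.Collection;

    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.TopicPartition;

    class FlushAndCommitOnRebalance implements ConsumerRebalanceListener {
        private final KafkaProducer<byte[], byte[]> producer;
        private final KafkaConsumer<byte[], byte[]> consumer;

        FlushAndCommitOnRebalance(KafkaProducer<byte[], byte[]> producer,
                                  KafkaConsumer<byte[], byte[]> consumer) {
            this.producer = producer;
            this.consumer = consumer;
        }

        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
            producer.flush();      // drain everything already handed to the producer
            consumer.commitSync(); // then record how far we have safely mirrored
        }

        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            // nothing to do; mirroring resumes from the committed offsets
        }
    }
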
>> >> >> >> >> > > >> >> >> >> >> > > Hey Joel--Here was a sketch of what I was proposing. I do think this gets >> >> >> >> >> > > rid of manual offset tracking, especially doing so across threads with >> >> >> >> >> > > dedicated commit threads, which I think is pretty complex. >> >> >> >> >> > >
>> >> >> >> >> > > while(true) {
>> >> >> >> >> > >   val recs = consumer.poll(Long.MaxValue);
>> >> >> >> >> > >   for (rec <- recs)
>> >> >> >> >> > >     producer.send(rec, logErrorCallback)
>> >> >> >> >> > >   if(System.currentTimeMillis - lastCommit > commitInterval) {
>> >> >> >> >> > >     producer.flush()
>> >> >> >> >> > >     consumer.commit()
>> >> >> >> >> > >     lastCommit = System.currentTimeMillis
>> >> >> >> >> > >   }
>> >> >> >> >> > > }
>> >> >> >> >> > >
>> >> >> >> >> > > (See the previous email for details). I think the question is: is there any >> >> >> >> >> > > reason--performance, correctness, etc--that this won't work? Basically I >> >> >> >> >> > > think you guys have thought about this more so I may be missing something. >> >> >> >> >> > > If so let's flag it while we still have leeway on the consumer. >> >> >> >> >> > > >> >> >> >> >> > > If we think that will work, well I do think it is conceptually a lot >> >> >> >> >> > > simpler than the current code, though I suppose one could disagree on that. >> >> >> >> >> > > >> >> >> >> >> > > -Jay >> >> >> >> >> > > >> >> >> >> >> > > On Wed, Feb 11, 2015 at 5:53 AM, Joel Koshy >> >> >><jjkosh...@gmail.com> >> >> >> >> >> wrote: >> >> >> >> >> > > >> >> >> >> >> > > > Hi Jay, >> >> >> >> >> > > > >> >> >> >> >> > > > > The data channels are actually a big part of the complexity of the zero >> >> >> >> >> > > > > data loss design, though, right? Because then you need some reverse >> >> >> >> >> > > > channel >> >> >> >> >> > > > > to flow the acks back to the consumer based on where you are versus just >> >> >> >> >> > > > > acking what you have read and written (as in the code snippet I put up). >> >> >> >> >> > > > >> >> >> >> >> > > > I'm not sure if we are on the same page. Even if the data channel was >> >> >> >> >> > > > not there the current handling for zero data loss would remain very >> >> >> >> >> > > > similar - you would need to maintain lists of unacked source offsets. >> >> >> >> >> > > > I'm wondering if the KIP needs more detail on how it is currently >> >> >> >> >> > > > implemented; or are you suggesting a different approach (in which case I >> >> >> >> >> > > > have not fully understood). I'm not sure what you mean by flowing acks >> >> >> >> >> > > > back to the consumer - the MM commits offsets after the producer ack >> >> >> >> >> > > > has been received. There is some additional complexity introduced in >> >> >> >> >> > > > reducing duplicates on a rebalance - this is actually optional (since >> >> >> >> >> > > > duplicates are currently a given).
The reason that was >> >>done >> >> >> >>anyway is >> >> >> >> >> > > > that with the auto-commit turned off duplicates are >> >>almost >> >> >> >> >>guaranteed >> >> >> >> >> > > > on a rebalance. >> >> >> >> >> > > > >> >> >> >> >> > > > > I think the point that Neha and I were trying to make was >> >>that >> >> >> >>the >> >> >> >> >> > > > > motivation to embed stuff into MM kind of is related to >> >>how >> >> >> >> >> complex a >> >> >> >> >> > > > > simple "consume and produce" with good throughput will >> >>be. If >> >> >> >> >>it is >> >> >> >> >> > > > simple >> >> >> >> >> > > > > to write such a thing in a few lines, the pain of >> >>embedding a >> >> >> >> >>bunch >> >> >> >> >> > of >> >> >> >> >> > > > > stuff won't be worth it, if it has to be as complex as >> >>the >> >> >> >> >>current >> >> >> >> >> mm >> >> >> >> >> > > > then >> >> >> >> >> > > > > of course we will need all kinds of plug-ins because no >> >>one >> >> >> >> >>will be >> >> >> >> >> able >> >> >> >> >> > > > to >> >> >> >> >> > > > > write such a thing. I don't have a huge concern with a >> >>simple >> >> >> >> >> plug-in >> >> >> >> >> > > > but I >> >> >> >> >> > > > > think if it turns into something more complex with >> >>filtering >> >> >> >>and >> >> >> >> >> > > > > aggregation or whatever we really need to stop and >> >>think a >> >> >> >>bit >> >> >> >> >> about >> >> >> >> >> > the >> >> >> >> >> > > > > design. >> >> >> >> >> > > > >> >> >> >> >> > > > I agree - I don't think there is a use case for any >> >>complex >> >> >> >> >>plug-in. >> >> >> >> >> > > > It is pretty much what Becket has described currently for >> >>the >> >> >> >> >>message >> >> >> >> >> > > > handler - i.e., take an incoming record and return a >> >>list of >> >> >> >> >>outgoing >> >> >> >> >> > > > records (which could be empty if you filter). >> >> >> >> >> > > > >> >> >> >> >> > > > So here is my take on the MM: >> >> >> >> >> > > > - Bare bones: simple consumer - producer pairs (07 >> >>style). >> >> >> >>This >> >> >> >> >>is >> >> >> >> >> > > > ideal, but does not handle no data loss >> >> >> >> >> > > > - Above plus support for no data loss. This actually adds >> >>quite >> >>a >> >> >> >>bit >> >> >> >> >>of >> >> >> >> >> > > > complexity. >> >> >> >> >> > > > - Above plus the message handler. This is a trivial >> >> >>addition I >> >> >> >> >>think >> >> >> >> >> > > > that makes the MM usable in a few other mirroring-like >> >> >> >> >> applications. >> >> >> >> >> > > > >> >> >> >> >> > > > Joel >> >> >> >> >> > > > >> >> >> >> >> > > > > On Tue, Feb 10, 2015 at 12:31 PM, Joel Koshy >> >> >> >> >><jjkosh...@gmail.com> >> >> >> >> >> > > > wrote: >> >> >> >> >> > > > > >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > >> >> >> >> >> > > > >> On Tue, Feb 10, 2015 at 12:13:46PM -0800, Neha >> >>Narkhede >> >> >> >>wrote: >> >> >> >> >> > > > > > > I think all of us agree that we want to design >> >> >> >>MirrorMaker >> >> >> >> >>for >> >> >> >> >> 0 >> >> >> >> >> > data >> >> >> >> >> > > > > > loss. >> >> >> >> >> > > > > > > With the absence of the data channel, 0 data loss >> >> >>will be >> >> >> >> >>much >> >> >> >> >> > > > simpler to >> >> >> >> >> > > > > > > implement. >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > The data channel is irrelevant to the implementation >> >>of >> >> >> >>zero >> >> >> >> >>data >> >> >> >> >> > > > > > loss.
The complexity in the implementation of no data >> >> >>loss >> >> >> >> >>that >> >> >> >> >> you >> >> >> >> >> > > > > > are seeing in mirror-maker affects all >> >> >>consume-then-produce >> >> >> >> >> > patterns >> >> >> >> >> > > > > > whether or not there is a data channel. You still >> >>need >> >>to >> >> >> >> >> maintain a >> >> >> >> >> > > > > > list of unacked offsets. What I meant earlier is >> >>that we >> >> >> >>can >> >> >> >> >> > > > > > brainstorm completely different approaches to >> >> >>supporting no >> >> >> >> >>data >> >> >> >> >> > loss, >> >> >> >> >> > > > > > but the current implementation is the only solution >> >>we >> >>are >> >> >> >> >>aware >> >> >> >> >> > of. >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > My arguments for adding a message handler are that: >> >> >> >> >> > > > > > > > 1. It is more efficient to do something in common >> >> >>for >> >> >> >>all >> >> >> >> >>the >> >> >> >> >> > > > clients >> >> >> >> >> > > > > > in >> >> >> >> >> > > > > > > > the pipeline than letting each client do the same >> >>thing >> >> >>many >> >> >> >> >> > > > times. And >> >> >> >> >> > > > > > > > there are concrete use cases for the message >> >>handler >> >> >> >> >>already. >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > What are the concrete use cases? >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > I think Becket already described a couple of use >> >>cases >> >> >> >> >>earlier in >> >> >> >> >> > the >> >> >> >> >> > > > > > thread. >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > <quote> >> >> >> >> >> > > > > > >> >> >> >> >> > > > >> 1. Format conversion. We have a use case where >> >>clients >> >>of >> >> >> >> >>source >> >> >> >> >> > > > > > cluster >> >> >> >> >> > > > > > use an internal schema and clients of target cluster >> >> >>use a >> >> >> >> >> > different >> >> >> >> >> > > > > > public schema. >> >> >> >> >> > > > > > 2. Message filtering: For the messages published to >> >> >>source >> >> >> >> >> cluster, >> >> >> >> >> > > > > > there >> >> >> >> >> > > > > > are some messages private to source cluster clients >> >>that >> >> >> >>should >> >> >> >> >> not >> >> >> >> >> > > > > > be exposed >> >> >> >> >> > > > > > to target cluster clients. It would be difficult to >> >> >>publish >> >> >> >> >>those >> >> >> >> >> > > > > > messages >> >> >> >> >> > > > > > into different partitions because they need to be >> >> >>ordered. >> >> >> >> >> > > > > > I agree that we can always filter/convert messages >> >>after >> >> >> >>they >> >> >> >> >>are >> >> >> >> >> > > > > > copied >> >> >> >> >> > > > > > to the target cluster, but that costs network >> >>bandwidth >> >> >> >> >> > unnecessarily, >> >> >> >> > > > > > especially if that is a cross colo mirror. With the >> >> >> >>handler, >> >> >> >> >>we >> >> >> >> >> can >> >> >> >> >> > > > > > co-locate the mirror maker with the source cluster and >> >>save >> >> >> >>that >> >> >> >> >> cost.
>> >> >> >> >> > > > > > Also, >> >> >> >> >> > > > > > imagine there are many downstream consumers consuming >> >> >>from >> >> >> >>the >> >> >> >> >> > target >> >> >> >> >> > > > > > cluster, filtering/reformatting the messages before >> >>the >> >> >> >> >>messages >> >> >> >> > reach >> >> >> >> > > > > > the >> >> >> >> >> > > > > > target cluster is much more efficient than having >> >>each >> >>of >> >> >> >>the >> >> >> >> >> > > > > > consumers do >> >> >> >> >> > > > > > this individually on their own. >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > </quote> >> >> >> >> >> > > > > > >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > Also the KIP still refers to the data channel in a >> >>few >> >> >> >> >>places >> >> >> >> >> > > > (Motivation >> >> >> >> >> > > > > > > and "On consumer rebalance" sections). Can you >> >>update >> >>the >> >> >> >> >>wiki >> >> >> >> >> > so it >> >> >> >> >> > > > is >> >> >> >> >> > > > > > > easier to review the new design, especially the data >> >>loss >> >> >> >> >>part. >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > >> >> >> >> > > > > > > > On Tue, Feb 10, 2015 at 10:36 AM, Joel Koshy < >> >> >> >> >> > jjkosh...@gmail.com> >> >> >> >> >> > > > > > wrote: >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > > I think the message handler adds little to no >> >> >>complexity >> >> >> >> >>to >> >> >> >> >> the >> >> >> >> >> > > > mirror >> >> >> >> >> > > > > > > > maker. Jay/Neha, the MM became scary due to the >> >> >> >> >> rearchitecture >> >> >> >> >> > we >> >> >> >> >> > > > did >> >> >> >> >> > > > > > > > for 0.8 due to performance issues compared with >> >>0.7 >> >>- >> >> >> >>we >> >> >> >> >> should >> >> >> >> >> > > > remove >> >> >> >> >> > > > > > > > the data channel if it can match the current >> >> >> >>throughput. I >> >> >> >> >> > agree >> >> >> >> >> > > > it is >> >> >> >> >> > > > > > > > worth prototyping and testing that so the MM >> >> >> >>architecture >> >> >> >> >>is >> >> >> >> >> > > > > > > > simplified. >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > The MM became a little scarier in KAFKA-1650 in >> >> >>order >> >>to >> >> >> >> >> > support no >> >> >> >> >> > > > > > > > data loss. I think the implementation for no data >> >> >>loss >> >> >> >> >>will >> >> >> >> >> > remain >> >> >> >> >> > > > > > > > about the same even in the new model (even >> >>without >> >>the >> >> >> >> >>data >> >> >> >> >> > > > channel) - >> >> >> >> >> > > > > > > > we can probably brainstorm more if there is a >> >> >> >> >>better/simpler >> >> >> >> >> > way >> >> >> >> >> > > > to do >> >> >> >> >> > > > > > > > it (maybe there is in the absence of the data >> >> >>channel) >> >> >> >> >>but at >> >> >> >> >> > the >> >> >> >> >> > > > time >> >> >> >> >> > > > > > > > it was the best we (i.e., Becket, myself, Jun and >> >> >> >>Guozhang >> >> >> >> >> who >> >> >> >> >> > > > > > > > participated on the review) could come up with. >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > So I'm definitely +1 on whatever it takes to >> >> >>support no >> >> >> >> >>data >> >> >> >> >> > loss. >> >> >> >> >> > > > I >> >> >> >> >> > > > > > > > think most people would want that out of the box.
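As a concrete illustration of the handler shape Becket and Joel describe above (one ConsumerRecord in, a list of ProducerRecords out, where an empty list filters the message and a different topic on the ProducerRecord remaps it, e.g. Bhavesh's srcTopic:newDestinationTopic case), here is a minimal sketch. The interface and class names are invented for illustration and are not the KIP's actual API.

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.ProducerRecord;

    interface MirrorMakerMessageHandler {
        // Takes one record from the source cluster and returns the records to
        // produce to the target cluster (possibly empty, e.g. for filtering).
        List<ProducerRecord<byte[], byte[]>> handle(ConsumerRecord<byte[], byte[]> record);
    }

    // Example: remap a source topic to a different destination topic.
    class TopicOverrideHandler implements MirrorMakerMessageHandler {
        private final Map<String, String> overrides;  // srcTopic -> destTopic

        TopicOverrideHandler(Map<String, String> overrides) {
            this.overrides = overrides;
        }

        @Override
        public List<ProducerRecord<byte[], byte[]>> handle(ConsumerRecord<byte[], byte[]> record) {
            String destTopic = overrides.getOrDefault(record.topic(), record.topic());
            return Collections.singletonList(
                new ProducerRecord<>(destTopic, record.key(), record.value()));
        }
    }
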
>> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > As for the message handler, as Becket wrote and I >> >>agree >> >> >> >> >>with, >> >> >> >> >> > it is >> >> >> >> >> > > > > > > > really a trivial addition that would benefit >> >> >>(perhaps >> >> >> >>not >> >> >> >> >> most, >> >> >> >> >> > > > but at >> >> >> >> >> > > > > > > > least some). So I'm personally +1 on that as >> >>well. >> >>That >> >> >> >> >>said, >> >> >> >> >> > I'm >> >> >> >> >> > > > also >> >> >> >> >> > > > > > > > okay with it not being there. I think the MM is >> >> >>fairly >> >> >> >> >> > stand-alone >> >> >> >> >> > > > and >> >> >> >> >> > > > > > > > simple enough that it is entirely reasonable and >> >> >> >> >>absolutely >> >> >> >> >> > > > feasible >> >> >> >> >> > > > > > > > for companies to fork/re-implement the mirror >> >>maker >> >>for >> >> >> >> >>their >> >> >> >> >> > own >> >> >> >> >> > > > > > > > needs. >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > So in summary, I'm +1 on the KIP. >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > Thanks, >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > Joel >> >> >> >> >> > > > > > > > >> >> >> >> >> > > > > > > > On Mon, Feb 09, 2015 at 09:19:57PM +0000, >> >>Jiangjie >> >>Qin >> >> >> >> >>wrote: >> >> >> >> >> > > > > > > > > I just updated the KIP page and incorporated >> >>Jay >> >>and >> >> >> >> >>Neha’s >> >> >> >> >> > > > > > suggestion. >> >> >> >> >> > > > > > > > As >> >> >> >> >> > > > > > > > > a brief summary of where we are: >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > Consensus reached: >> >> >> >> >> > > > > > > > > Have N independent mirror maker threads, each of which >> >>has >> >> >> >>its >> >> >> >> >>own >> >> >> >> >> > > > consumer >> >> >> >> >> > > > > > but >> >> >> >> >> > > > > > > > > shares a producer. The mirror maker threads will be >> >> >> >> > responsible >> >> >> >> >> > > > for >> >> >> >> >> > > > > > > > > decompression, compression and offset commit. No >> >>data >> >> >> >> >> > channel and >> >> >> >> >> > > > > > > > separate >> >> >> >> >> > > > > > > > > offset commit thread is needed. The consumer >> >>rebalance >> >> >> >> >>callback >> >> >> >> >> > will >> >> >> >> >> > > > be >> >> >> >> >> > > > > > used >> >> >> >> >> > > > > > > > > to avoid duplicates on rebalance. >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > Still under discussion: >> >> >> >> >> > > > > > > > > Whether a message handler is needed. >> >> >> >> >> > > > > > > >> >> >> >> >> > > > > > > > > My arguments for adding a message handler are >> >> >>that: >> >> >> >> >> > > > > > > > > 1. It is more efficient to do something in >> >>common >> >>for >> >> >> >> >>all >> >> >> >> >> the >> >> >> >> >> > > > > > clients in >> >> >> >> >> > > > > > > > > the pipeline than letting each client do the same >> >> >>thing >> >> >>many >> >> >> >> >> > > > times. And >> >> >> >> >> > > > > > > > > there are concrete use cases for the message >> >> >>handler >> >> >> >> >> already. >> >> >> >> >> > > > > > > > > 2. It is not a big complicated add-on to mirror >> >> >> >>maker.
>> >> >> >> >> > > > > > > > > 3. Without a message handler, customers who >> >>need >> >>it >> >> >> >> >> > have >> >> >> >> >> > > > to >> >> >> >> >> > > > > > > > > re-implement all the logic of the mirror maker by >> >> >> >> >>themselves >> >> >> >> >> > just in >> >> >> >> >> > > > > > order >> >> >> >> >> > > > > > > > to >> >> >> >> >> > > > > > > > > add this handling in the pipeline. >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > Any thoughts? >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > Thanks. >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > -Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > On 2/8/15, 6:35 PM, "Jiangjie Qin" >> >> >> >><j...@linkedin.com> >> >> >> >> >> > wrote: >> >> >> >> >> > > > > > > > > >> >> >> >> >> > > > > > > > > >Hi Jay, thanks a lot for the comments. >> >> >> >> >> > > > > > > > > >I think this solution is better. We probably >> >> >>don’t >> >> >>need >> >> >> >> >> the data >> >> >> >> >> > > > channel >> >> >> >> >> > > > > > > > > >anymore. It can be replaced with a list of >> >> >>producers >> >> >>if >> >> >> >> >>we >> >> >> >> >> > need >> >> >> >> >> > > > more >> >> >> >> >> > > > > > > > sender >> >> >> >> >> > > > > > > > > >threads. >> >> >> >> >> > > > > > > > > >I’ll update the KIP page. >> >> >> >> >> > > > > > > > > > >> >> >> >> >> > > > > > > > > >The reasoning about the message handler is mainly >> >>for >> >> >> >> >> efficiency >> >> >> >> >> > > > > > purposes. >> >> >> >> >> > > > > > > > I’m >> >> >> >> >> > > > > > > > > >thinking that if something can be done in the >> >> >>pipeline >> >> >>for >> >> >> >> >>all >> >> >> >> >> > the >> >> >> >> >> > > > > > clients >> >> >> >> >> > > > > > > > > >such as filtering/reformatting, it is probably >> >> >> >>better >> >> >> >> >>to >> >> >> >> >> do >> >> >> >> >> > it >> >> >> >> >> > > > in >> >> >> >> >> > > > > > the >> >> >> >> >> > > > > > > > > >pipeline than asking 100 clients to do the same >> >> >>thing >> >> >>for >> >> >> >> >>100 >> >> >> >> >> > > > times. >> >> >> >> >> > > > > > > > > > >> >> >> >> >> > > > > > > > > >-Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > > >> >> >> >> >> > > > > > > > > > >> >> >> >> >> > > > > > > > > >On 2/8/15, 4:59 PM, "Jay Kreps" >> >> >> >><jay.kr...@gmail.com> >> >> >> >> >> > wrote: >> >> >> >> >> > > > > > > > > > >> >> >> >> >> > > > > > > > > >>Yeah, I second Neha's comments. The current >> mm >> >>code >> >> >> >> >>has >> >> >> >> >> > taken >> >> >> >> >> > > > > > something >> >> >> >> >> > > > > > > > > >>pretty simple and made it pretty scary with >> >> >> >>callbacks >> >> >> >> >>and >> >> >> >> >> > > > > > wait/notify >> >> >> >> >> > > > > > > > > >>stuff. Do we believe this works? I can't >> >>tell by >> >> >> >> >>looking >> >> >> >> > > at it >> >> >> >> >> > > > > > which is >> >> >> >> >> > > > > > > > > >>kind of bad for something important like >> >>this.
I >> >> >> >>don't >> >> >> >> >> mean >> >> >> >> >> > > > this as >> >> >> >> >>> > > > > > > > >>criticism, I know the history: we added in >> >> >>memory >> >> >> >> >>ueues >> >> >> >> >> to >> >> >> >> >> > > > help >> >> >> >> >> > > > > > with >> >> >> >> >> > > > > > > > > >>other >> >> >> >> >> > > > > > > > > >>performance problems without thinking >>about >> >> >> >> >>correctness, >> >> >> >> >> > then >> >> >> >> >> > > > we >> >> >> >> >> > > > > > added >> >> >> >> >> > > > > > > > > >>stuff to work around the in-memory queues >>not >> >> >>lose >> >> >> >> >>data, >> >> >> >> >> > and >> >> >> >> >> > > > so on. >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>Can we instead do the pposite exercise and >> >> >>start >> >> >> >>with >> >> >> >> >> the >> >> >> >> >> > > > basics >> >> >> >> >> > > > > > of >> >> >> >> >> > > > > > > > what >> >> >> >> >> > > > > > > > > >>mm should do and think about what >> >>deficiencies >> >> >> >> >>prevents >> >> >> >> >> > this >> >> >> >> >> > > > > > approach >> >> >> >> >> > > > > > > > > >>from >> >> >> >> >> > > > > > > > > >>working? Then let's make sure the >>currently >> >> >> >>in-flight >> >> >> >> >> work >> >> >> >> >> > will >> >> >> >> >> > > > > > remove >> >> >> >> >> > > > > > > > > >>these deficiencies. After all mm is kind >>of >> >>the >> >> >> >> >> > prototypical >> >> >> >> >> > > > kafka >> >> >> >> >> > > > > > use >> >> >> >> >> > > > > > > > > >>case >> >> >> >> >> > > > > > > > > >>so if we can't make our clients to this >> >> >>probably no >> >> >> >> >>one >> >> >> >> >> > else >> >> >> >> >> > > > can. >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>I think mm should just be N independent >> >>threads >> >> >> >>each >> >> >> >> >>of >> >> >> >> >> > which >> >> >> >> >> > > > has >> >> >> >> >> > > > > > their >> >> >> >> >> > > > > > > > > >>own >> >> >> >> >> > > > > > > > > >>consumer but share a producer and each of >> >>which >> >> >> >>looks >> >> >> >> >> like >> >> >> >> >> > > > this: >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>while(true) { >> >> >> >> >> > > > > > > > > >> val recs = >>consumer.poll(Long.MaxValue); >> >> >> >> >> > > > > > > > > >> for (rec <- recs) >> >> >> >> >> > > > > > > > > >> producer.send(rec, >>logErrorCallback) >> >> >> >> >> > > > > > > > > >> if(System.currentTimeMillis - >>lastCommit >> >>> >> >> >> >> >> > commitInterval) >> >> >> >> > > > > { >> >> >> >> >> > > > > > > > > >> producer.flush() >> >> >> >> >> > > > > > > > > >> consumer.commit() >> >> >> >> >> > > > > > > > > >> lastCommit = >>System.currentTimeMillis >> >> >> >> >> > > > > > > > > >> } >> >> >> >> >> > > > > > > > > >>} >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>This will depend on setting the retry >>count >> >>in >> >> >>the >> >> >> >> >> > producer to >> >> >> >> >> > > > > > > > something >> >> >> >> >> > > > > > > > > >>high with a largish backoff so that a >>failed >> >> >>send >> >> >> >> >>attempt >> >> >> >> >> > > > doesn't >> >> >> >> >> > > > > > drop >> >> >> >> >> > > > > > > > > >>data. >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>We will need to use the callback to force >>a >> >> >>flush >> >> >> >>and >> >> >> >> >> > offset >> >> >> >> >> > > > > > commit on >> >> >> >> >> > > > > > > > > >>rebalance. 
>> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>This approach may have a few more TCP >> >> >>connections >> >>due >> >>to >> >> >> using >> >> >> >> >> > > > > > multiple >> >> >> >> >> > > > > > > > > >>consumers but I think it is a lot easier to >> >> >>reason >> >> >> >> >>about >> >> >> >> >> > and >> >> >> >> >> > > > the >> >> >> >> >> > > > > > total >> >> >> >> >> > > > > > > > > >>number of mm instances is always going to be >> >> >>small. >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>Let's talk about where this simple approach >> >> >>falls >> >> >> >> >>short, >> >> >> >> >> I >> >> >> >> >> > > > think >> >> >> >> >> > > > > > that >> >> >> >> >> > > > > > > > >>will >> >> >> >> >> > > > > > > > > >>help us understand your motivations for >> >> >>additional >> >> >> >> >> > elements. >> >> >> >> >> > > > > > > > >> >> >> >> >> >> > > > > > > > > >>Another advantage of this is that it is so >> >> >>simple I >> >> >> >> >>don't >> >> >> >> >> > > > think we >> >> >> >> >> > > > > > > > really >> >> >> >> >> > > > > > > > > >>even need to bother making mm extensible >> >>because >> >> >> >>writing >> >> >> >> >> > your own >> >> >> >> >> > > > > > code >> >> >> >> >> > > > > > > > that >> >> >> >> >> > > > > > > > > >>does custom processing or transformation is >> >>just >> >> >> >>ten >> >> >> >> >> lines >> >> >> >> >> > and >> >> >> >> >> > > > no >> >> >> >> >> > > > > > plug >> >> >> >> >> > > > > > > > in >> >> >> >> >> > > > > > > > > >>system is going to make it simpler. >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>-Jay >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>On Sun, Feb 8, 2015 at 2:40 PM, Neha >> >>Narkhede < >> >> >> >> >> > > > n...@confluent.io> >> >> >> >> >> > > > > > > > wrote: >> >> >> >> >> > > > > > > > > >> >> >> >> >> >> > > > > > > > > >>> Few comments - >> >> >> >> >> > > > > > > > > >>> >> >> >> >> >> > > > > > > > > >>> 1. Why do we need the message handler? Do >> >>you >> >> >> >>have >> >> >> >> >> > concrete >> >> >> >> >> > > > use >> >> >> >> >> > > > > > cases >> >> >> >> >> > > > > > > > > >>>in >> >> >> >> >> > > > > > > > > >>> mind? If not, we should consider adding it >> >>in >> >>the >> >> >> >> >> future >> >> >> >> >> > > > when/if >> >> >> >> >> > > > > > we >> >> >> >> >> > > > > > > > do >> >> >> >> >> > > > > > > > > >>>have >> >> >> >> >> > > > > > > > > >>> use cases for it. The purpose of the mirror >> >> >>maker >> >> >> >> >>is a >> >> >> >> >> > simple >> >> >> >> >> > > > > > tool >> >> >> >> >> > > > > > > > for >> >> >> >> >> > > > > > > > > >>> setting up Kafka cluster replicas. I don't >> >>see >> >> >> >>why >> >> >> >> >>we >> >> >> >> >> > need to >> >> >> >> >> > > > > > > > include a >> >> >> >> >> > > > > > > > > >>> message handler for doing stream >> >> >>transformations >> >>or >> >> >> >> >> > > > filtering. >> >> >> >> >> > > > > > You >> >> >> >> >> > > > > > > > can >> >> >> >> >> > > > > > > > > >>> always write a simple process for doing >> >>that >> >>once >> >> >> >>the >> >> >> >> >> > data is >> >> >> >> >> > > > > > copied >> >> >> >> >> > > > > > > > as >> >> >> >> >> > > > > > > > > >>>is >> >> >> >> >> > > > > > > > > >>> in the target cluster. >> >> >> >> >> > > > > > > > > >>> 2. Why keep both designs?
We should >>prefer >> >>the >> >> >> >> >>simpler >> >> >> >> >> > design >> >> >> >> >> > > > > > unless >> >> >> >> >> > > > > > > > it >> >> >> >> >> > > > > > > > > >>>is >> >> >> >> >> > > > > > > > > >>> not feasible due to the performance >>issue >> >> >>that we >> >> >> >> >> > previously >> >> >> >> >> > > > > > had. Did >> >> >> >> >> > > > > > > > > >>>you >> >> >> >> >> > > > > > > > > >>> get a chance to run some tests to see if >> >>that >> >> >>is >> >> >> >> >>really >> >> >> >> >> > > > still a >> >> >> >> >> > > > > > > > problem >> >> >> >> >> > > > > > > > > >>>or >> >> >> >> >> > > > > > > > > >>> not? It will be easier to think about >>the >> >> >>design >> >> >> >>and >> >> >> >> >> also >> >> >> >> >> > > > make >> >> >> >> >> > > > > > the >> >> >> >> >> > > > > > > > KIP >> >> >> >> >> > > > > > > > > >>> complete if we make a call on the design >> >> >>first. >> >> >> >> >> > > > > > > > > >>> 3. Can you explain the need for keeping >>a >> >> >>list of >> >> >> >> >> unacked >> >> >> >> >> > > > > > offsets per >> >> >> >> >> > > > > > > > > >>> partition? Consider adding a section on >> >> >>retries >> >> >> >>and >> >> >> >> >>how >> >> >> >> >> > you >> >> >> >> >> > > > plan >> >> >> >> >> > > > > > to >> >> >> >> >> > > > > > > > > >>>handle >> >> >> >> >> > > > > > > > > >>> the case when the producer runs out of >>all >> >> >> >>retries. >> >> >> >> >> > > > > > > > > >>> >> >> >> >> >> > > > > > > > > >>> Thanks, >> >> >> >> >> > > > > > > > > >>> Neha >> >> >> >> >> > > > > > > > > >>> >> >> >> >> >> > > > > > > > > >>> On Sun, Feb 8, 2015 at 2:06 PM, Jiangjie >> >>Qin >> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid> >> >> >> >> >> > > > > > > > > >>> wrote: >> >> >> >> >> > > > > > > > > >>> >> >> >> >> >> > > > > > > > > >>> > Hi Neha, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > Yes, I’ve updated the KIP so the >>entire >> >>KIP >> >> >>is >> >> >> >> >>based >> >> >> >> >> > on new >> >> >> >> >> > > > > > > > consumer >> >> >> >> >> > > > > > > > > >>>now. >> >> >> >> >> > > > > > > > > >>> > I’ve put both designs with and without >> >>data >> >> >> >> >>channel >> >> >> >> >> in >> >> >> >> >> > the >> >> >> >> >> > > > KIP >> >> >> >> >> > > > > > as I >> >> >> >> >> > > > > > > > > >>>still >> >> >> >> >> > > > > > > > > >>> > feel we might need the data channel to >> >> >>provide >> >> >> >> >>more >> >> >> >> >> > > > > > flexibility, >> >> >> >> >> > > > > > > > > >>> > especially after message handler is >> >> >>introduced. >> >> >> >> >>I’ve >> >> >> >> >> > put my >> >> >> >> >> > > > > > > > thinking >> >> >> >> >> > > > > > > > > >>>of >> >> >> >> >> > > > > > > > > >>> > the pros and cons of the two designs >>in >> >>the >> >> >> >>KIP as >> >> >> >> >> > well. >> >> >> >> >> > > > It’ll >> >> >> >> >> > > > > > be >> >> >> >> >> > > > > > > > > >>>great >> >> >> >> >> > > > > > > > > >>> if >> >> >> >> >> > > > > > > > > >>> > you can give a review and comment. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > Thanks. 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > On 2/6/15, 7:30 PM, "Neha Narkhede" < >> >> >> >> >> n...@confluent.io >> >> >> >> >> > > >> >> >> >> >> > > > wrote: >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > >Hey Becket, >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >What are the next steps on this KIP. >>As >> >>per >> >> >> >>your >> >> >> >> >> > comment >> >> >> >> >> > > > > > earlier >> >> >> >> >> > > > > > > > on >> >> >> >> >> > > > > > > > > >>>the >> >> >> >> >> > > > > > > > > >>> > >thread - >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >I do agree it makes more sense >> >> >> >> >> > > > > > > > > >>> > >> to avoid duplicate effort and plan >> >>based >> >> >>on >> >> >> >>new >> >> >> >> >> > > > consumer. >> >> >> >> >> > > > > > I’ll >> >> >> >> >> > > > > > > > > >>>modify >> >> >> >> >> > > > > > > > > >>> > >>the >> >> >> >> >> > > > > > > > > >>> > >> KIP. >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >Did you get a chance to think about >>the >> >> >> >> >>simplified >> >> >> >> >> > design >> >> >> >> >> > > > > > that we >> >> >> >> >> > > > > > > > > >>> proposed >> >> >> >> >> > > > > > > > > >>> > >earlier? Do you plan to update the >>KIP >> >>with >> >> >> >>that >> >> >> >> >> > proposal? >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >Thanks, >> >> >> >> >> > > > > > > > > >>> > >Neha >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >On Wed, Feb 4, 2015 at 12:12 PM, >> >>Jiangjie >> >> >>Qin >> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> > > > > > > > > >>> > >wrote: >> >> >> >> >> > > > > > > > > >>> > > >> >> >> >> >> > > > > > > > > >>> > >> In mirror maker we do not do >> >> >> >>de-serialization >> >> >> >> >>on >> >> >> >> >> the >> >> >> >> >> > > > > > messages. >> >> >> >> >> > > > > > > > > >>>Mirror >> >> >> >> >> > > > > > > > > >>> > >> maker use source TopicPartition >>hash >> >>to >> >> >> >>chose a >> >> >> >> >> > > > producer to >> >> >> >> >> > > > > > send >> >> >> >> >> > > > > > > > > >>> > >>messages >> >> >> >> >> > > > > > > > > >>> > >> from the same source partition. The >> >> >> >>partition >> >> >> >> >> those >> >> >> >> >> > > > > > messages end >> >> >> >> >> > > > > > > > > >>>up >> >> >> >> >> > > > > > > > > >>> with >> >> >> >> >> > > > > > > > > >>> > >> are decided by Partitioner class in >> >> >> >> >>KafkaProducer >> >> >> >> >> > > > (assuming >> >> >> >> >> > > > > > you >> >> >> >> >> > > > > > > > > >>>are >> >> >> >> >> > > > > > > > > >>> > >>using >> >> >> >> >> > > > > > > > > >>> > >> the new producer), which uses hash >> >>code >> >> >>of >> >> >> >> >> bytes[]. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> > > > > > > > > >>> > >> If deserialization is needed, it >>has >> >>to >> >> >>be >> >> >> >> >>done in >> >> >> >> >> > > > message >> >> >> >> >> > > > > > > > > >>>handler. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> > > > > > > > > >>> > >> Thanks. 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> > > > > > > > > >>> > >> Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> > > > > > > > > >>> > >> On 2/4/15, 11:33 AM, "Bhavesh >>Mistry" >> >>< >> >> >> >> >> > > > > > > > mistry.p.bhav...@gmail.com> >> >> >> >> >> > > > > > > > > >>> > >>wrote: >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> > > > > > > > > >>> > >> >Hi Jiangjie, >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >Thanks for entertaining my >>question >> >>so >> >> >>far. >> >> >> >> >>Last >> >> >> >> >> > > > > > question, I >> >> >> >> >> > > > > > > > > >>>have is >> >> >> >> >> > > > > > > > > >>> > >> >about >> >> >> >> >> > > > > > > > > >>> > >> >serialization of message key. If >>the >> >> >>key >> >> >> >> >> > > > de-serialization >> >> >> >> >> > > > > > > > > >>>(Class) is >> >> >> >> >> > > > > > > > > >>> > >>not >> >> >> >> >> > > > > > > > > >>> > >> >present at the MM instance, then >> >>does it >> >> >> >>use >> >> >> >> >>raw >> >> >> >> >> > byte >> >> >> >> >> > > > > > hashcode >> >> >> >> >> > > > > > > > to >> >> >> >> >> > > > > > > > > >>> > >> >determine >> >> >> >> >> > > > > > > > > >>> > >> >the partition ? How are you >>going to >> >> >> >>address >> >> >> >> >>the >> >> >> >> >> > > > situation >> >> >> >> >> > > > > > > > where >> >> >> >> >> > > > > > > > > >>>key >> >> >> >> >> > > > > > > > > >>> > >> >needs >> >> >> >> >> > > > > > > > > >>> > >> >to be de-serialization and get >>actual >> >> >> >>hashcode >> >> >> >> >> > needs >> >> >> >> >> > > > to be >> >> >> >> >> > > > > > > > > >>>computed >> >> >> >> >> > > > > > > > > >>> ?. >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >Thanks, >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >Bhavesh >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >On Fri, Jan 30, 2015 at 1:41 PM, >> >> >>Jiangjie >> >> >> >>Qin >> >> >> >> >> > > > > > > > > >>> > >><j...@linkedin.com.invalid> >> >> >> >> >> > > > > > > > > >>> > >> >wrote: >> >> >> >> >> > > > > > > > > >>> > >> > >> >> >> >> >> > > > > > > > > >>> > >> >> Hi Bhavesh, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> Please see inline comments. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> On 1/29/15, 7:00 PM, "Bhavesh >> >>Mistry" >> >> >> >> >> > > > > > > > > >>><mistry.p.bhav...@gmail.com> >> >> >> >> >> > > > > > > > > >>> > >> >>wrote: >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> >Hi Jiangjie, >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks for the input. >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >a) Is MM will producer ack >>will >> >>be >> >> >> >>attach >> >> >> >> >>to >> >> >> >> >> > > > Producer >> >> >> >> >> > > > > > > > > >>>Instance or >> >> >> >> >> > > > > > > > > >>> > >>per >> >> >> >> >> > > > > > > > > >>> > >> >> >topic. Use case is that one >> >>instance >> >> >> >>of MM >> >> >> >> >> > > > > > > > > >>> > >> >> >needs to handle both strong ack >> >>and >> >> >>also >> >> >> >> >>ack=0 >> >> >> >> >> > for >> >> >> >> >> > > > some >> >> >> >> >> > > > > > > > topic. 
>> >> >> >> >> > > > > > > > > >>> Or >> >> >> >> >> > > > > > > > > >>> > >>it >> >> >> >> >> > > > > > > > > >>> > >> >> >would >> >> >> >> >> > > > > > > > > >>> > >> >> >be better to set-up another >> >>instance >> >> >>of >> >> >> >>MM. >> >> >> >> >> > > > > > > > > >>> > >> >> The acks setting is producer >>level >> >> >> >>setting >> >> >> >> >> > instead of >> >> >> >> >> > > > > > topic >> >> >> >> >> > > > > > > > > >>>level >> >> >> >> >> > > > > > > > > >>> > >> >>setting. >> >> >> >> >> > > > > > > > > >>> > >> >> In this case you probably need >>to >> >>set >> >> >>up >> >> >> >> >> another >> >> >> >> >> > > > > > instance. >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >b) Regarding TCP connections, >>Why >> >> >>does >> >> >> >> >> #producer >> >> >> >> >> > > > > > instance >> >> >> >> >> > > > > > > > > >>>attach >> >> >> >> >> > > > > > > > > >>> to >> >> >> >> >> > > > > > > > > >>> > >>TCP >> >> >> >> >> > > > > > > > > >>> > >> >> >connection. Is it possible to >>use >> >> >> >>Broker >> >> >> >> >> > > > Connection TCP >> >> >> >> >> > > > > > > > Pool, >> >> >> >> >> > > > > > > > > >>> > >>producer >> >> >> >> >> > > > > > > > > >>> > >> >> >will just checkout TCP >>connection >> >> to >> >> >> >> >>Broker. >> >> >> >> >> > So, >> >> >> >> >> > > > # of >> >> >> >> >> > > > > > > > > >>>Producer >> >> >> >> >> > > > > > > > > >>> > >> >>Instance >> >> >> >> >> > > > > > > > > >>> > >> >> >does not correlation to Brokers >> >> >> >>Connection. >> >> >> >> >> Is >> >> >> >> >> > this >> >> >> >> >> > > > > > > > possible >> >> >> >> >> > > > > > > > > >>>? >> >> >> >> >> > > > > > > > > >>> > >> >> In new producer, each producer >> >> >>maintains >> >> >> >>a >> >> >> >> >> > > > connection to >> >> >> >> >> > > > > > each >> >> >> >> >> > > > > > > > > >>> broker >> >> >> >> >> > > > > > > > > >>> > >> >> within the producer instance. >> >>Making >> >> >> >> >>producer >> >> >> >> >> > > > instances >> >> >> >> >> > > > > > to >> >> >> >> >> > > > > > > > > >>>share >> >> >> >> >> > > > > > > > > >>> the >> >> >> >> >> > > > > > > > > >>> > >>TCP >> >> >> >> >> > > > > > > > > >>> > >> >> connections is a very big >>change to >> >> >>the >> >> >> >> >>current >> >> >> >> >> > > > design, >> >> >> >> >> > > > > > so I >> >> >> >> >> > > > > > > > > >>> suppose >> >> >> >> >> > > > > > > > > >>> > >>we >> >> >> >> >> > > > > > > > > >>> > >> >> won’t be able to do that. 
>> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks, >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >Bhavesh >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >On Thu, Jan 29, 2015 at 11:50 >>AM, >> >> >> >>Jiangjie >> >> >> >> >>Qin >> >> >> >> >> > > > > > > > > >>> > >> >><j...@linkedin.com.invalid >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >wrote: >> >> >> >> >> > > > > > > > > >>> > >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> Hi Bhavesh, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> >> I think it is the right >> >>discussion >> >> >>to >> >> >> >> >>have >> >> >> >> >> > when >> >> >> >> >> > > > we are >> >> >> >> >> > > > > > > > > >>>talking >> >> >> >> >> > > > > > > > > >>> > >>about >> >> >> >> >> > > > > > > > > >>> > >> >>the >> >> >> >> >> > > > > > > > > >>> > >> >> >> new new design for MM. >> >> >> >> >> > > > > > > > > >>> > >> >> >> Please see the inline >>comments. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> >> Jiangjie (Becket) Qin >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> >> On 1/28/15, 10:48 PM, >>"Bhavesh >> >> >>Mistry" >> >> >> >> >> > > > > > > > > >>> > >><mistry.p.bhav...@gmail.com> >> >> >> >> >> > > > > > > > > >>> > >> >> >>wrote: >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >> >> >> >> > > > > > > > > >>> > >> >> >> >Hi Jiangjie, >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >I just wanted to let you >>know >> >> >>about >> >> >> >>our >> >> >> >> >>use >> >> >> >> >> > case >> >> >> >> >> > > > and >> >> >> >> >> > > > > > > > stress >> >> >> >> >> > > > > > > > > >>>the >> >> >> >> >> > > > > > > > > >>> > >> >>point >> >> >> >> >> > > > > > > > > >>> > >> >> >>that >> >> >> >> >> > > > > > > > > >>> > >> >> >> >local data center broker >> >>cluster >> >> >>have >> >> >> >> >>fewer >> >> >> >> >> > > > > > partitions >> >> >> >> >> > > > > > > > than >> >> >> >> >> > > > > > > > > >>>the >> >> >> >> >> > > > > > > > > >>> > >> >> >> >destination >> >> >> >> >> > > > > > > > > >>> > >> >> >> >offline broker cluster. Just >> >> >>because >> >> >> >>we >> >> >> >> >>do >> >> >> >> >> > the >> >> >> >> >> > > > batch >> >> >> >> >> > > > > > pull >> >> >> >> >> > > > > > > > > >>>from >> >> >> >> >> > > > > > > > > >>> > >>CAMUS >> >> >> >> >> > > > > > > > > >>> > >> >> >>and >> >> >> >> >> > > > > > > > > >>> > >> >> >> >in >> >> >> >> >> > > > > > > > > >>> > >> >> >> >order to drain data faster >>than >> >> >>the >> >> >> >> >> injection >> >> >> >> >> > > > rate >> >> >> >> >> > > > > > (from >> >> >> >> >> > > > > > > > > >>>four >> >> >> >> >> > > > > > > > > >>> DCs >> >> >> >> >> > > > > > > > > >>> > >> >>for >> >> >> >> >> > > > > > > > > >>> > >> >> >>same >> >> >> >> >> > > > > > > > > >>> > >> >> >> >topic). 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Keeping the same partition >> >>number >> >> >>in >> >> >> >> >>source >> >> >> >> >> > and >> >> >> >> >> > > > target >> >> >> >> >> > > > > > > > > >>>cluster >> >> >> >> >> > > > > > > > > >>> > >>will >> >> >> >> >> > > > > > > > > >>> > >> >>be >> >> >> >> >> > > > > > > > > >>> > >> >> >>an >> >> >> >> >> > > > > > > > > >>> > >> >> >> option but will not be >>enforced >> >>by >> >> >> >> >>default. >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >We are facing following >>issues >> >> >> >>(probably >> >> >> >> >> due >> >> >> >> >> > to >> >> >> >> >> > > > > > > > > >>>configuration): >> >> >> >> >> > > > > > > > > >>> > >> >> >> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >1) We occasionally >>loose >> >>data >> >> >> >>due >> >> >> >> >>to >> >> >> >> >> > message >> >> >> >> >> > > > > > batch >> >> >> >> >> > > > > > > > > >>>size is >> >> >> >> >> > > > > > > > > >>> > >>too >> >> >> >> >> > > > > > > > > >>> > >> >> >>large >> >> >> >> >> > > > > > > > > >>> > >> >> >> >(2MB) on target data (we are >> >>using >> >> >> >>old >> >> >> >> >> > producer >> >> >> >> >> > > > but I >> >> >> >> >> > > > > > > > think >> >> >> >> >> > > > > > > > > >>>new >> >> >> >> >> > > > > > > > > >>> > >> >> >>producer >> >> >> >> >> > > > > > > > > >>> > >> >> >> >will solve this problem to >>some >> >> >> >>extend). >> >> >> >> >> > > > > > > > > >>> > >> >> >> We do see this issue in >> >>LinkedIn as >> >> >> >>well. >> >> >> >> >> New >> >> >> >> >> > > > producer >> >> >> >> >> > > > > > > > also >> >> >> >> >> > > > > > > > > >>> might >> >> >> >> >> > > > > > > > > >>> > >> >>have >> >> >> >> >> > > > > > > > > >>> > >> >> >> this issue. There are some >> >> >>proposal of >> >> >> >> >> > solutions, >> >> >> >> >> > > > but >> >> >> >> >> > > > > > no >> >> >> >> >> > > > > > > > > >>>real >> >> >> >> >> > > > > > > > > >>> work >> >> >> >> >> > > > > > > > > >>> > >> >> >>started >> >> >> >> >> > > > > > > > > >>> > >> >> >> yet. For now, as a >>workaround, >> >> >> >>setting a >> >> >> >> >> more >> >> >> >> >> > > > > > aggressive >> >> >> >> >> > > > > > > > > >>>batch >> >> >> >> >> > > > > > > > > >>> > >>size >> >> >> >> >> > > > > > > > > >>> > >> >>on >> >> >> >> >> > > > > > > > > >>> > >> >> >> producer side should work. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >2) Since only one >> >>instance is >> >> >> >>set >> >> >> >> >>to >> >> >> >> >> MM >> >> >> >> >> > > > data, >> >> >> >> >> > > > > > we >> >> >> >> >> > > > > > > > are >> >> >> >> >> > > > > > > > > >>>not >> >> >> >> >> > > > > > > > > >>> > >>able >> >> >> >> >> > > > > > > > > >>> > >> >>to >> >> >> >> >> > > > > > > > > >>> > >> >> >> >set-up ack per topic instead >> >>ack >> >> >>is >> >> >> >> >> attached >> >> >> >> >> > to >> >> >> >> >> > > > > > producer >> >> >> >> >> > > > > > > > > >>> > >>instance. >> >> >> >> >> > > > > > > > > >>> > >> >> >> I don’t quite get the >>question >> >> >>here. 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >3) How are you going to >> >> >>address >> >> >> >>two >> >> >> >> >> > phase >> >> >> >> >> > > > commit >> >> >> >> >> > > > > > > > > >>>problem >> >> >> >> >> > > > > > > > > >>> if >> >> >> >> >> > > > > > > > > >>> > >> >>ack is >> >> >> >> >> > > > > > > > > >>> > >> >> >> >set >> >> >> >> >> > > > > > > > > >>> > >> >> >> >to strongest, but auto >>commit >> >>is >> >> >>on >> >> >> >>for >> >> >> >> >> > consumer >> >> >> >> >> > > > > > (meaning >> >> >> >> >> > > > > > > > > >>> > >>producer >> >> >> >> >> > > > > > > > > >>> > >> >>does >> >> >> >> >> > > > > > > > > >>> > >> >> >> >not >> >> >> >> >> > > > > > > > > >>> > >> >> >> >get ack, but consumer auto >> >> >>committed >> >> >> >> >> offset >> >> >> >> >> > that >> >> >> >> >> > > > > > > > message). >> >> >> >> >> > > > > > > > > >>> Is >> >> >> >> >> > > > > > > > > >>> > >> >>there >> >> >> >> >> > > > > > > > > >>> > >> >> >> >transactional (Kafka >> >>transaction >> >> >>is >> >> >> >>in >> >> >> >> >> > process) >> >> >> >> >> > > > > > based ack >> >> >> >> >> > > > > > > > > >>>and >> >> >> >> >> > > > > > > > > >>> > >>commit >> >> >> >> >> > > > > > > > > >>> > >> >> >> >offset >> >> >> >> >> > > > > > > > > >>> > >> >> >> >? >> >> >> >> >> > > > > > > > > >>> > >> >> >> Auto offset commit should be >> >>turned >> >> >> >>off >> >> >> >> >>in >> >> >> >> >> > this >> >> >> >> >> > > > case. >> >> >> >> >> > > > > > The >> >> >> >> >> > > > > > > > > >>>offset >> >> >> >> >> > > > > > > > > >>> > >>will >> >> >> >> >> > > > > > > > > >>> > >> >> >>only >> >> >> >> >> > > > > > > > > >>> > >> >> >> be committed once by the >>offset >> >> >>commit >> >> >> >> >> > thread. So >> >> >> >> >> > > > > > there is >> >> >> >> >> > > > > > > > > >>>no >> >> >> >> >> > > > > > > > > >>> two >> >> >> >> >> > > > > > > > > >>> > >> >>phase >> >> >> >> >> > > > > > > > > >>> > >> >> >> commit. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >4) How are you >>planning to >> >> >>avoid >> >> >> >> >> > duplicated >> >> >> >> >> > > > > > message? >> >> >> >> >> > > > > > > > > >>>( Is >> >> >> >> >> > > > > > > > > >>> > >> >> >> >brokergoing >> >> >> >> >> > > > > > > > > >>> > >> >> >> >have moving window of >>message >> >> >> >>collected >> >> >> >> >>and >> >> >> >> >> > > > de-dupe >> >> >> >> >> > > > > > ?) >> >> >> >> >> > > > > > > > > >>> > >>Possibly, we >> >> >> >> >> > > > > > > > > >>> > >> >> >>get >> >> >> >> >> > > > > > > > > >>> > >> >> >> >this from retry set to 5…? >> >> >> >> >> > > > > > > > > >>> > >> >> >> We are not trying to >>completely >> >> >>avoid >> >> >> >> >> > duplicates. >> >> >> >> >> > > > The >> >> >> >> >> > > > > > > > > >>>duplicates >> >> >> >> >> > > > > > > > > >>> > >>will >> >> >> >> >> > > > > > > > > >>> > >> >> >> still be there if: >> >> >> >> >> > > > > > > > > >>> > >> >> >> 1. Producer retries on >>failure. >> >> >> >> >> > > > > > > > > >>> > >> >> >> 2. Mirror maker is hard >>killed. >> >> >> >> >> > > > > > > > > >>> > >> >> >> Currently, dedup is expected >>to >> >>be >> >> >> >>done >> >> >> >> >>by >> >> >> >> >> > user if >> >> >> >> >> > > > > > > > > >>>necessary. 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >5) Last, is there any >> >> >>warning or >> >> >> >> >>any >> >> >> >> >> > thing >> >> >> >> >> > > > you >> >> >> >> >> > > > > > can >> >> >> >> >> > > > > > > > > >>>provide >> >> >> >> >> > > > > > > > > >>> > >> >>insight >> >> >> >> >> > > > > > > > > >>> > >> >> >> >from MM component about data >> >> >> >>injection >> >> >> >> >>rate >> >> >> >> >> > into >> >> >> >> >> > > > > > > > > >>>destination >> >> >> >> >> > > > > > > > > >>> > >> >> >>partitions is >> >> >> >> >> > > > > > > > > >>> > >> >> >> >NOT evenly distributed >> >>regardless >> >> >> of >> >> >> >> >> keyed >> >> >> >> >> > or >> >> >> >> >> > > > > > non-keyed >> >> >> >> >> > > > > > > > > >>> message >> >> >> >> >> > > > > > > > > >>> > >> >> >>(Hence >> >> >> >> >> > > > > > > > > >>> > >> >> >> >there is ripple effect such >>as >> >> >>data >> >> >> >>not >> >> >> >> >> > arriving >> >> >> >> >> > > > > > late, or >> >> >> >> >> > > > > > > > > >>>data >> >> >> >> >> > > > > > > > > >>> is >> >> >> >> >> > > > > > > > > >>> > >> >> >>arriving >> >> >> >> >> > > > > > > > > >>> > >> >> >> >out of order in intern of >>time >> >> >>stamp >> >> >> >> >>and >> >> >> >> >> > early >> >> >> >> >> > > > some >> >> >> >> >> > > > > > > > time, >> >> >> >> >> > > > > > > > > >>>and >> >> >> >> >> > > > > > > > > >>> > >> >>CAMUS >> >> >> >> >> > > > > > > > > >>> > >> >> >> >creates huge number of file >> >>count >> >> >>on >> >> >> >> >>HDFS >> >> >> >> >> > due to >> >> >> >> >> > > > > > uneven >> >> >> >> >> > > > > > > > > >>> injection >> >> >> >> >> > > > > > > > > >>> > >> >>rate >> >> >> >> >> > > > > > > > > >>> > >> >> >>. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >Camus Job is configured to >>run >> >> >> >>every 3 >> >> >> >> >> > minutes.) >> >> >> >> >> > > > > > > > > >>> > >> >> >> I think uneven data >> >>distribution is >> >> >> >> >> typically >> >> >> >> >> > > > caused >> >> >> >> >> > > > > > by >> >> >> >> >> > > > > > > > > >>>server >> >> >> >> >> > > > > > > > > >>> > >>side >> >> >> >> >> > > > > > > > > >>> > >> >> >> unbalance, instead of >>something >> >> >>mirror >> >> >> >> >>maker >> >> >> >> >> > could >> >> >> >> >> > > > > > > > control. >> >> >> >> >> > > > > > > > > >>>In >> >> >> >> >> > > > > > > > > >>> new >> >> >> >> >> > > > > > > > > >>> > >> >> >>mirror >> >> >> >> >> > > > > > > > > >>> > >> >> >> maker, however, there is a >> >> >> >>customizable >> >> >> >> >> > message >> >> >> >> >> > > > > > handler, >> >> >> >> >> > > > > > > > > >>>that >> >> >> >> >> > > > > > > > > >>> > >>might >> >> >> >> >> > > > > > > > > >>> > >> >>be >> >> >> >> >> > > > > > > > > >>> > >> >> >> able to help a little bit. In >> >> >>message >> >> >> >> >> handler, >> >> >> >> >> > > > you can >> >> >> >> >> > > > > > > > > >>> explicitly >> >> >> >> >> > > > > > > > > >>> > >> >>set a >> >> >> >> >> > > > > > > > > >>> > >> >> >> partition that you want to >> >>produce >> >> >>the >> >> >> >> >> message >> >> >> >> >> > > > to. So >> >> >> >> >> > > > > > if >> >> >> >> >> > > > > > > > you >> >> >> >> >> > > > > > > > > >>> know >> >> >> >> >> > > > > > > > > >>> > >>the >> >> >> >> >> > > > > > > > > >>> > >> >> >> uneven data distribution in >> >>target >> >> >> >> >>cluster, >> >> >> >> >> > you >> >> >> >> >> > > > may >> >> >> >> >> > > > > > offset >> >> >> >> >> > > > > > > > > >>>it >> >> >> >> >> > > > > > > > > >>> > >>here. 
> I am not sure if this is the right discussion forum to bring these to your/Kafka dev team's attention. This might be off track.
>
> Thanks,
>
> Bhavesh

On Wed, Jan 28, 2015 at 11:07 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

I’ve updated the KIP page. Feedback is welcome.

Regarding the simple mirror maker design, I thought it over and have some worries. There are two things that might be worth thinking about:
1. One of the enhancements to mirror maker is adding a message handler to do things like reformatting. I think we might potentially want to have more threads processing the messages than the number of consumers. If we follow the simple mirror maker solution, we lose this flexibility.
2. This might not matter too much, but creating more consumers means a bigger footprint in terms of TCP connections / memory.
Any thoughts on this?

Thanks.

Jiangjie (Becket) Qin

On 1/26/15, 10:35 AM, "Jiangjie Qin" <j...@linkedin.com> wrote:

Hi Jay and Neha,

Thanks a lot for the reply and explanation. I do agree it makes more sense to avoid duplicate effort and plan based on the new consumer. I’ll modify the KIP.

To Jay’s question on message ordering - the data channel selection makes sure that the messages from the same source partition will be sent by the same producer. So the order of the messages is guaranteed with proper producer settings (MaxInFlightRequests=1, retries=Integer.MaxValue, etc.)

For keyed messages, because they come from the same source partition and will end up in the same target partition, as long as they are sent by the same producer, the order is guaranteed.

For non-keyed messages, the messages coming from the same source partition might go to different target partitions. The order is only guaranteed within each partition.
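Written out as new-producer config keys, the settings referred to above look roughly like this; the broker address is a placeholder, and acks=all is an extra assumption in the same spirit rather than something stated above:

    import java.util.Properties

    // Producer settings that preserve per-source-partition ordering.
    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", "target-broker:9092")    // placeholder
    producerProps.put("max.in.flight.requests.per.connection", "1") // no pipelining, so retries cannot reorder
    producerProps.put("retries", Integer.MAX_VALUE.toString)        // retry until the send succeeds
    producerProps.put("acks", "all")                                // assumption: wait for the full ISR to acknowledge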
Anyway, I’ll modify the KIP and the data channel will go away.

Thanks.

Jiangjie (Becket) Qin

On 1/25/15, 4:34 PM, "Neha Narkhede" <n...@confluent.io> wrote:

I think there is some value in investigating if we can go back to the simple mirror maker design, as Jay points out. Here you have N threads, each of which has a consumer and a producer.

The reason why we had to move away from that was a combination of the difference in throughput between the consumer and the old producer and the deficiency of the consumer rebalancing that limits the total number of mirror maker threads.
So the only option available was to increase the throughput of the limited # of mirror maker threads that could be deployed. Now that queuing design may not make sense, given that the new producer's throughput is almost similar to the consumer's and that the new round-robin based consumer rebalancing can allow a very high number of mirror maker instances to exist.

This is the end state that the mirror maker should be in once the new consumer is complete, so it wouldn't hurt to see if we can just move to that right now.

On Fri, Jan 23, 2015 at 8:40 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

QQ: If we ever use a different technique for the data channel selection than for the producer partitioning, won't that break ordering?
How can we ensure these things stay in sync?

With respect to the new consumer--I really do want to encourage people to think through how MM will work with the new consumer. I mean this isn't very far off, maybe a few months if we hustle? I could imagine us getting this mm fix done maybe sooner, maybe in a month? So I guess this buys us an extra month before we rip it out and throw it away? Maybe two? This bug has been there for a while, though, right? Is it worth it? Probably it is, but it still kind of sucks to have the duplicate effort.

So anyhow let's definitely think about how things will work with the new consumer.
I think we can probably just have N threads, each thread has a producer and consumer and is internally single threaded. Any reason this wouldn't work?

-Jay
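A sketch of that per-thread loop, assuming the new consumer API this thread is anticipating plus a producer flush() call; the topic list, poll timeout, and (de)serializer settings are left to the caller and are illustrative only:

    import java.util.Properties
    import scala.collection.JavaConverters._
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    // Each mirror thread owns exactly one consumer and one producer and is
    // internally single threaded: consume, re-send, flush, then commit offsets.
    class MirrorThread(consumerProps: Properties,
                       producerProps: Properties,
                       topics: Seq[String]) extends Runnable {
      private val consumer = new KafkaConsumer[Array[Byte], Array[Byte]](consumerProps)
      private val producer = new KafkaProducer[Array[Byte], Array[Byte]](producerProps)

      override def run(): Unit = {
        consumer.subscribe(topics.asJava)
        while (true) {
          val records = consumer.poll(1000)
          for (record <- records.asScala) {
            // A message handler hook would go here; this just mirrors as-is.
            producer.send(new ProducerRecord[Array[Byte], Array[Byte]](record.topic, record.key, record.value))
          }
          producer.flush()      // make sure every send is acked...
          consumer.commitSync() // ...before the consumed offsets are committed
        }
      }
    }

Running N such threads gives the N single-threaded pipelines suggested above, without any shared queues.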
On Wed, Jan 21, 2015 at 5:29 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

Hi Jay,

Thanks for the comments. Please see inline responses.

Jiangjie (Becket) Qin

On 1/21/15, 1:33 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

> Hey guys,
>
> A couple questions/comments:
>
> 1. The callback and user-controlled commit offset functionality is already in the new consumer which we are working on in parallel. If we accelerated that work it might help concentrate efforts. I admit this might take slightly longer in calendar time but could still probably get done this quarter. Have you guys considered that approach?

Yes, I totally agree that ideally we should put efforts on the new consumer. The main reason for still working on the old consumer is that we expect it would still be used at LinkedIn for quite a while before the new consumer could be fully rolled out. And we have recently been suffering a lot from mirror maker data loss issues. So our current plan is making the necessary changes to make the current mirror maker stable in production. Then we can test and roll out the new consumer gradually without getting burnt.
> 2. I think partitioning on the hash of the topic partition is not a very good idea because that will make the case of going from a cluster with fewer partitions to one with more partitions not work. I think an intuitive way to do this would be the following:
> a. Default behavior: Just do what the producer does. I.e. if you specify a key use it for partitioning, if not just partition in a round-robin fashion.
> b. Add a --preserve-partition option that will explicitly inherit the partition from the source irrespective of whether there is a key or which partition that key would hash to.

Sorry that I did not explain this clearly enough. The hash of the topic partition is only used to decide which mirror maker data channel queue the consumer thread should put the message into. It only tries to make sure the messages from the same partition are sent by the same producer thread to guarantee the sending order. This is not at all related to which partition in the target cluster the messages end up in. That is still decided by the producer.
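As an illustration of the channel selection just described (this is not the actual mirror maker code), hashing the source (topic, partition) onto a queue index is enough to pin each source partition to one producer thread:

    // Illustration only: every message from a given source partition maps to
    // the same queue index, and therefore to the same producer thread.
    def selectDataChannelQueue(sourceTopic: String, sourcePartition: Int, numQueues: Int): Int = {
      val hash = (sourceTopic, sourcePartition).hashCode
      ((hash % numQueues) + numQueues) % numQueues // non-negative modulo
    }

Because numQueues is fixed for the life of the mirror maker, the mapping is stable, which is all the per-partition ordering guarantee needs.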
> 3. You don't actually give the ConsumerRebalanceListener interface. What is that going to look like?

Good point! I should have put it in the wiki. I just added it.
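The actual interface is the one now on the wiki; purely as an assumed shape for illustration, such a callback would flush the producer and commit offsets before the consumer gives up its partitions, so a rebalance does not leave unacked messages behind (the new consumer later exposes the same idea via onPartitionsRevoked):

    // Assumed shape for illustration only -- the real interface is the one on
    // the KIP wiki. Before releasing partitions, flush the producer and then
    // commit offsets, so nothing consumed from the released partitions is
    // still in flight when its offset is recorded.
    trait ConsumerRebalanceListener {
      def beforeReleasingPartitions(ownedTopicPartitions: java.util.Map[String, java.util.Set[java.lang.Integer]]): Unit
    }

    class FlushingRebalanceListener(flushProducer: () => Unit,
                                    commitOffsets: () => Unit) extends ConsumerRebalanceListener {
      override def beforeReleasingPartitions(ownedTopicPartitions: java.util.Map[String, java.util.Set[java.lang.Integer]]): Unit = {
        flushProducer()
        commitOffsets()
      }
    }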
> 4. What is MirrorMakerRecord? I think ideally the MirrorMakerMessageHandler interface would take a ConsumerRecord as input and return a ProducerRecord, right? That would allow you to transform the key, value, partition, or destination topic...

MirrorMakerRecord is introduced in KAFKA-1650, and is exactly the same as ConsumerRecord in KAFKA-1760.

    private[kafka] class MirrorMakerRecord(val sourceTopic: String,
                                           val sourcePartition: Int,
                                           val sourceOffset: Long,
                                           val key: Array[Byte],
                                           val value: Array[Byte]) {
      def size = value.length + {if (key == null) 0 else key.length}
    }

However, because the source partition and offset are needed in the producer thread for consumer offset bookkeeping, the record returned by MirrorMakerMessageHandler needs to contain that information. Therefore ProducerRecord does not work here. We could probably let the message handler take ConsumerRecord for both input and output.
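In other words, the handler would consume and produce the same record type so that the source coordinates survive the hop. A minimal sketch of that shape, built on the MirrorMakerRecord class above (the trait name follows the discussion; the exact signature is an assumption):

    // Sketch: same record type in and out, so sourcePartition/sourceOffset are
    // still available to the producer thread for offset bookkeeping.
    trait MirrorMakerMessageHandler {
      def handle(record: MirrorMakerRecord): java.util.List[MirrorMakerRecord]
    }

    // Identity handler: mirror every record unchanged.
    object DefaultMirrorMakerMessageHandler extends MirrorMakerMessageHandler {
      override def handle(record: MirrorMakerRecord): java.util.List[MirrorMakerRecord] =
        java.util.Collections.singletonList(record)
    }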
> 5. Have you guys thought about what the implementation will look like in terms of threading architecture etc with the new consumer? That will be soon so even if we aren't starting with that let's make sure we can get rid of a lot of the current mirror maker accidental complexity in terms of threads and queues when we move to that.

I haven’t thought about it thoroughly. The quick idea is that after migration to the new consumer, it is probably better to use a single consumer thread. If multithreading is needed, decoupling consumption and processing might be used.
MirrorMaker definitely needs to be changed after the new consumer gets checked in. I’ll document the changes and can submit follow-up patches after the new consumer is available.

> -Jay
>
> On Tue, Jan 20, 2015 at 4:31 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
>
>> Hi Kafka Devs,
>>
>> We are working on a Kafka Mirror Maker enhancement. A KIP is posted to document and discuss the following:
>> 1. KAFKA-1650: No data loss mirror maker change
>> 2. KAFKA-1839: To allow partition aware mirroring
>> 3. KAFKA-1840: To allow message filtering/format conversion
>> Feedback is welcome. Please let us know if you have any questions or concerns.
>>
>> Thanks.
>> Jiangjie (Becket) Qin