KAFKA-1650 added this option, but we don’t have --no.data.loss in any
official release.

On 2/26/15, 12:01 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:

>Did --no.data.loss exist in previous releases of MirrorMaker?
>If it does, maybe we want to keep it around for backward compatibility?
>(i.e. so existing deployments won't break)?
>
>Gwen
>
>On Thu, Feb 26, 2015 at 11:57 AM, Jiangjie Qin <j...@linkedin.com.invalid>
>wrote:
>
>> Hi Neha,
>>
>> Thanks for the comment. That’s a really good point.
>>
>> Originally I was thinking about allowing users to tweak some parameters as
>> needed.
>> For example, some users might want to have pipelining enabled and can
>> tolerate reordering, some might want to use acks=1 or acks=0, and some
>> might want to move forward when an error is encountered in the callback.
>> So we don’t want to enforce all the settings of no.data.loss. Meanwhile we
>> want to make life easier for the users who want no data loss so they
>> don’t need to set the configs one by one; therefore we created this
>> option.
>>
>> But as you suggested, we can probably make the no.data.loss settings the
>> default and remove the --no.data.loss option, so if people want to tweak
>> the settings, they can just change them; otherwise they get the default
>> no-data-loss settings.
>>
>> I’ll modify the KIP.
>>
>> Thanks.
>>
>> Jiangjie (Becket) Qin
>>
>> On 2/26/15, 8:58 AM, "Neha Narkhede" <n...@confluent.io> wrote:
>>
>> >Hey Becket,
>> >
>> >The KIP proposes addition of a --no.data.loss command line option to the
>> >MirrorMaker. Though when would the user not want that option? I'm
>> >wondering what the benefit of providing that option is if every user
>> >would want that for correct mirroring behavior.
>> >
>> >Other than that, the KIP looks great!
>> >
>> >Thanks,
>> >Neha
>> >
>> >On Wed, Feb 25, 2015 at 3:56 PM, Jiangjie Qin
>><j...@linkedin.com.invalid>
>> >wrote:
>> >
>> >> For 1), the current design allows you to do it. The customizable message
>> >> handler takes in a ConsumerRecord and spits out a List<ProducerRecord>;
>> >> you can just set a topic for the ProducerRecord that differs from the
>> >> ConsumerRecord's.
>> >>
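>> >> (Purely as an illustration - a minimal sketch of such a handler; the
>> >> MessageHandler trait below is a placeholder for whatever interface the
>> >> KIP ends up defining, not the actual API:)
>> >>
>> >> import java.util.Collections
>> >> import org.apache.kafka.clients.consumer.ConsumerRecord
>> >> import org.apache.kafka.clients.producer.ProducerRecord
>> >>
>> >> trait MessageHandler {   // placeholder for the interface the KIP would define
>> >>   def handle(rec: ConsumerRecord[Array[Byte], Array[Byte]])
>> >>       : java.util.List[ProducerRecord[Array[Byte], Array[Byte]]]
>> >> }
>> >>
>> >> // Routes records consumed from srcTopic to destTopic, leaving key/value untouched.
>> >> class TopicOverrideHandler(srcTopic: String, destTopic: String) extends MessageHandler {
>> >>   override def handle(rec: ConsumerRecord[Array[Byte], Array[Byte]]) = {
>> >>     val topic = if (rec.topic == srcTopic) destTopic else rec.topic
>> >>     Collections.singletonList(new ProducerRecord(topic, rec.key, rec.value))
>> >>   }
>> >> }
>> >>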
>> >> WRT performance, we did some tests at LinkedIn and the performance looks
>> >> good to us.
>> >>
>> >> Jiangjie (Becket) Qin
>> >>
>> >> On 2/25/15, 3:41 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com>
>> >>wrote:
>> >>
>> >> >Hi Jiangjie,
>> >> >
>> >> >It might be too late.  But I wanted to bring up the following use case
>> >> >for adopting the new MM:
>> >> >
>> >> >1) Ability to publish messages from a src topic to a different
>> >> >destination topic via --overidenTopics=srcTopic:newDestinationTopic
>> >> >
>> >> >In order to adopt the new MM enhancement, customers will compare the
>> >> >performance and data quality of the new MM while running the old MM
>> >> >against the same destination cluster in prod.
>> >> >
>> >> >Let me know if you agree to that or not.  Also, if yes, will you be
>> >> >able to provide this feature in the release version?
>> >> >
>> >> >Thanks,
>> >> >
>> >> >Bhavesh
>> >> >
>> >> >
>> >> >On Tue, Feb 24, 2015 at 5:31 PM, Jiangjie Qin
>> >><j...@linkedin.com.invalid>
>> >> >wrote:
>> >> >
>> >> >> Sure! Just created the voting thread :)
>> >> >>
>> >> >> On 2/24/15, 4:44 PM, "Jay Kreps" <j...@confluent.io> wrote:
>> >> >>
>> >> >> >Hey Jiangjie,
>> >> >> >
>> >> >> >Let's do an official vote so that we know what we are voting on and
>> >> >> >we are crisp on what the outcome was. This thread is very long :-)
>> >> >> >
>> >> >> >-Jay
>> >> >> >
>> >> >> >On Tue, Feb 24, 2015 at 2:53 PM, Jiangjie Qin
>> >> >><j...@linkedin.com.invalid>
>> >> >> >wrote:
>> >> >> >
>> >> >> >> I updated the KIP page based on the discussion we had.
>> >> >> >>
>> >> >> >> Should I launch another vote, or can we consider this mail thread
>> >> >> >> to have already included a vote?
>> >> >> >>
>> >> >> >> Jiangjie (Becket) Qin
>> >> >> >>
>> >> >> >> On 2/11/15, 5:15 PM, "Neha Narkhede" <n...@confluent.io> wrote:
>> >> >> >> >
>> >> >> >> >Thanks for the explanation, Joel! Would love to see the results of
>> >> >> >> >the throughput experiment and I'm a +1 on everything else,
>> >> >> >> >including the rebalance callback and record handler.
>> >> >> >> >
>> >> >> >> >-Neha
>> >> >> >> >
>> >> >> >> >On Wed, Feb 11, 2015 at 1:13 PM Jay Kreps <jay.kr...@gmail.com>
>> >> >> >> >wrote:
>> >> >> >> >
>> >> >> >> >> Cool, I agree with all that.
>> >> >> >> >>
>> >> >> >> >> I agree about the need for a rebalancing callback.
>> >> >> >> >>
>> >> >> >> >> Totally agree about record handler.
>> >> > >> >>
>> >> >> >> >> It would be great to see if a prototype of this is workable.
>> >> >> >> >>
>> >> >> >> >> Thanks guys!
>> >> >> >> >>
>> >> >> >> >> -Jay
>> >> >> >> >>
>> >> >> >> >> On Wed, Feb 11, 2015 at 12:36 PM, Joel Koshy <jjkosh...@gmail.com>
>> >> >> >> >> wrote:
>> >> >> >> >>
>> >> >> >> >> > Hey Jay,
>> >> >> >> >> >
>> >> >> >> >> > Guozhang, Becket and I got together to discuss this and we think:
>> >> >> >> >> >
>> >> >> >> >> > - It seems that your proposal based on the new consumer and flush
>> >> >> >> >> >   call should work.
>> >> >> >> >> > - We would likely need to call the poll with a timeout that
>> >> >> >> >> >   matches the offset commit interval in order to deal with low
>> >> >> >> >> >   volume mirroring pipelines.
>> >> >> >> >> > - We will still need a rebalance callback to reduce duplicates -
>> >> >> >> >> >   the rebalance callback would need to flush and commit offsets
>> >> >> >> >> >   (see the sketch after this list).
>> >> >> >> >> > - The only remaining question is if the overall throughput is
>> >> >> >> >> >   sufficient. I think someone at LinkedIn (I don't remember who)
>> >> >> >> >> >   did some experiments with data channel size == 1 and ran into
>> >> >> >> >> >   issues. That was not thoroughly investigated though.
>> >> >> >> >> > - The addition of flush may actually make this solution viable
>> >> >> >> >> >   for the current mirror-maker (with the old consumer). We can
>> >> >> >> >> >   prototype that offline and if it works out well we can redo
>> >> >> >> >> >   KAFKA-1650 (i.e., refactor the current mirror maker). The flush
>> >> >> >> >> >   call and the new consumer didn't exist at the time we did
>> >> >> >> >> >   KAFKA-1650 so this did not occur to us.
>> >> >> >> >> > - We think the RecordHandler is still a useful small addition for
>> >> >> >> >> >   the use-cases mentioned earlier in this thread.
>> >> >> >> >> >
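>> >> >> >> >> > (Purely illustrative - a minimal sketch of that callback, reusing
>> >> >> >> >> > the placeholder names from the loop sketch elsewhere in this
>> >> >> >> >> > thread; the real consumer callback API may differ:)
>> >> >> >> >> >
>> >> >> >> >> > def onPartitionsRevoked(partitions: java.util.Collection[TopicPartition]) {
>> >> >> >> >> >   producer.flush()   // make sure every record handed to the producer is acked
>> >> >> >> >> >   consumer.commit()  // only then commit, so we never commit past unacked data
>> >> >> >> >> > }
>> >> >> >> >> >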
>> >> >> >> >> > Thanks,
>> >> >> >> >> >
>> >> >> >> >> > Joel
>> >> >> >> >> >
>> >> >> >> >> > On Wed, Feb 11, 2015 at 09:05:39AM -0800, Jay Kreps wrote:
>> >> >> >> >> > > Guozhang, I agree with 1-3. I do think what I was proposing was
>> >> >> >> >> > > simpler, but perhaps there are gaps in that?
>> >> >> >> >> > >
>> >> >> >> >> > > Hey Joel--Here was a sketch of what I was proposing. I do think
>> >> >> >> >> > > this gets rid of manual offset tracking, especially doing so
>> >> >> >> >> > > across threads with dedicated commit threads, which I think is
>> >> >> >> >> > > pretty complex.
>> >> >> >> >> > >
>> >> >> >> >> > > while(true) {
>> >> >> >> >> > >     val recs = consumer.poll(Long.MaxValue);
>> >> >> >> >> > >     for (rec <- recs)
>> >> >> >> >> > >         producer.send(rec, logErrorCallback)
>> >> >> >> >> > >     if(System.currentTimeMillis - lastCommit > commitInterval) {
>> >> >> >> >> > >         producer.flush()
>> >> >> >> >> > >         consumer.commit()
>> >> >> >> >> > >         lastCommit = System.currentTimeMillis
>> >> >> >> >> > >     }
>> >> >> >> >> > > }
>> >> >> >> >> > >
>> >> >> >> >> > > (See the previous email for details). I think the question is:
>> >> >> >> >> > > is there any reason--performance, correctness, etc--that this
>> >> >> >> >> > > won't work? Basically I think you guys have thought about this
>> >> >> >> >> > > more so I may be missing something. If so let's flag it while we
>> >> >> >> >> > > still have leeway on the consumer.
>> >> >> >> >> > >
>> >> >> >> >> > > If we think that will work, well I do think it is conceptually
>> >> >> >> >> > > a lot simpler than the current code, though I suppose one could
>> >> >> >> >> > > disagree on that.
>> >> >> >> >> > >
>> >> >> >> >> > > -Jay
>> >> >> >> >> > >
>> >> >> >> >> > > On Wed, Feb 11, 2015 at 5:53 AM, Joel Koshy
>> >> >><jjkosh...@gmail.com
>> >> >> >
>> >> >> >> >> wrote:
>> >> >> >> >> > >
>> >> >> > >> > > > Hi Jay,
>> >> >> >> >> > > >
>> >> >> >> >> > > > > The data channels are actually a big part of the complexity
>> >> >> >> >> > > > > of the zero data loss design, though, right? Because then
>> >> >> >> >> > > > > you need some reverse channel to flow the acks back to the
>> >> >> >> >> > > > > consumer based on where you are versus just acking what you
>> >> >> >> >> > > > > have read and written (as in the code snippet I put up).
>> >> >> >> >> > > >
>> >> >> >> >> > > > I'm not sure if we are on the same page. Even if the data
>> >> >> >> >> > > > channel was not there, the current handling for zero data loss
>> >> >> >> >> > > > would remain very similar - you would need to maintain lists
>> >> >> >> >> > > > of unacked source offsets. I'm wondering if the KIP needs more
>> >> >> >> >> > > > detail on how it is currently implemented; or are you
>> >> >> >> >> > > > suggesting a different approach (in which case I have not
>> >> >> >> >> > > > fully understood). I'm not sure what you mean by flowing acks
>> >> >> >> >> > > > back to the consumer - the MM commits offsets after the
>> >> >> >> >> > > > producer ack has been received. There is some additional
>> >> >> >> >> > > > complexity introduced in reducing duplicates on a rebalance -
>> >> >> >> >> > > > this is actually optional (since duplicates are currently a
>> >> >> >> >> > > > given). The reason that was done anyway is that with the
>> >> >> >> >> > > > auto-commit turned off duplicates are almost guaranteed on a
>> >> >> >> >> > > > rebalance.
>> >> >> >> >> > > >
>> >> >> >> >> > > > > I think the point that Neha and I were trying to make was
>> >> >> >> >> > > > > that the motivation to embed stuff into MM kind of is
>> >> >> >> >> > > > > related to how complex a simple "consume and produce" with
>> >> >> >> >> > > > > good throughput will be. If it is simple to write such a
>> >> >> >> >> > > > > thing in a few lines, the pain of embedding a bunch of stuff
>> >> >> >> >> > > > > won't be worth it; if it has to be as complex as the current
>> >> >> >> >> > > > > mm then of course we will need all kinds of plug ins because
>> >> >> >> >> > > > > no one will be able to write such a thing. I don't have a
>> >> >> >> >> > > > > huge concern with a simple plug-in but I think if it turns
>> >> >> >> >> > > > > into something more complex with filtering and aggregation
>> >> >> >> >> > > > > or whatever we really need to stop and think a bit about the
>> >> >> >> >> > > > > design.
>> >> >> >> >> > > >
>> >> >> >> >> > > > I agree - I don't think there is a usecase for any complex
>> >> >> >> >> > > > plug-in. It is pretty much what Becket has described currently
>> >> >> >> >> > > > for the message handler - i.e., take an incoming record and
>> >> >> >> >> > > > return a list of outgoing records (which could be empty if you
>> >> >> >> >> > > > filter).
>> >> >> >> >> > > >
>> >> >> >> >> > > > So here is my take on the MM:
>> >> >> >> >> > > > - Bare bones: simple consumer - producer pairs (07 style).
>> >> >> >> >> > > >   This is ideal, but does not handle no data loss.
>> >> >> >> >> > > > - Above plus support no data loss. This actually adds quite a
>> >> >> >> >> > > >   bit of complexity.
>> >> >> >> >> > > > - Above plus the message handler. This is a trivial addition I
>> >> >> >> >> > > >   think that makes the MM usable in a few other mirroring-like
>> >> >> >> >> > > >   applications.
>> >> >> >> >> > > >
>> >> >> >> >> > > > Joel
>> >> >> >> >> > > >
>> >> >> >> >> > > > > On Tue, Feb 10, 2015 at 12:31 PM, Joel Koshy
>> >> >> >> >><jjkosh...@gmail.com>
>> >> >> >> >> > > > wrote:
>> >> >> >> >> > > > >
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > On Tue, Feb 10, 2015 at 12:13:46PM -0800, Neha Narkhede wrote:
>> >> >> >> >> > > > > > > I think all of us agree that we want to design
>> >> >> >> >> > > > > > > MirrorMaker for 0 data loss. With the absence of the
>> >> >> >> >> > > > > > > data channel, 0 data loss will be much simpler to
>> >> >> >> >> > > > > > > implement.
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > The data channel is irrelevant to the implementation of
>> >> >> >> >> > > > > > zero data loss. The complexity in the implementation of no
>> >> >> >> >> > > > > > data loss that you are seeing in mirror-maker affects all
>> >> >> >> >> > > > > > consume-then-produce patterns whether or not there is a
>> >> >> >> >> > > > > > data channel. You still need to maintain a list of unacked
>> >> >> >> >> > > > > > offsets. What I meant earlier is that we can brainstorm
>> >> >> >> >> > > > > > completely different approaches to supporting no data
>> >> >> >> >> > > > > > loss, but the current implementation is the only solution
>> >> >> >> >> > > > > > we are aware of.
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > >
>> >> >> >> >> > > > > > > My arguments for adding a message handler are that:
>> >> >> >> >> > > > > > > > 1. It is more efficient to do something in common for
>> >> >> >> >> > > > > > > > all the clients in the pipeline than letting each
>> >> >> >> >> > > > > > > > client do the same thing many times. And there are
>> >> >> >> >> > > > > > > > concrete use cases for the message handler already.
>> >> >> >> >> > > > > > > >
>> >> >> >>>> > > > > > >
>> >> >> >> >> > > > > > > What are the concrete use cases?
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > I think Becket already described a couple of use cases
>> >> >> >> >> > > > > > earlier in the thread.
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > <quote>
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > >> 1. Format conversion. We have a use case where
>> >>clients
>> >> >>of
>> >> >> >> >>source
>> >> >> >> >> > > > > > cluster
>> >> >> >> >> > > > > > use an internal schema and clients of target
>>cluster
>> >> >>use a
>> >> >> >> >> > different
>> >> >> >> >> > > > > > public schema.
>> >> >> >> >> > > > > > 2. Message filtering: For the messages published to the
>> >> >> >> >> > > > > > source cluster, there are some messages private to source
>> >> >> >> >> > > > > > cluster clients that should not be exposed to target
>> >> >> >> >> > > > > > cluster clients. It would be difficult to publish those
>> >> >> >> >> > > > > > messages into different partitions because they need to be
>> >> >> >> >> > > > > > ordered.
>> >> >> >> >> > > > > > I agree that we can always filter/convert messages after
>> >> >> >> >> > > > > > they are copied to the target cluster, but that costs
>> >> >> >> >> > > > > > network bandwidth unnecessarily, especially if that is a
>> >> >> >> >> > > > > > cross colo mirror. With the handler, we can co-locate the
>> >> >> >> >> > > > > > mirror maker with the source cluster and save that cost.
>> >> >> >> >> > > > > > Also, imagine there are many downstream consumers consuming
>> >> >> >> >> > > > > > from the target cluster; filtering/reformatting the
>> >> >> >> >> > > > > > messages before the messages reach the target cluster is
>> >> >> >> >> > > > > > much more efficient than having each of the consumers do
>> >> >> >> >> > > > > > this individually on their own.
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > </quote>
>> >> >> >> >> > > > > >
>> >> >> >> >> > > > > > >
>> >> >> >> >> > > > > > > Also the KIP still refers to the data channel in a few
>> >> >> >> >> > > > > > > places (Motivation and "On consumer rebalance" sections).
>> >> >> >> >> > > > > > > Can you update the wiki so it is easier to review the new
>> >> >> >> >> > > > > > > design, especially the data loss part.
>> >> >> >> >> > > > > > >
>> >> >> >> >> > > > > > >
>> >> >> >> > > > > > > > On Tue, Feb 10, 2015 at 10:36 AM, Joel Koshy <
>> >> >> >> >> > jjkosh...@gmail.com>
>> >> >> >> >> > > > > > wrote:
>> >> >> >> >> > > > > > >
>> >> >> >> >> > > > > > > > I think the message handler adds little to no
>> >> >> >> >> > > > > > > > complexity to the mirror maker. Jay/Neha, the MM became
>> >> >> >> >> > > > > > > > scary due to the rearchitecture we did for 0.8 due to
>> >> >> >> >> > > > > > > > performance issues compared with 0.7 - we should remove
>> >> >> >> >> > > > > > > > the data channel if it can match the current
>> >> >> >> >> > > > > > > > throughput. I agree it is worth prototyping and testing
>> >> >> >> >> > > > > > > > that so the MM architecture is simplified.
>> >> >> >> >> > > > > > >
>> >> >> >> >> > > > > > > > The MM became a little scarier in KAFKA-1650
>>in
>> >> >>order
>> > >> >>to
>> >> >> >> >> > support no
>> >> >> >> >> > > > > > > > data loss. I think the implementation for no
>>data
>> >> >>loss
>> >> >> >> >>will
>> >> >> >> >> > remain
>> >> >> >> >> > > > > > > > about the same even in the new model (even
>> >>without
>> >> >>the
>> >> >> >> >>data
>> >> >> >> >> > > > channel) -
>> >> >> >> >> > > > > > > > we can probably brainstorm more if there is a
>> >> >> >> >>better/simpler
>> >> >> >> >> > way
>> >> >> >> >> > > > to do
>> >> >> >> >> > > > > > > > it (maybe there is in the absence of the data
>> >> >>channel)
>> >> >> >> >>but at
>> >> >> >> >> > the
>> >> >> >> >> > > > time
>> >> >> >> >> > > > > > > > it was the best we (i.e., Becket, myself, Jun
>>and
>> >> >> >>Guozhang
>> >> >> >> >> who
>> >> >> >> >> > > > > > > > participated on the review) could come up
>>with.
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > So I'm definitely +1 on whatever it takes to support
>> >> >> >> >> > > > > > > > no data loss. I think most people would want that out
>> >> >> >> >> > > > > > > > of the box.
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > As for the message handler, as Becket wrote
>>and I
>> >> >>agree
>> >> >> >> >>with,
>> >> >> >> >> > it is
>> >> >> >> >> > > > > > > > really a trivial addition that would benefit
>> >> >>(perhaps
>> >> >> >>not
>> >> >> >> >> most,
>> >> >> >> >> > > > but at
>> >> >> >> >> > > > > > > > least some). So I'm personally +1 on that as
>> >>well.
>> >> >>That
>> >> >> >> >>said,
>> >> >> >> >> > I'm
>> >> >> >> >> > > > also
>> >> >> >> >> > > > > > > > okay with it not being there. I think the MM
>>is
>> >> >>fairly
>> >> >> >> >> > stand-alone
>> >> >> >> >> > > > and
>> >> >> >> >> > > > > > > > simple enough that it is entirely reasonable and
>> >> >> >> >> > > > > > > > absolutely feasible for companies to fork/re-implement
>> >> >> >> >> > > > > > > > the mirror maker for their own needs.
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > So in summary, I'm +1 on the KIP.
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > Thanks,
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > Joel
>> >> >> >> >> > > > > > > >
>> >> >> >> >> > > > > > > > On Mon, Feb 09, 2015 at 09:19:57PM +0000,
>> >>Jiangjie
>> >> >>Qin
>> >> >> >> >>wrote:
>> >> >> >> >> > > > > > > > > I just updated the KIP page and incorporated
>> >>Jay
>> >> >>and
>> >> >> >> >>Neha’s
>> >> >> >> >> > > > > > suggestion.
>> >> >> >> >> > > > > > > > As
>> >> >> >> >> > > > > > > > > a brief summary of where we are:
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > Consensus reached:
>> >> >> >> >> > > > > > > > > Have N independent mirror maker threads, each having
>> >> >> >> >> > > > > > > > > their own consumer but sharing a producer. The mirror
>> >> >> >> >> > > > > > > > > maker threads will be responsible for decompression,
>> >> >> >> >> > > > > > > > > compression and offset commit. No data channel and
>> >> >> >> >> > > > > > > > > separate offset commit thread is needed. Consumer
>> >> >> >> >> > > > > > > > > rebalance callback will be used to avoid duplicates
>> >> >> >> >> > > > > > > > > on rebalance.
>> >> >> >> >> > > > > > > > >
>> >> >>>> >> > > > > > > > > Still under discussion:
>> >> >> >> >> > > > > > > > > Whether message handler is needed.
>> >> >> >> >> > >> > > > > >
>> >> >> >> >> > > > > > > > > My arguments for adding a message handler
>>are
>> >> >>that:
>> >> >> >> >> > > > > > > > > 1. It is more efficient to do something in
>> >>common
>> >> >>for
>> >> >> >> >>all
>> >> >> >> >> the
>> >> >> >> >> > > > > > clients in
>> >> >> >> >> > > > > > > > > pipeline than letting each client do the
>>same
>> >> >>thing
>> >> >> >>for
>> >> >> >> >> many
>> >> >> >> >> > > > times
>> >> >> >> >> > > > > > And
>> >> >> >> >> > > > > > > > > there are concrete use cases for the message
>> >> >>handler
>> >> >> >> >> already.
>> >> >> >> >> > > > > > > > > 2. It is not a big complicated add-on to
>>mirror
>> >> >> >>maker.
>> >> >> >> >> > > > > > > > > 3. Without a message handler, customers who need it
>> >> >> >> >> > > > > > > > > have to re-implement all the logic of mirror maker by
>> >> >> >> >> > > > > > > > > themselves just in order to add this handling in the
>> >> >> >> >> > > > > > > > > pipeline.
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > Any thoughts?
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > Thanks.
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > --Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > On 2/8/15, 6:35 PM, "Jiangjie Qin" <j...@linkedin.com>
>> >> >> >> >> > > > > > > > > wrote:
>> >> >> >> >> > > > > > > > >
>> >> >> >> >> > > > > > > > > >Hi Jay, thanks a lot for the comments.
>> >> >> >> >> > > > > > > > > >I think this solution is better. We probably don’t
>> >> >> >> >> > > > > > > > > >need the data channel anymore. It can be replaced
>> >> >> >> >> > > > > > > > > >with a list of producers if we need more sender
>> >> >> >> >> > > > > > > > > >threads. I’ll update the KIP page.
>> >> >> >> >> > > > > > > > > >
>> >> >> >> >> > > > > > > > > >The reasoning about the message handler is mainly
>> >> >> >> >> > > > > > > > > >for efficiency purposes. I’m thinking that if
>> >> >> >> >> > > > > > > > > >something can be done in the pipeline for all the
>> >> >> >> >> > > > > > > > > >clients, such as filtering/reformatting, it is
>> >> >> >> >> > > > > > > > > >probably better to do it in the pipeline than asking
>> >> >> >> >> > > > > > > > > >100 clients to do the same thing 100 times.
>> >> >> >> >> > > > > > > > > >
>> >> >> >> >> > > > > > > > > >--Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > > >
>> >> >> >> >> > > > > > > > > >
>> >> >> >> >> > > > > > > > > >On 2/8/15, 4:59 PM, "Jay Kreps" <jay.kr...@gmail.com>
>> >> >> >> >> > > > > > > > > >wrote:
>> >> >> >> >> > > > > > > > > >
>> >> >> >> >> > > > > > > > > >>Yeah, I second Neha's comments. The current mm code
>> >> >> >> >> > > > > > > > > >>has taken something pretty simple and made it
>> >> >> >> >> > > > > > > > > >>pretty scary with callbacks and wait/notify stuff.
>> >> >> >> >> > > > > > > > > >>Do we believe this works? I can't tell by looking
>> >> >> >> >> > > > > > > > > >>at it, which is kind of bad for something important
>> >> >> >> >> > > > > > > > > >>like this. I don't mean this as criticism, I know
>> >> >> >> >> > > > > > > > > >>the history: we added in-memory queues to help with
>> >> >> >> >> > > > > > > > > >>other performance problems without thinking about
>> >> >> >> >> > > > > > > > > >>correctness, then we added stuff to work around the
>> >> >> >> >> > > > > > > > > >>in-memory queues so we would not lose data, and so
>> >> >> >> >> > > > > > > > > >>on.
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>Can we instead do the opposite exercise and start
>> >> >> >> >> > > > > > > > > >>with the basics of what mm should do and think
>> >> >> >> >> > > > > > > > > >>about what deficiencies prevent this approach from
>> >> >> >> >> > > > > > > > > >>working? Then let's make sure the currently
>> >> >> >> >> > > > > > > > > >>in-flight work will remove these deficiencies.
>> >> >> >> >> > > > > > > > > >>After all mm is kind of the prototypical kafka use
>> >> >> >> >> > > > > > > > > >>case so if we can't make our clients do this
>> >> >> >> >> > > > > > > > > >>probably no one else can.
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>I think mm should just be N independent
>> >>threads
>> >> >> >>each
>> >> >> >> >>of
>> >> >> >> >> > which
>> >> >> >> >> > > > has
>> >> >> >> >> > > > > > their
>> >> >> >> >> > > > > > > > > >>own
>> >> >> >> >> > > > > > > > > >>consumer but share a producer and each of
>> >>which
>> >> >> >>looks
>> >> >> >> >> like
>> >> >> >> >> > > > this:
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>while(true) {
>> >> >> >> >> > > > > > > > > >>    val recs = consumer.poll(Long.MaxValue);
>> >> >> >> >> > > > > > > > > >>    for (rec <- recs)
>> >> >> >> >> > > > > > > > > >>        producer.send(rec, logErrorCallback)
>> >> >> >> >> > > > > > > > > >>    if(System.currentTimeMillis - lastCommit > commitInterval) {
>> >> >> >> >> > > > > > > > > >>        producer.flush()
>> >> >> >> >> > > > > > > > > >>        consumer.commit()
>> >> >> >> >> > > > > > > > > >>        lastCommit = System.currentTimeMillis
>> >> >> >> >> > > > > > > > > >>    }
>> >> >> >> >> > > > > > > > > >>}
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>This will depend on setting the retry count in the
>> >> >> >> >> > > > > > > > > >>producer to something high with a largish backoff
>> >> >> >> >> > > > > > > > > >>so that a failed send attempt doesn't drop data.
>> >> >> >> >> > > > > > > > > >>
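>> >> >> >> >> > > > > > > > > >>(As a rough illustration of the producer settings
>> >> >> >> >> > > > > > > > > >>that implies - the property names are from the new
>> >> >> >> >> > > > > > > > > >>producer, and the values are placeholders rather
>> >> >> >> >> > > > > > > > > >>than a recommendation:)
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>val props = new java.util.Properties()
>> >> >> >> >> > > > > > > > > >>props.put("acks", "all")                                 // wait for the full ISR
>> >> >> >> >> > > > > > > > > >>props.put("retries", Int.MaxValue.toString)              // effectively retry forever
>> >> >> >> >> > > > > > > > > >>props.put("retry.backoff.ms", "1000")                    // largish backoff
>> >> >> >> >> > > > > > > > > >>props.put("max.in.flight.requests.per.connection", "1")  // avoid reordering on retry
>> >> >> >> >> > > > > > > > > >>props.put("block.on.buffer.full", "true")                // back-pressure instead of dropping
>> >> >> >> >> > > > > > > > > >>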
>> >> >> >> >> > > > > > > > > >>We will need to use the callback to force a flush
>> >> >> >> >> > > > > > > > > >>and offset commit on rebalance.
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>This approach may have a few more TCP connections
>> >> >> >> >> > > > > > > > > >>due to using multiple consumers but I think it is a
>> >> >> >> >> > > > > > > > > >>lot easier to reason about and the total number of
>> >> >> >> >> > > > > > > > > >>mm instances is always going to be small.
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>Let's talk about where this simple
>>approach
>> >> >>falls
>> >> >> >> >>short,
>> >> >> >> >> I
>> >> >> >> >> > > > think
>> >> >> >> >> > > > > > that
>> >> >> >> >> >  > > > > > > >>will
>> >> >> >> >> > > > > > > > > >>help us understand your motivations for
>> >> >>additional
>> >> >> >> >> > elements.
>> >> >> >> >>  > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>Another advantage of this is that it is so simple
>> >> >> >> >> > > > > > > > > >>I don't think we really even need to bother making
>> >> >> >> >> > > > > > > > > >>mm extensible, because writing your own code that
>> >> >> >> >> > > > > > > > > >>does custom processing or transformation is just
>> >> >> >> >> > > > > > > > > >>ten lines and no plug in system is going to make it
>> >> >> >> >> > > > > > > > > >>simpler.
>> >> >> >> >> > > > > > > > > >>
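>> >> >> >> >> > > > > > > > > >>(For example, the kind of inline customization
>> >> >> >> >> > > > > > > > > >>meant here - purely illustrative, reusing the names
>> >> >> >> >> > > > > > > > > >>from the loop above; the ".private" topic
>> >> >> >> >> > > > > > > > > >>convention is made up:)
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>    for (rec <- recs)
>> >> >> >> >> > > > > > > > > >>        if (!rec.topic.endsWith(".private"))   // skip records we don't want to mirror
>> >> >> >> >> > > > > > > > > >>            producer.send(new ProducerRecord(rec.topic, rec.key, rec.value),
>> >> >> >> >> > > > > > > > > >>                          logErrorCallback)
>> >> >> >> >> > > > > > > > > >>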
>> >> >> >> >> > > > > > > > > >>-Jay
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>On Sun, Feb 8, 2015 at 2:40 PM, Neha
>> >>Narkhede <
>> >> >> >> >> > > > n...@confluent.io>
>> >> >> >> >> > > > > > > > wrote:
>> >> >> >> >> > > > > > > > > >>
>> >> >> >> >> > > > > > > > > >>> Few comments -
>> >> >> >> >> > > > > > > > > >>>
>> >> >> >> >> > > > > > > > > >>> 1. Why do we need the message handler?
>>Do
>> >>you
>> >> >> >>have
>> >> >> >> >> > concrete
>> >> >> >> >> > > > use
>> >> >> >> >> > > > > > cases
>> >> >> >> >> > > > > > > > > >>>in
>> >> >> >> >> > > > > > > > > >>> mind? If not, we should consider adding
>>it
>> >>in
>> >> >>the
>> >> >> >> >> future
>> >> >> >> >> > > > when/if
>> >> >> >> >> > > > > > we
>> >> >> >> >> > > > > > > > do
>> >> >> >> >> > > > > > > > > >>>have
>> >> >> >> >> > > > > > > > > >>> use cases for it. The purpose of the
>>mirror
>> >> >>maker
>> >> >> >> >>is a
>> >> >> >> >> > simple
>> >> >> >> >> > > > > > tool
>> >> >> >> >> > > > > > > > for
>> >> >> >> >> > > > > > > > > >>> setting up Kafka cluster replicas. I
>>don't
>> >>see
>> >> >> >>why
>> >> >> >> >>we
>> >> >> >> >> > need to
>> >> >> >> >> > > > > > > > include a
>> >> >> >> >> > > > > > > > > >>> message handler for doing stream
>> >> >>transformations
>> >> >> >>or
>> >> >> >> >> > > > filtering.
>> >> >> >> >> > > > > > You
>> >> >> >> >> > > > > > > > can
>> >> >> >> >> > > > > > > > > >>> always write a simple process for doing
>> >>that
>> >> >>once
>> >> >> >> >>the
>> >> >> >> >> > data is
>> >> >> >> >> > > > > > copied
>> >> >> >> >> > > > > > > > as
>> >> >> >> >> > > > > > > > > >>>is
>> >> >> >> >> > > > > > > > > >>> in the target cluster
>> >> >> >> >> > > > > > > > > >>> 2. Why keep both designs? We should 
>>prefer
>> >>the
>> >> >> >> >>simpler
>> >> >> >> >> > design
>> >> >> >> >> > > > > > unless
>> >> >> >> >> > > > > > > > it
>> >> >> >> >> > > > > > > > > >>>is
>> >> >> >> >> > > > > > > > > >>> not feasible due to the performance 
>>issue
>> >> >>that we
>> >> >> >> >> > previously
>> >> >> >> >> > > > > > had. Did
>> >> >> >> >> > > > > > > > > >>>you
>> >> >> >> >> > > > > > > > > >>> get a chance to run some tests to see if
>> >>that
>> >> >>is
>> >> >> >> >>really
>> >> >> >> >> > > > still a
>> >> >> >> >> > > > > > > > problem
>> >> >> >> >> > > > > > > > > >>>or
>> >> >> >> >> > > > > > > > > >>> not? It will be easier to think about 
>>the
>> >> >>design
>> >> >> >>and
>> >> >> >> >> also
>> >> >> >> >> > > > make
>> >> >> >> >> > > > > > the
>> >> >> >> >> > > > > > > > KIP
>> >> >> >> >> > > > > > > > > >>> complete if we make a call on the design
>> >> >>first.
>> >> >> >> >> > > > > > > > > >>> 3. Can you explain the need for keeping 
>>a
>> >> >>list of
>> >> >> >> >> unacked
>> >> >> >> >> > > > > > offsets per
>> >> >> >> >> > > > > > > > > >>> partition? Consider adding a section on
>> >> >>retries
>> >> >> >>and
>> >> >> >> >>how
>> >> >> >> >> > you
>> >> >> >> >> > > > plan
>> >> >> >> >> > > > > > to
>> >> >> >> >> > > > > > > > > >>>handle
>> >> >> >> >> > > > > > > > > >>> the case when the producer runs out of 
>>all
>> >> >> >>retries.
>> >> >> >> >> > > > > > > > > >>>
>> >> >> >> >> > > > > > > > > >>> Thanks,
>> >> >> >> >> > > > > > > > > >>> Neha
>> >> >> >> >> > > > > > > > > >>>
>> >> >> >> >> > > > > > > > > >>> On Sun, Feb 8, 2015 at 2:06 PM, Jiangjie
>> >>Qin
>> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid>
>> >> >> >> >> > > > > > > > > >>> wrote:
>> >> >> >> >> > > > > > > > > >>>
>> >> >> >> >> > > > > > > > > >>> > Hi Neha,
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > Yes, I’ve updated the KIP so the 
>>entire
>> >>KIP
>> >> >>is
>> >> >> >> >>based
>> >> >> >> >> > on new
>> >> >> >> >> > > > > > > > consumer
>> >> >> >> >> > > > > > > > > >>>now.
>> >> >> >> >> > > > > > > > > >>> > I’ve put both designs with and without
>> >>data
>> >> >> >> >>channel
>> >> >> >> >> in
>> >> >> >> >> > the
>> >> >> >> >> > > > KIP
>> >> >> >> >> > > > > > as I
>> >> >> >> >> > > > > > > > > >>>still
>> >> >> >> >> > > > > > > > > >>> > feel we might need the data channel to
>> >> >>provide
>> >> >> >> >>more
>> >> >> >> >> > > > > > flexibility,
>> >> >> >> >> > > > > > > > > >>> > especially after message handler is
>> >> >>introduced.
>> >> >> >> >>I’ve
>> >> >> >> >> > put my
>> >> >> >> >> > > > > > > > thinking
>> >> >> >> >> > > > > > > > > >>>of
>> >> >> >> >> > > > > > > > > >>> > the pros and cons of the two designs 
>>in
>> >>the
>> >> >> >>KIP as
>> >> >> >> >> > well.
>> >> >> >> >> > > > It’ll
>> >> >> >> >> > > > > > be
>> >> >> >> >> > > > > > > > > >>>great
>> >> >> >> >> > > > > > > > > >>> if
>> >> >> >> >> > > > > > > > > >>> > you can give a review and comment.
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > Thanks.
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > On 2/6/15, 7:30 PM, "Neha Narkhede" <
>> >> >> >> >> n...@confluent.io
>> >> >> >> >> > >
>> >> >> >> >> > > > wrote:
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > >Hey Becket,
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >What are the next steps on this KIP. 
>>As
>> >>per
>> >> >> >>your
>> >> >> >> >> > comment
>> >> >> >> >> > > > > > earlier
>> >> >> >> >> > > > > > > > on
>> >> >> >> >> > > > > > > > > >>>the
>> >> >> >> >> > > > > > > > > >>> > >thread -
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >I do agree it makes more sense
>> >> >> >> >> > > > > > > > > >>> > >> to avoid duplicate effort and plan
>> >>based
>> >> >>on
>> >> >> >>new
>> >> >> >> >> > > > consumer.
>> >> >> >> >> > > > > > I’ll
>> >> >> >> >> > > > > > > > > >>>modify
>> >> >> >> >> > > > > > > > > >>> > >>the
>> >> >> >> >> > > > > > > > > >>> > >> KIP.
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >Did you get a chance to think about 
>>the
>> >> >> >> >>simplified
>> >> >> >> >> > design
>> >> >> >> >> > > > > > that we
>> >> >> >> >> > > > > > > > > >>> proposed
>> >> >> >> >> > > > > > > > > >>> > >earlier? Do you plan to update the 
>>KIP
>> >>with
>> >> >> >>that
>> >> >> >> >> > proposal?
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >Thanks,
>> >> >> >> >> > > > > > > > > >>> > >Neha
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >On Wed, Feb 4, 2015 at 12:12 PM,
>> >>Jiangjie
>> >> >>Qin
>> >> >> >> >> > > > > > > > > >>><j...@linkedin.com.invalid
>> >> >> >> >> > > > > > > > > >>> >
>> >> >> >> >> > > > > > > > > >>> > >wrote:
>> >> >> >> >> > > > > > > > > >>> > >
>> >> >> >> >> > > > > > > > > >>> > >> In mirror maker we do not do
>> >> >> >>de-serialization
>> >> >> >> >>on
>> >> >> >> >> the
>> >> >> >> >> > > > > > messages.
>> >> >> >> >> > > > > > > > > >>>Mirror
>> >> >> >> >> > > > > > > > > >>> > >> maker use source TopicPartition 
>>hash
>> >>to
>> >> >> >>chose a
>> >> >> >> >> > > > producer to
>> >> >> >> >> > > > > > send
>> >> >> >> >> > > > > > > > > >>> > >>messages
>> >> >> >> >> > > > > > > > > >>> > >> from the same source partition. The
>> >> >> >>partition
>> >> >> >> >> those
>> >> >> >> >> > > > > > messages end
>> >> >> >> >> > > > > > > > > >>>up
>> >> >> >> >> > > > > > > > > >>> with
>> >> >> >> >> > > > > > > > > >>> > >> are decided by Partitioner class in
>> >> >> >> >>KafkaProducer
>> >> >> >> >> > > > (assuming
>> >> >> >> >> > > > > > you
>> >> >> >> >> > > > > > > > > >>>are
>> >> >> >> >> > > > > > > > > >>> > >>using
>> >> >> >> >> > > > > > > > > >>> > >> the new producer), which uses hash
>> >>code
>> >> >>of
>> >> >> >> >> bytes[].
>> >> >> >> >> > > > > > > > > >>> > >>
>> >> >> >> >> > > > > > > > > >>> > >> If deserialization is needed, it 
>>has
>> >>to
>> >> >>be
>> >> >> >> >>done in
>> >> >> >> >> > > > message
>> >> >> >> >> > > > > > > > > >>>handler.
>> >> >> >> >> > > > > > > > > >>> > >>
>> >> >> >> >> > > > > > > > > >>> > >> Thanks.
>> >> >> >> >> > > > > > > > > >>> > >>
>> >> >> >> >> > > > > > > > > >>> > >> Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > > >>> > >>
>> >> >> >> >> > > > > > > > > >>> > >> On 2/4/15, 11:33 AM, "Bhavesh 
>>Mistry"
>> >><
>> >> >> >> >> > > > > > > > mistry.p.bhav...@gmail.com>
>> >> >> >> >> > > > > > > > > >>> > >>wrote:
>> >> >> >> >> > > > > > > > > >>> > >>
>> >> >> >> >> > > > > > > > > >>> > >> >Hi Jiangjie,
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >Thanks for entertaining my 
>>question
>> >>so
>> >> >>far.
>> >> >> >> >>Last
>> >> >> >> >> > > > > > question, I
>> >> >> >> >> > > > > > > > > >>>have is
>> >> >> >> >> > > > > > > > > >>> > >> >about
>> >> >> >> >> > > > > > > > > >>> > >> >serialization of message key.  If 
>>the
>> >> >>key
>> >> >> >> >> > > > de-serialization
>> >> >> >> >> > > > > > > > > >>>(Class) is
>> >> >> >> >> > > > > > > > > >>> > >>not
>> >> >> >> >> > > > > > > > > >>> > >> >present at the MM instance, then
>> >>does it
>> >> >> >>use
>> >> >> >> >>raw
>> >> >> >> >> > byte
>> >> >> >> >> > > > > > hashcode
>> >> >> >> >> > > > > > > > to
>> >> >> >> >> > > > > > > > > >>> > >> >determine
>> >> >> >> >> > > > > > > > > >>> > >> >the partition ?  How are you 
>>going to
>> >> >> >>address
>> >> >> >> >>the
>> >> >> >> >> > > > situation
>> >> >> >> >> > > > > > > > where
>> >> >> >> >> > > > > > > > > >>>key
>> >> >> >> >> > > > > > > > > >>> > >> >needs
>> >> >> >> >> > > > > > > > > >>> > >> >to be de-serialization and get 
>>actual
>> >> >> >>hashcode
>> >> >> >> >> > needs
>> >> >> >> >> > > > to be
>> >> >> >> >> > > > > > > > > >>>computed
>> >> >> >> >> > > > > > > > > >>> ?.
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >Thanks,
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >Bhavesh
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >On Fri, Jan 30, 2015 at 1:41 PM,
>> >> >>Jiangjie
>> >> >> >>Qin
>> >> >> >> >> > > > > > > > > >>> > >><j...@linkedin.com.invalid>
>> >> >> >> >> > > > > > > > > >>> > >> >wrote:
>> >> >> >> >> > > > > > > > > >>> > >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> Hi Bhavesh,
>> >> >> >> >> > > > > > > > > >>> > >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> Please see inline comments.
>> >> >> >> >> > > > > > > > > >>> > >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > > >>> > >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> On 1/29/15, 7:00 PM, "Bhavesh
>> >>Mistry"
>> >> >> >> >> > > > > > > > > >>><mistry.p.bhav...@gmail.com>
>> >> >> >> >> > > > > > > > > >>> > >> >>wrote:
>> >> >> >> >> > > > > > > > > >>> > >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> >Hi Jiangjie,
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks for the input.
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >a) Is MM will  producer ack 
>>will
>> >>be
>> >> >> >>attach
>> >> >> >> >>to
>> >> >> >> >> > > > Producer
>> >> >> >> >> > > > > > > > > >>>Instance or
>> >> >> >> >> > > > > > > > > >>> > >>per
>> >> >> >> >> > > > > > > > > >>> > >> >> >topic.  Use case is that one
>> >>instance
>> >> >> >>of MM
>> >> >> >> >> > > > > > > > > >>> > >> >> >needs to handle both strong ack
>> >>and
>> >> >>also
>> >> >> >> >>ack=0
>> >> >> >> >> > for
>> >> >> >> >> > > > some
>> >> >> >> >> > > > > > > > topic.
>> >> >> >> >> > > > > > > > > >>> Or
>> >> >> >> >> > > > > > > > > >>> > >>it
>> >> >> >> >> > > > > > > > > >>> > >> >> >would
>> >> >> >> >> > > > > > > > > >>> > >> >> >be better to set-up another
>> >>instance
>> >> >>of
>> >> >> >>MM.
>> >> >> >> >> > > > > > > > > >>> > >> >> The acks setting is producer 
>>level
>> >> >> >>setting
>> >> >> >> >> > instead of
>> >> >> >> >> > > > > > topic
>> >> >> >> >> > > > > > > > > >>>level
>> >> >> >> >> > > > > > > > > >>> > >> >>setting.
>> >> >> >> >> > > > > > > > > >>> > >> >> In this case you probably need 
>>to
>> >>set
>> >> >>up
>> >> >> >> >> another
>> >> >> >> >> > > > > > instance.
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >b) Regarding TCP connections, 
>>Why
>> >> >>does
>> >> >> >> >> #producer
>> >> >> >> >> > > > > > instance
>> >> >> >> >> > > > > > > > > >>>attach
>> >> >> >> >> > > > > > > > > >>> to
>> >> >> >> >> > > > > > > > > >>> > >>TCP
>> >> >> >> >> > > > > > > > > >>> > >> >> >connection.  Is it possible to 
>>use
>> >> >> >>Broker
>> >> >> >> >> > > > Connection TCP
>> >> >> >> >> > > > > > > > Pool,
>> >> >> >> >> > > > > > > > > >>> > >>producer
>> >> >> >> >> > > > > > > > > >>> > >> >> >will just checkout TCP 
>>connection
>> >> to
>> >> >> >> >>Broker.
>> >> >> >> >> > So,
>> >> >> >> >> > > > # of
>> >> >> >> >> > > > > > > > > >>>Producer
>> >> >> >> >> > > > > > > > > >>> > >> >>Instance
>> >> >> >> >> > > > > > > > > >>> > >> >> >does not correlation to Brokers
>> >> >> >>Connection.
>> >> >> >> >> Is
>> >> >> >> >> > this
>> >> >> >> >> > > > > > > > possible
>> >> >> >> >> > > > > > > > > >>>?
>> >> >> >> >> > > > > > > > > >>> > >> >> In new producer, each producer
>> >> >>maintains
>> >> >> >>a
>> >> >> >> >> > > > connection to
>> >> >> >> >> > > > > > each
>> >> >> >> >> > > > > > > > > >>> broker
>> >> >> >> >> > > > > > > > > >>> > >> >> within the producer instance.
>> >>Making
>> >> >> >> >>producer
>> >> >> >> >> > > > instances
>> >> >> >> >> > > > > > to
>> >> >> >> >> > > > > > > > > >>>share
>> >> >> >> >> > > > > > > > > >>> the
>> >> >> >> >> > > > > > > > > >>> > >>TCP
>> >> >> >> >> > > > > > > > > >>> > >> >> connections is a very big 
>>change to
>> >> >>the
>> >> >> >> >>current
>> >> >> >> >> > > > design,
>> >> >> >> >> > > > > > so I
>> >> >> >> >> > > > > > > > > >>> suppose
>> >> >> >> >> > > > > > > > > >>> > >>we
>> >> >> >> >> > > > > > > > > >>> > >> >> won’t be able to do that.
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >Thanks,
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >Bhavesh
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >On Thu, Jan 29, 2015 at 11:50 
>>AM,
>> >> >> >>Jiangjie
>> >> >> >> >>Qin
>> >> >> >> >> > > > > > > > > >>> > >> >><j...@linkedin.com.invalid
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >wrote:
>> >> >> >> >> > > > > > > > > >>> > >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Hi Bhavesh,
>> >> >> >> >> > > > > > > > > >>> > >> >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> >> I think it is the right
>> >>discussion
>> >> >>to
>> >> >> >> >>have
>> >> >> >> >> > when
>> >> >> >> >> > > > we are
>> >> >> >> >> > > > > > > > > >>>talking
>> >> >> >> >> > > > > > > > > >>> > >>about
>> >> >> >> >> > > > > > > > > >>> > >> >>the
>> >> >> >> >> > > > > > > > > >>> > >> >> >> new new design for MM.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Please see the inline 
>>comments.
>> >> >> >> >> > > > > > > > > >>> > >> >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Jiangjie (Becket) Qin
>> >> >> >> >> > > > > > > > > >>> > >> >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> >> On 1/28/15, 10:48 PM, 
>>"Bhavesh
>> >> >>Mistry"
>> >> >> >> >> > > > > > > > > >>> > >><mistry.p.bhav...@gmail.com>
>> >> >> >> >> > > > > > > > > >>> > >> >> >>wrote:
>> >> >> >> >> > > > > > > > > >>> > >> >> >>
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >Hi Jiangjie,
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >I just wanted to let you 
>>know
>> >> >>about
>> >> >> >>our
>> >> >> >> >>use
>> >> >> >> >> > case
>> >> >> >> >> > > > and
>> >> >> >> >> > > > > > > > stress
>> >> >> >> >> > > > > > > > > >>>the
>> >> >> >> >> > > > > > > > > >>> > >> >>point
>> >> >> >> >> > > > > > > > > >>> > >> >> >>that
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >local data center broker
>> >>cluster
>> >> >>have
>> >> >> >> >>fewer
>> >> >> >> >> > > > > > partitions
>> >> >> >> >> > > > > > > > than
>> >> >> >> >> > > > > > > > > >>>the
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >destination
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >offline broker cluster. Just
>> >> >>because
>> >> >> >>we
>> >> >> >> >>do
>> >> >> >> >> > the
>> >> >> >> >> > > > batch
>> >> >> >> >> > > > > > pull
>> >> >> >> >> > > > > > > > > >>>from
>> >> >> >> >> > > > > > > > > >>> > >>CAMUS
>> >> >> >> >> > > > > > > > > >>> > >> >> >>and
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >in
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >order to drain data faster 
>>than
>> >> >>the
>> >> >> >> >> injection
>> >> >> >> >> > > > rate
>> >> >> >> >> > > > > > (from
>> >> >> >> >> > > > > > > > > >>>four
>> >> >> >> >> > > > > > > > > >>> DCs
>> >> >> >> >> > > > > > > > > >>> > >> >>for
>> >> >> >> >> > > > > > > > > >>> > >> >> >>same
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >topic).
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Keeping the same partition
>> >>number
>> >> >>in
>> >> >> >> >>source
>> >> >> >> >> > and
>> >> >> >> >> > > > target
>> >> >> >> >> > > > > > > > > >>>cluster
>> >> >> >> >> > > > > > > > > >>> > >>will
>> >> >> >> >> > > > > > > > > >>> > >> >>be
>> >> >> >> >> > > > > > > > > >>> > >> >> >>an
>> >> >> >> >> > > > > > > > > >>> > >> >> >> option but will not be 
>>enforced
>> >>by
>> >> >> >> >>default.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >We are facing following 
>>issues
>> >> >> >>(probably
>> >> >> >> >> due
>> >> >> >> >> > to
>> >> >> >> >> > > > > > > > > >>>configuration):
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >1)      We occasionally 
>>loose
>> >>data
>> >> >> >>due
>> >> >> >> >>to
>> >> >> >> >> > message
>> >> >> >> >> > > > > > batch
>> >> >> >> >> > > > > > > > > >>>size is
>> >> >> >> >> > > > > > > > > >>> > >>too
>> >> >> >> >> > > > > > > > > >>> > >> >> >>large
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >(2MB) on target data (we are
>> >>using
>> >> >> >>old
>> >> >> >> >> > producer
>> >> >> >> >> > > > but I
>> >> >> >> >> > > > > > > > think
>> >> >> >> >> > > > > > > > > >>>new
>> >> >> >> >> > > > > > > > > >>> > >> >> >>producer
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >will solve this problem to 
>>some
>> >> >> >>extend).
>> >> >> >> >> > > > > > > > > >>> > >> >> >> We do see this issue in
>> >>LinkedIn as
>> >> >> >>well.
>> >> >> >> >> New
>> >> >> >> >> > > > producer
>> >> >> >> >> > > > > > > > also
>> >> >> >> >> > > > > > > > > >>> might
>> >> >> >> >> > > > > > > > > >>> > >> >>have
>> >> >> >> >> > > > > > > > > >>> > >> >> >> this issue. There are some
>> >> >>proposal of
>> >> >> >> >> > solutions,
>> >> >> >> >> > > > but
>> >> >> >> >> > > > > > no
>> >> >> >> >> > > > > > > > > >>>real
>> >> >> >> >> > > > > > > > > >>> work
>> >> >> >> >> > > > > > > > > >>> > >> >> >>started
>> >> >> >> >> > > > > > > > > >>> > >> >> >> yet. For now, as a 
>>workaround,
>> >> >> >>setting a
>> >> >> >> >> more
>> >> >> >> >> > > > > > aggressive
>> >> >> >> >> > > > > > > > > >>>batch
>> >> >> >> >> > > > > > > > > >>> > >>size
>> >> >> >> >> > > > > > > > > >>> > >> >>on
>> >> >> >> >> > > > > > > > > >>> > >> >> >> producer side should work.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >2)      Since only one
>> >>instance is
>> >> >> >>set
>> >> >> >> >>to
>> >> >> >> >> MM
>> >> >> >> >> > > > data,
>> >> >> >> >> > > > > > we
>> >> >> >> >> > > > > > > > are
>> >> >> >> >> > > > > > > > > >>>not
>> >> >> >> >> > > > > > > > > >>> > >>able
>> >> >> >> >> > > > > > > > > >>> > >> >>to
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >set-up ack per topic instead
>> >>ack
>> >> >>is
>> >> >> >> >> attached
>> >> >> >> >> > to
>> >> >> >> >> > > > > > producer
>> >> >> >> >> > > > > > > > > >>> > >>instance.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> I don’t quite get the 
>>question
>> >> >>here.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >3)      How are you going to
>> >> >>address
>> >> >> >>two
>> >> >> >> >> > phase
>> >> >> >> >> > > > commit
>> >> >> >> >> > > > > > > > > >>>problem
>> >> >> >> >> > > > > > > > > >>> if
>> >> >> >> >> > > > > > > > > >>> > >> >>ack is
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >set
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >to strongest, but auto 
>>commit
>> >>is
>> >> >>on
>> >> >> >>for
>> >> >> >> >> > consumer
>> >> >> >> >> > > > > > (meaning
>> >> >> >> >> > > > > > > > > >>> > >>producer
>> >> >> >> >> > > > > > > > > >>> > >> >>does
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >not
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >get ack,  but consumer auto
>> >> >>committed
>> >> >> >> >> offset
>> >> >> >> >> > that
>> >> >> >> >> > > > > > > > message).
>> >> >> >> >> > > > > > > > > >>> Is
>> >> >> >> >> > > > > > > > > >>> > >> >>there
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >transactional (Kafka
>> >>transaction
>> >> >>is
>> >> >> >>in
>> >> >> >> >> > process)
>> >> >> >> >> > > > > > based ack
>> >> >> >> >> > > > > > > > > >>>and
>> >> >> >> >> > > > > > > > > >>> > >>commit
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >offset
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >?
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Auto offset commit should be
>> >>turned
>> >> >> >>off
>> >> >> >> >>in
>> >> >> >> >> > this
>> >> >> >> >> > > > case.
>> >> >> >> >> > > > > > The
>> >> >> >> >> > > > > > > > > >>>offset
>> >> >> >> >> > > > > > > > > >>> > >>will
>> >> >> >> >> > > > > > > > > >>> > >> >> >>only
>> >> >> >> >> > > > > > > > > >>> > >> >> >> be committed once by the 
>>offset
>> >> >>commit
>> >> >> >> >> > thread. So
>> >> >> >> >> > > > > > there is
>> >> >> >> >> > > > > > > > > >>>no
>> >> >> >> >> > > > > > > > > >>> two
>> >> >> >> >> > > > > > > > > >>> > >> >>phase
>> >> >> >> >> > > > > > > > > >>> > >> >> >> commit.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >4)      How are you 
>>planning to
>> >> >>avoid
>> >> >> >> >> > duplicated
>> >> >> >> >> > > > > > message?
>> >> >> >> >> > > > > > > > > >>>( Is
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >brokergoing
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >have moving window of 
>>message
>> >> >> >>collected
>> >> >> >> >>and
>> >> >> >> >> > > > de-dupe
>> >> >> >> >> > > > > > ?)
>> >> >> >> >> > > > > > > > > >>> > >>Possibly, we
>> >> >> >> >> > > > > > > > > >>> > >> >> >>get
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >this from retry set to 5…?
>> >> >> >> >> > > > > > > > > >>> > >> >> >> We are not trying to 
>>completely
>> >> >>avoid
>> >> >> >> >> > duplicates.
>> >> >> >> >> > > > The
>> >> >> >> >> > > > > > > > > >>>duplicates
>> >> >> >> >> > > > > > > > > >>> > >>will
>> >> >> >> >> > > > > > > > > >>> > >> >> >> still be there if:
>> >> >> >> >> > > > > > > > > >>> > >> >> >> 1. Producer retries on 
>>failure.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> 2. Mirror maker is hard 
>>killed.
>> >> >> >> >> > > > > > > > > >>> > >> >> >> Currently, dedup is expected 
>>to
>> >>be
>> >> >> >>done
>> >> >> >> >>by
>> >> >> >> >> > user if
>> >> >> >> >> > > > > > > > > >>>necessary.
>5)      Last, is there any warning or anything you can provide insight on
>from the MM component when the data injection rate into destination
>partitions is NOT evenly distributed, regardless of keyed or non-keyed
>messages? (Hence there is a ripple effect: data arriving late or out of
>order in terms of timestamp, or arriving early at times, and CAMUS creates
>a huge number of files on HDFS due to the uneven injection rate. The Camus
>job is configured to run every 3 minutes.)
I think uneven data distribution is typically caused by server-side
imbalance, not something mirror maker could control. In the new mirror
maker, however, there is a customizable message handler that might be able
to help a little bit. In the message handler, you can explicitly set the
partition that you want to produce the message to. So if you know the
uneven data distribution in the target cluster, you may offset it here.
But that probably only works for non-keyed messages.
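For illustration only, a sketch of a message handler that sets target
partitions explicitly to even out the injection rate for non-keyed traffic.
The record shapes and the round-robin policy are assumptions made for the
sketch, not the KIP's actual interface:

  // Hypothetical, simplified record shapes used only for this sketch.
  case class SourceRecord(topic: String, key: Array[Byte], value: Array[Byte])
  case class TargetRecord(topic: String, partition: Int,
                          key: Array[Byte], value: Array[Byte])

  // For non-keyed traffic: choose the target partition explicitly, round-robin,
  // instead of leaving it to the producer's partitioner, so the injection rate
  // into the target partitions stays even.
  class SpreadingHandler(targetPartitionCount: Int) {
    private var next = 0
    def handle(record: SourceRecord): Seq[TargetRecord] = {
      val partition = next
      next = (next + 1) % targetPartitionCount
      Seq(TargetRecord(record.topic, partition, record.key, record.value))
    }
  }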
>I am not sure if this is the right discussion forum to bring these to
>your/kafka Dev team's attention. This might be off track.
>
>Thanks,
>
>Bhavesh
>
>On Wed, Jan 28, 2015 at 11:07 AM, Jiangjie Qin <j...@linkedin.com.invalid>
>wrote:
>
>> I’ve updated the KIP page. Feedbacks are welcome.
>>
>> Regarding the simple mirror maker design: I thought it over and have some
>> worries. There are two things that might be worth thinking about:
>> 1. One of the enhancements to mirror maker is adding a message handler to
>> do things like reformatting. I think we might potentially want to have
>> more threads processing the messages than the number of consumers. If we
>> follow the simple mirror maker solution, we lose this flexibility.
>> 2. This might not matter too much, but creating more consumers means a
>> bigger footprint in TCP connections / memory.
>>
>> Any thoughts on this?
>>
>> Thanks.
>>
>> Jiangjie (Becket) Qin
>>
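For illustration only, the shape of the decoupling point 1 is after: a couple
of consumer threads feeding a larger pool of handler threads through a bounded
queue. The thread counts, queue size, and payload type are arbitrary
placeholders:

  import java.util.concurrent.ArrayBlockingQueue

  object DecoupledHandlerSketch {
    // Hand-off between the (few) consumer threads and the (more numerous)
    // handler threads; the payload type is a placeholder.
    val queue = new ArrayBlockingQueue[Array[Byte]](10000)

    def thread(name: String)(body: => Unit): Thread =
      new Thread(new Runnable { override def run(): Unit = body }, name)

    def main(args: Array[String]): Unit = {
      val consumers = (1 to 2).map { i =>
        thread(s"consumer-$i") {
          // stand-in for consumer.poll() + enqueue
          while (true) { Thread.sleep(100); queue.put(Array.empty[Byte]) }
        }
      }
      val handlers = (1 to 8).map { i =>
        thread(s"handler-$i") {
          // stand-in for reformat/filter + producer.send()
          while (true) queue.take()
        }
      }
      (consumers ++ handlers).foreach(_.start())
    }
  }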
>> On 1/26/15, 10:35 AM, "Jiangjie Qin" <j...@linkedin.com> wrote:
>>
>> >Hi Jay and Neha,
>> >
>> >Thanks a lot for the reply and explanation. I do agree it makes more
>> >sense to avoid duplicate effort and plan based on the new consumer. I’ll
>> >modify the KIP.
>> >
>> >To Jay’s question on message ordering - the data channel selection makes
>> >sure that the messages from the same source partition will be sent by
>> >the same producer. So the order of the messages is guaranteed with proper
>> >producer settings (MaxInFlightRequests=1, retries=Integer.MaxValue, etc.)
>> >For keyed messages, because they come from the same source partition and
>> >will end up in the same target partition, as long as they are sent by the
>> >same producer, the order is guaranteed.
>> >For non-keyed messages, the messages coming from the same source
>> >partition might go to different target partitions. The order is only
>> >guaranteed within each partition.
>> >
>> >Anyway, I’ll modify the KIP and the data channel will go away.
>> >
>> >Thanks.
>> >
>> >Jiangjie (Becket) Qin
>> >
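For illustration only, those settings written out against the new producer's
configuration names; acks=all is an extra assumption added here as the usual
no-data-loss companion, not something stated above:

  import java.util.Properties
  import org.apache.kafka.clients.producer.KafkaProducer

  object OrderPreservingProducerSketch {
    def main(args: Array[String]): Unit = {
      val props = new Properties()
      props.put("bootstrap.servers", "target-broker:9092") // placeholder address
      props.put("key.serializer",
        "org.apache.kafka.common.serialization.ByteArraySerializer")
      props.put("value.serializer",
        "org.apache.kafka.common.serialization.ByteArraySerializer")
      // The settings named above: a single in-flight request keeps retries from
      // reordering, and effectively infinite retries avoid dropping messages.
      props.put("max.in.flight.requests.per.connection", "1")
      props.put("retries", Int.MaxValue.toString)
      // acks=all is the assumption noted above, not part of the quoted message.
      props.put("acks", "all")

      val producer = new KafkaProducer[Array[Byte], Array[Byte]](props)
      producer.close()
    }
  }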
>> >On 1/25/15, 4:34 PM, "Neha Narkhede" <n...@confluent.io> wrote:
>> >
>> >> I think there is some value in investigating if we can go back to the
>> >> simple mirror maker design, as Jay points out. Here you have N threads;
>> >> each has a consumer and a producer.
>> >>
>> >> The reason why we had to move away from that was a combination of the
>> >> difference in throughput between the consumer and the old producer and
>> >> the deficiency of the consumer rebalancing that limits the total number
>> >> of mirror maker threads. So the only option available was to increase
>> >> the throughput of the limited # of mirror maker threads that could be
>> >> deployed.
>> >> That queuing design may not make sense now, given that the new
>> >> producer's throughput is almost similar to the consumer's AND the fact
>> >> that the new round-robin based consumer rebalancing can allow a very
>> >> high number of mirror maker instances to exist.
>> >>
>> >> This is the end state that the mirror maker should be in once the new
>> >> consumer is complete, so it wouldn't hurt to see if we can just move to
>> >> that right now.
>> >>
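For illustration only, a skeleton of that N-thread arrangement. The
consumer/producer calls are left as comments because the concrete client APIs
are exactly what is still being discussed:

  // One self-contained mirror thread: it would own exactly one consumer and
  // one producer; the client calls are left as comments in this sketch.
  class MirrorThread(id: Int) extends Thread(s"mirror-maker-thread-$id") {
    override def run(): Unit = {
      // val consumer = ...  // owned exclusively by this thread
      // val producer = ...  // owned exclusively by this thread
      while (!isInterrupted) {
        // for (record <- consumer.poll()) producer.send(toProducerRecord(record))
        Thread.sleep(100) // placeholder for the poll-and-forward loop
      }
    }
  }

  object SimpleMirrorMakerSketch {
    def main(args: Array[String]): Unit =
      (1 to 4).map(new MirrorThread(_)).foreach(_.start())
  }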
>> >> On Fri, Jan 23, 2015 at 8:40 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>> >>
>> >> >QQ: If we ever use a different technique for the data channel selection
>> >> >than for the producer partitioning, won't that break ordering? How can
>> >> >we ensure these things stay in sync?
>> >> >
>> >> >With respect to the new consumer--I really do want to encourage people
>> >> >to think through how MM will work with the new consumer. I mean this
>> >> >isn't very far off, maybe a few months if we hustle? I could imagine us
>> >> >getting this mm fix done maybe sooner, maybe in a month? So I guess this
>> >> >buys us an extra month before we rip it out and throw it away? Maybe
>> >> >two? This bug has been there for a while, though, right? Is it worth it?
>> >> >Probably it is, but it still kind of sucks to have the duplicate effort.
>> >> >
>> >> >So anyhow let's definitely think about how things will work with the new
>> >> >consumer. I think we can probably just have N threads; each thread has a
>> >> >producer and consumer and is internally single threaded. Any reason this
>> >> >wouldn't work?
>> >> >
>> >> >-Jay
>> >> >
>> >> >On Wed, Jan 21, 2015 at 5:29 PM, Jiangjie Qin <j...@linkedin.com.invalid>
>> >> >wrote:
>> >> >
>> >> >> Hi Jay,
>> >> >>
>> >> >> Thanks for the comments. Please see inline responses.
>> >> >>
>> >> >> Jiangjie (Becket) Qin
>> >> >>
>> >> >> On 1/21/15, 1:33 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
>> >> >>
>> >> >> >Hey guys,
>> >> >> >
>> >> >> >A couple questions/comments:
>> >> >> >
>> >> >> >1. The callback and user-controlled commit offset functionality is
>> >> >> >already in the new consumer, which we are working on in parallel. If
>> >> >> >we accelerated that work it might help concentrate efforts. I admit
>> >> >> >this might take slightly longer in calendar time but could still
>> >> >> >probably get done this quarter. Have you guys considered that
>> >> >> >approach?
>> >> >> Yes, I totally agree that ideally we should put our efforts on the new
>> >> >> consumer. The main reason for still working on the old consumer is
>> >> >> that we expect it would still be used at LinkedIn for quite a while
>> >> >> before the new consumer could be fully rolled out. And we have
>> >> >> recently been suffering a lot from the mirror maker data loss issue.
>> >> >> So our current plan is to make the necessary changes to make the
>> >> >> current mirror maker stable in production. Then we can test and roll
>> >> >> out the new consumer gradually without getting burnt.
>> >> >> >
>> >> >> >2. I think partitioning on the hash of the topic partition is not a
>> >> >> >very good idea because that will make the case of going from a
>> >> >> >cluster with fewer partitions to one with more partitions not work.
>> >> >> >I think an intuitive way to do this would be the following:
>> >> >> >a. Default behavior: Just do what the producer does. I.e. if you
>> >> >> >specify a key, use it for partitioning; if not, just partition in a
>> >> >> >round-robin fashion.
>> >> >> >b. Add a --preserve-partition option that will explicitly inherit the
>> >> >> >partition from the source irrespective of whether there is a key or
>> >> >> >which partition that key would hash to.
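For illustration only, the two proposed modes reduced to a partition-selection
helper. Only the option name --preserve-partition comes from the message
above; the function and its behavior for the default case are assumptions:

  object PartitioningSketch {
    // Returning None means "let the producer's partitioner decide": hash the
    // key if one is present, otherwise round-robin. Preserving the source
    // partition assumes the target topic has at least as many partitions.
    def targetPartition(preservePartition: Boolean,
                        sourcePartition: Int,
                        key: Array[Byte]): Option[Int] =
      if (preservePartition) Some(sourcePartition) else None
  }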
>> >> >> Sorry that I did not explain this clearly enough. The hash of the
>> >> >> topic partition is only used when deciding which mirror maker data
>> >> >> channel queue the consumer thread should put the message into. It
>> >> >> only tries to make sure the messages from the same partition are sent
>> >> >> by the same producer thread, to guarantee the sending order. This is
>> >> >> not at all related to which partition in the target cluster the
>> >> >> messages end up in. That is still decided by the producer.
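For illustration only, the queue-selection rule just described, written out;
the function name and signature are made up for the sketch:

  object DataChannelSketch {
    // Chooses which internal data-channel queue (and therefore which producer
    // thread) a consumed message goes to, from its *source* topic and
    // partition. This pins per-source-partition ordering onto one producer; it
    // says nothing about the target partition, which the producer still picks.
    def queueIndex(sourceTopic: String, sourcePartition: Int, numQueues: Int): Int = {
      val h = (sourceTopic, sourcePartition).hashCode
      ((h % numQueues) + numQueues) % numQueues // keep the index non-negative
    }
  }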
>> >> >> >
>> >> >> >3. You don't actually give the ConsumerRebalanceListener interface.
>> >> >> >What is that going to look like?
>> >> >> Good point! I should have put it in the wiki. I just added it.
>> >> >> >
>> >> >> >4. What is MirrorMakerRecord? I think ideally the
>> >> >> >MirrorMakerMessageHandler interface would take a ConsumerRecord as
>> >> >> >input and return a ProducerRecord, right? That would allow you to
>> >> >> >transform the key, value, partition, or destination topic...
>> >> >> MirrorMakerRecord is introduced in KAFKA-1650 and is exactly the same
>> >> >> as ConsumerRecord in KAFKA-1760:
>> >> >>
>> >> >> private[kafka] class MirrorMakerRecord (val sourceTopic: String,
>> >> >>   val sourcePartition: Int,
>> >> >>   val sourceOffset: Long,
>> >> >>   val key: Array[Byte],
>> >> >>   val value: Array[Byte]) {
>> >> >>   def size = value.length + {if (key == null) 0 else key.length}
>> >> >> }
>> >> >>
>> >> >> However, because the source partition and offset are needed in the
>> >> >> producer thread for consumer offsets bookkeeping, the record returned
>> >> >> by MirrorMakerMessageHandler needs to contain that information.
>> >> >> Therefore ProducerRecord does not work here. We could probably let the
>> >> >> message handler take ConsumerRecord for both input and output.
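For illustration only, a sketch of a handler interface that keeps the source
partition and offset on both sides, as suggested above. The trait name, record
shape, and the example filter are assumptions, not the final KIP interface:

  // Record shape mirroring the MirrorMakerRecord quoted above, so that source
  // partition and offset survive through the handler into the producer thread.
  case class RecordLike(sourceTopic: String, sourcePartition: Int, sourceOffset: Long,
                        key: Array[Byte], value: Array[Byte])

  // Hypothetical handler signature: same shape in and out, 0..n records out.
  trait MessageHandlerSketch {
    def handle(record: RecordLike): Seq[RecordLike]
  }

  // Example: drop null-value (tombstone) messages while leaving the offset
  // bookkeeping fields untouched.
  class DropNullValueHandler extends MessageHandlerSketch {
    override def handle(record: RecordLike): Seq[RecordLike] =
      if (record.value == null) Seq.empty else Seq(record)
  }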
>> >> >> >
>> >> >> >5. Have you guys thought about what the implementation will look
>> >> >> >like in terms of threading architecture etc. with the new consumer?
>> >> >> >That will be soon, so even if we aren't starting with that let's
>> >> >> >make sure we can get rid of a lot of the current mirror maker
>> >> >> >accidental complexity in terms of threads and queues when we move to
>> >> >> >that.
>> >> >> I haven’t thought about it thoroughly. The quick idea is that after
>> >> >> migration to the new consumer, it is probably better to use a single
>> >> >> consumer thread. If multithreading is needed, decoupling consumption
>> >> >> and processing might be used. MirrorMaker definitely needs to be
>> >> >> changed after the new consumer gets checked in. I’ll document the
>> >> >> changes and can submit follow-up patches after the new consumer is
>> >> >> available.
>> >> >> >
>> >> >> >-Jay
>> >> >> >
>> >> >> >On Tue, Jan 20, 2015 at 4:31 PM, Jiangjie Qin <j...@linkedin.com.invalid>
>> >> >> >wrote:
>> >> >> >
>> >> >> >> Hi Kafka Devs,
>> >> >> >>
>> >> >> >> We are working on a Kafka Mirror Maker enhancement. A KIP is posted
>> >> >> >> to document and discuss the following:
>> >> >> >> 1. KAFKA-1650: No data loss mirror maker change
>> >> >> >> 2. KAFKA-1839: To allow partition aware mirror.
>> >> >> >> 3. KAFKA-1840: To allow message filtering/format conversion
>> >> >> >> Feedback is welcome. Please let us know if you have any questions
>> >> >> >> or concerns.
>> >> >> >>
>> >> >> >> Thanks.
>> >> >> >>
>> >> >> >> Jiangjie (Becket) Qin
>> >> >> >>
>> >>
>> >> --
>> >> Thanks,
>> >> Neha
