Hi Michael,

Regarding your question: "When sending a tombstone with a non-null value,
the consumer can expect to receive the non-null value only in step (3). Is
this correct?"
---> Yes, I agree with you here.

Becket, Ismael: can you review the magic byte migration plan from my
previous email?
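
To make the question concrete, here is a rough sketch of how I read steps
(2) and (3) of that plan. It is only an illustration under assumptions: the
names (Message, down_convert, deliver) and the TOMBSTONE_BIT position are
made up for this email and are not Kafka APIs or anything the KIP has fixed.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical position of the tombstone flag in the attributes byte.
TOMBSTONE_BIT = 1 << 5


@dataclass(frozen=True)
class Message:
    magic: int
    attributes: int
    key: bytes
    value: Optional[bytes]

    def is_tombstone(self) -> bool:
        # magic >= 2: the flag decides; magic 1: a null value decides.
        if self.magic >= 2:
            return bool(self.attributes & TOMBSTONE_BIT)
        return self.value is None


def down_convert(msg: Message) -> Message:
    """Convert a magic 2 message to magic 1.

    If the tombstone bit is set, the value is set to null, because a
    magic 1 reader only recognises a tombstone by its null value.
    A non-tombstone message with a null value would also read as a
    tombstone after this conversion (the open question in step 2).
    """
    if msg.magic < 2:
        return msg
    value = None if msg.attributes & TOMBSTONE_BIT else msg.value
    return replace(
        msg, magic=1, attributes=msg.attributes & ~TOMBSTONE_BIT, value=value
    )


def deliver(msg: Message, broker_format: int, consumer_magic: int) -> Message:
    """What a consumer would see for a produced message."""
    # Step 2: a broker on format 1 down converts on append.
    stored = down_convert(msg) if broker_format < 2 else msg
    # Step 3: a broker on format 2 only down converts for older consumers
    # (this is the case where zero copy is lost).
    if consumer_magic < stored.magic:
        return down_convert(stored)
    return stored


produced = Message(magic=2, attributes=TOMBSTONE_BIT, key=b"k", value=b"v")
step2 = deliver(produced, broker_format=1, consumer_magic=2)
step3 = deliver(produced, broker_format=2, consumer_magic=2)
```

With this reading, step2.value is null while step3.value is still the
non-null payload, which matches the statement above.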

Thanks,

Mayuresh

On Fri, Nov 18, 2016 at 8:58 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Many thanks for this Mayuresh. I don't have any objections.
>
> I assume we should state:
>
> That when sending a tombstone with a non-null value, the consumer can
> expect to receive the non-null value only in step (3). Is this correct?
>
> Cheers
> Mike
>
>
>
> ________________________________________
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Thursday, November 17, 2016 5:18:41 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>
> Hi Ismael,
>
> Thanks for the explanation.
> I especially like the part where you mention that we can drop the older
> null-value support for log compaction later on:
>
> "We can't change semantics of the message format without having a long
> transition period. And we can't rely on people reading documentation or
> acting on a warning for something so fundamental. As such, my take is that
> we need to bump the magic byte. The good news is that we don't have to
> support all versions forever. We have said that we will support direct
> upgrades for 2 years. That means that message format version n could, in
> theory, be removed 2 years after it's introduced."
>
> Just a heads up: even without bumping the magic byte, we will *NOT* lose
> zero copy. The client(x+1) in my explanation above will internally convert
> a null value to have the tombstone bit set, and a set tombstone bit to
> have a null value, and by the time we move to version (x+2) the clients
> will have upgraded. Obviously, if we support a request from consumer(x),
> we will lose zero copy, but that is also the case with the magic byte
> bump.
>
> But if magic byte bump makes life easier for transition for the above
> reasons that you explained, I am OK with it since we are going to meet the
> end goal down the road :)
>
> On a side note, can we update the doc here on magic byte to say that "*it
> should be bumped whenever the message format is changed or the
> interpretation of message format (usage of the reserved bits as well) is
> changed*"?
>
>
> Hi Michael,
>
> Here is the update plan that we discussed offline yesterday :
>
> Currently the magic-byte which corresponds to the "message.format.version"
> is set to 1.
>
> 1) On broker it will be set to 1 initially.
>
> 2) When a producer client sends a message with magic-byte = 2, since the
> broker is on magic-byte = 1, we will down convert it, which means that if
> the tombstone bit is set, the value will be set to null. A consumer that
> understands magic-byte = 1 will still work with this. A consumer working
> with magic-byte = 2 will also be able to understand this, since it
> understands the tombstone.
> There is still the question of supporting a non-tombstone message with a
> null value from a producer client with magic-byte = 2. *(I am not sure if
> we should support this. Ismael/Becket can comment here.)*
>
> 3) When almost all the clients have upgraded, the message.format.version on
> the broker can be changed to 2, at which point the down conversion in the
> above step will no longer happen. If we then get a request from an older
> consumer, we might have to down convert, in which case we lose zero copy,
> but these cases should be rare.
>
> Becket, can you review this plan and add more details if I have missed or
> misstated something, before we put it in the KIP?
>
> Thanks,
>
> Mayuresh
>
> On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
> wrote:
>
> > Thanks guys, for discussing this offline and getting some consensus.
> >
> > So that it's clear to myself and others what is now proposed (I think I
> > understand, but want to make sure):
> >
> > Could I ask you to either update the KIP directly to detail the migration
> > strategy, or (re-)state in this thread the migration strategy based on a
> > magic byte that you discussed and agreed offline?
> >
> >
> > The main original driver for the KIP was to support compaction where the
> > value isn't null, based off the discussions on the KIP-82 thread.
> >
> > We should be able to support non-tombstone + null value by the completion
> > of the KIP. As we noted when discussing this KIP, having logic based on a
> > null value isn't very clean, and a dedicated flag also separates the
> > concerns.
> >
> > As discussed already though we can split this into KIP-87a and KIP-87b
> >
> > Where we look to deliver KIP-87a on a compacted topic (to address the
> > immediate issues)
> > * tombstone + null value
> > * tombstone + non-null value
> > * non-tombstone + non-null value
> >
> > Then, once KIP-87a is completed, we can discuss options and how we
> > support the second part, KIP-87b, to deliver:
> > * non-tombstone + null value
> >
> > Cheers
> > Mike
> >
> >
> >
> > ________________________________________
> > From: Becket Qin <becket....@gmail.com>
> > Sent: Thursday, November 17, 2016 1:43 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
> >
> > Renu, Mayuresh and I had an offline discussion, and following is a brief
> > summary.
> >
> > 1. We agreed that not bumping up the magic value may result in losing zero
> > copy during migration.
> > 2. Given that bumping up the magic value is almost free and has the benefit
> > of avoiding a potential performance issue, it is probably worth doing.
> >
> > One issue we still need to think about is whether we want to support a
> > non-tombstone message with a null value.
> > Currently it is not supported by Kafka. If we allow a non-tombstone
> > null-value message to exist after KIP-87, the problem is that such a
> > message will not be supported by consumers prior to KIP-87, because they
> > will always interpret a null value as a tombstone.
> >
> > One option is that we keep the current way, i.e. do not support such
> > messages. It would be good to know if there is a concrete use case for
> > such a message. If there is not, we can probably just not support it.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Wed, Nov 16, 2016 at 1:28 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi Ismael,
> > >
> > > This is something I can think of for a migration plan with up
> > > conversion:
> > >
> > > 1) Currently, let's say we have the broker at version x.
> > > 2) Currently we have clients at version x.
> > > 3) a) We move the broker to version Broker(x+1), which supports both
> > > tombstone and null for log compaction.
> > >     b) We upgrade the client to version client(x+1): if the value is
> > > set to null in the producer client(x+1), we will automatically set the
> > > tombstone bit internally. If the producer client(x+1) sets the tombstone
> > > itself, well and good. For producer client(x), the broker will up
> > > convert to have the tombstone bit. Broker(x+1) supports both. Consumer
> > > client(x+1) will be aware of this and should be able to handle it. For
> > > consumer client(x), we will down convert the message on the broker side.
> > >     c) At this point we will have to issue a warning or clearly specify
> > > in the docs that this behavior is about to change for log compaction.
> > > 4) a) In the next release, Broker(x+2), we say that only the tombstone is
> > > used for log compaction on the broker side. Clients(x+1) are still
> > > supported.
> > >     b) We upgrade the client to version client(x+2): if the value is set
> > > to null, the tombstone will not be set automatically. The client will
> > > have to call setTombstone() to actually set the tombstone.
> > >
> > > We should compare this migration plan with the migration plan for the
> > > magic byte bump and do whatever looks good.
> > > I am just worried that if we go down the magic byte route, unless I am
> > > missing something, it sounds like Kafka will be stuck with supporting
> > > both null value and tombstone bit for log compaction forever, which
> > > does not look like a good end state.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > >
> > >
> > > On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com
> > > > wrote:
> > >
> > > > Hi Ismael,
> > > >
> > > > That's a very good point which I might have not considered earlier.
> > > >
> > > > Here is a plan that I can think of:
> > > >
> > > > Stage 1) From now on, the broker up converts the message to have the
> > > > tombstone marker. The log compaction thread does log compaction based
> > > > on both null and the tombstone marker. This is our transition period.
> > > > Stage 2) In the next release, we say that log compaction is based only
> > > > on the tombstone marker (open source Kafka makes this a policy). By
> > > > this time, an organization moving to this release will be sure that it
> > > > has gone through the entire transition period.
> > > >
> > > > My only goal in doing this is that Kafka clearly specifies the end
> > > > state of what log compaction means (is it a null value or a tombstone
> > > > marker, but not both).
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma <ism...@juma.me.uk>
> > wrote:
> > > >
> > > >> One comment below.
> > > >>
> > > >> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <
> > > >> gharatmayures...@gmail.com
> > > >> > wrote:
> > > >>
> > > > >> >    - If we don't bump up the magic byte, on the broker side, the
> > > > >> >    broker will always have to look at both the tombstone bit and
> > > > >> >    the value when doing the compaction. Assuming we do not bump up
> > > > >> >    the magic byte, imagine the broker sees a message which does
> > > > >> >    not have the tombstone bit set. The broker does not know when
> > > > >> >    the message was produced (i.e. whether the message has been up
> > > > >> >    converted or not), so it has to take a further look at the
> > > > >> >    value to see if it is null or not in order to determine if it
> > > > >> >    is a tombstone. The same logic has to be put on the consumer as
> > > > >> >    well because the consumer does not know if the message has been
> > > > >> >    up converted or not.
> > > > >> >       - If we upconvert while appending, this is not the case,
> > > > >> >       right?
> > > >>
> > > > >> If I understand you correctly, this is not sufficient because the
> > > > >> log may have messages appended before it was upgraded to include
> > > > >> KIP-87.
> > > >>
> > > >> Ismael
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > -Regards,
> > > > Mayuresh R. Gharat
> > > > (862) 250-7125
> > > >
> > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> > The information contained in this email is strictly confidential and for
> > the use of the addressee only, unless otherwise indicated. If you are not
> > the intended recipient, please do not read, copy, use or disclose to
> others
> > this message or any attachment. Please also notify the sender by replying
> > to this email or by telephone (+44(020 7896 0011) and then delete the
> email
> > and any copies of it. Opinions, conclusion (etc) that do not relate to
> the
> > official business of this company shall be understood as neither given
> nor
> > endorsed by it. IG is a trading name of IG Markets Limited (a company
> > registered in England and Wales, company number 04008957) and IG Index
> > Limited (a company registered in England and Wales, company number
> > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
> > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
> > Index Limited (register number 114059) are authorised and regulated by
> the
> > Financial Conduct Authority.
> >
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125
