Hi Michael,

Do you mean a new configuration, or is it just the existing
message.format.version config? It seems the message.format.version config
is enough in this case, and its default value would always be the latest
version.

> Message version migration would be handled as like in KIP-32

Also just want to confirm on this. Today if an old consumer consumes a log
compacted topic and sees an empty value, it knows that is a tombstone.
After we start to use the attribute bit, a tombstone message can have a
non-empty value. So by "like in KIP-32" you mean we will remove the value
to down convert the message if the consumer version is old, right?
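
To make the question concrete, here is a minimal sketch of that
down-conversion. It is illustrative only, not broker code: the Record
type, the down_convert helper and the constant names are hypothetical,
while bit 5 and fetch version 4 are the values mentioned earlier in this
thread. Clearing the flag bit for old consumers is also an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumed values, taken from the proposal discussed in this thread
# (attributes bit 5, FetchRequest V4); not from released Kafka code.
TOMBSTONE_BIT = 1 << 5
FIRST_FLAG_AWARE_FETCH_VERSION = 4


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified stand-in for a Kafka message."""
    key: bytes
    value: Optional[bytes]
    attributes: int = 0


def is_tombstone(record: Record) -> bool:
    """Flag-based check: the value may be non-empty on a tombstone."""
    return bool(record.attributes & TOMBSTONE_BIT)


def down_convert(record: Record, fetch_version: int) -> Record:
    """Return the record as an old consumer should see it."""
    if fetch_version >= FIRST_FLAG_AWARE_FETCH_VERSION:
        # New consumers understand the flag; pass the record through.
        return record
    if is_tombstone(record):
        # Old consumers only recognise a null value as a tombstone, so
        # drop the value (and, by assumption, clear the unknown bit).
        return replace(
            record,
            value=None,
            attributes=record.attributes & ~TOMBSTONE_BIT,
        )
    # Non-tombstone records are left untouched. How a null value without
    # the flag should be handled is still an open question in the thread.
    return record
```

With this sketch, a flagged record carrying a value reaches an old
consumer with a null value, while a consumer on the newer fetch version
receives it unchanged.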

Thanks.

Jiangjie (Becket) Qin

On Wed, Nov 2, 2016 at 1:37 AM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Hi Joel , et al.
>
> Any comments on the below idea to handle roll out / compatibility of this
> feature, using a configuration?
>
> Does it make sense/clear?
> Does it add value?
> Do we want to enforce flag by default, or value by default, or both?
>
> Cheers
> Mike
>
>
> On 10/27/16, 4:47 PM, "Michael Pearce" <michael.pea...@ig.com> wrote:
>
>     Thanks, James. I think this is a really good addition to the KIP
> details. Please feel free to amend the wiki and add the use cases, along
> with any others you think of. I definitely think it's worthwhile
> documenting. If you can’t, let me know and I'll add them next week (just
> leaving for a long weekend off).
>
>     Re Joel's and others' comments about upgrade and compatibility:
>
>     Rather than trying to auto-manage this, maybe we add a configuration
> option, at both the server and per-topic level, to control how the server
> logic works out whether a record is a tombstone record.
>
>     e.g.
>
>     key = compaction.tombstone.marker
>
>     value options:
>
>     value   (continues to use null value as tombstone marker)
>     flag (expects to use the tombstone flag)
>     value_or_flag (if either is true it treats the record as a tombstone)
>
>     This way, on upgrade users can keep the current behavior and slowly
> migrate to the new one: a transition period using value_or_flag, and
> finally flag only, if an organization wishes to use null values without
> them being treated as a tombstone marker (use case noted below).
>
>     Having it at both the global broker level and as a topic override also
> allows some flexibility here.
>
>     Cheers
>     Mike
>
>
>
>
>
>
>     On 10/27/16, 8:03 AM, "James Cheng" <wushuja...@gmail.com> wrote:
>
>         This KIP would definitely address a gap in the current
> functionality, where you currently can't have a tombstone with any
> associated content.
>
>         That said, I'd like to talk about use cases, to make sure that
> this is in fact useful. The KIP should be updated with whatever use cases
> we come up with.
>
>         First of all, an observation: When we speak about log compaction,
> we typically think of "the latest message for a key is retained". In that
> respect, a delete tombstone (i.e. a message with a null payload) is treated
> the same as any other Kafka message: the latest message is retained. It
> doesn't matter whether the latest message is null, or if the latest message
> has actual content. In all cases, the last message is retained.
>
>         The only way a delete tombstone is treated differently from other
> Kafka messages is that it automatically disappears after a while. The time
> of deletion is specified using delete.retention.ms.
>
>         So what we're really talking about is, do we want to support
> messages in a log-compacted topic that auto-delete themselves after a while?
>
>         In a thread from 2015, there was a discussion on first-class
> support of headers between Roger Hoover, Felix GV, Jun Rao, and me. See
> the thread at
> https://groups.google.com/d/msg/confluent-platform/8xPbjyUE_7E/yQ1AeCufL_gJ
> In that thread, Jun raised a good question that I didn't have a good
> answer for at the time: If a message is going to auto-delete itself after
> a while, how important was the message? That is, what information did the
> message contain that was important *for a while* but not so important that
> it needed to be kept around forever?
>
>         Some use cases that I can think of:
>
>         1) Traceability. I would like to know who issued this delete
> tombstone. It might include the hostname and IP of the producer of the
> delete.
>         2) Timestamps. I would like to know when this delete was issued.
> This use case is already addressed by the availability of per-message
> timestamps that came in 0.10.0
>         3) Data provenance. I hope I'm using this phrase correctly, but
> what I mean is, where did this delete come from? What processing job
> emitted it? What input to the processing job caused this delete to be
> produced? For example, if a record in topic A was processed and caused a
> delete tombstone to be emitted to topic B, I might like the offset of the
> topic A message to be attached to the topic B message.
>         4) Distributed tracing for stream topologies. This might be a
> slight repeat of the above use cases. In the microservices world, we can
> generate call-graphs of webservices using tools like Zipkin/opentracing.io
> (http://opentracing.io/), or something homegrown like
> https://engineering.linkedin.com/distributed-service-call-graph/real-time-distributed-tracing-website-performance-and-efficiency
> I can imagine that you might want to do something similar for stream
> processing topologies, where stream processing jobs carry along and forward
> a globally unique identifier, and a distributed topology graph is
> generated.
>         5) Cases where processing a delete requires data that is not
> available in the message key. I'm not sure I have a good example of this,
> though. One hand-wavy example might be where I am publishing documents into
> Kafka where the documentId is the message key, and the text contents of the
> document are in the message body. And I have a consuming job that does some
> analytics on the message body. If that document gets deleted, then the
> consuming job might need the original message body in order to "delete"
> that message's impact from the analytics. But I'm not sure that is a great
> example. If the consumer was worried about that, the consumer would
> probably keep the original message around, stored by primary key. And then
> all it would need from a delete message would be the primary key of the
> message.
>
>         Do people think these are valid use cases?
>
>         What are other use cases that people can think of?
>
>         -James
>
>         > On Oct 26, 2016, at 3:46 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com> wrote:
>         >
>         > +1 @Joel.
>         > I think a clear migration plan of upgrading and downgrading of
> server and
>         > clients along with handling of issues that Joel mentioned, on
> the KIP would
>         > be really great.
>         >
>         > Thanks,
>         >
>         > Mayuresh
>         >
>         > On Wed, Oct 26, 2016 at 3:31 PM, Joel Koshy <jjkosh...@gmail.com>
> wrote:
>         >
>         >> I'm not sure why it would be useful, but it should be
> theoretically
>         >> possible if the attribute bit alone is enough to mark a
> tombstone. OTOH, we
>         >> could consider that as invalid if we wish. These are relevant
> details that
>         >> I think should be added to the KIP.
>         >>
>         >> Also, in the few odd scenarios that I mentioned we should also
> consider
>         >> that fetches could be coming from other yet-to-be-upgraded
> brokers in a
>         >> cluster that is being upgraded. So we would probably want to
> continue to
>         >> support nulls as tombstones or down-convert in a way that we
> are sure works
>         >> with least surprise to fetchers.
>         >>
>         >> There is a slightly vague statement under "Compatibility,
>         >> Deprecation, and Migration Plan" that could benefit from more
>         >> details: *Logic would base on current behavior of null value or
>         >> if tombstone flag set to true, as such wouldn't impact any
>         >> existing flows simply allow new producers to make use of the
>         >> feature*. It is unclear to me based on that whether you would
>         >> interpret null as a tombstone if the tombstone attribute bit is
>         >> off.
>         >>
>         >> On Wed, Oct 26, 2016 at 3:10 PM, Xavier Léauté <
> xav...@confluent.io>
>         >> wrote:
>         >>
>         >>> Does this mean that starting with V4 requests we would allow
> storing null
>         >>> messages in compacted topics? The KIP should probably clarify
> the
>         >> behavior
>         >>> for null messages where the tombstone flag is not set.
>         >>>
>         >>> On Wed, Oct 26, 2016 at 1:32 AM Magnus Edenhill <
> mag...@edenhill.se>
>         >>> wrote:
>         >>>
>         >>>> 2016-10-25 21:36 GMT+02:00 Nacho Solis
> <nso...@linkedin.com.invalid>:
>         >>>>
>         >>>>> I think you probably require a MagicByte bump if you expect
> correct
>         >>>>> behavior of the system as a whole.
>         >>>>>
>         >>>>> From a client perspective you want to make sure that when you
>         >> deliver a
>         >>>>> message that the broker supports the feature you're expecting
>         >>>>> (compaction).  So, depending on the behavior of the broker on
>         >>>> encountering
>         >>>>> a previously undefined bit flag I would suggest making some
> change to
>         >>>> make
>         >>>>> certain that flag-based compaction is supported.  I'm going
> to guess
>         >>> that
>         >>>>> the MagicByte would do this.
>         >>>>>
>         >>>>
>         >>>> I don't believe this is needed since it is already attributed
> through
>         >> the
>         >>>> request's API version.
>         >>>>
>         >>>> Producer:
>         >>>> * if a client sends ProduceRequest V4 then attributes.bit5
> indicates a
>         >>>> tombstone
>         >>>> * if a client sends ProduceRequest <V4 then attributes.bit5
> is
>         >> ignored
>         >>>> and value==null indicates a tombstone
>         >>>> * in both cases the on-disk messages are stored with
> attributes.bit5
>         >> (I
>         >>>> assume?)
>         >>>>
>         >>>> Consumer:
>         >>>> * if a client sends FetchRequest V4 messages are
> sendfile():ed
>         >> directly
>         >>>> from disk (with attributes.bit5)
>         >>>> * if a client sends FetchRequest <V4 messages are slowpathed
> and
>         >>>> translated from attributes.bit5 to value=null as required.
>         >>>>
>         >>>>
>         >>>> That's my understanding anyway, please correct me if I'm
> wrong.
>         >>>>
>         >>>> /Magnus
>         >>>>
>         >>>>
>         >>>>
>         >>>>> On Tue, Oct 25, 2016 at 10:17 AM, Magnus Edenhill <
>         >> mag...@edenhill.se>
>         >>>>> wrote:
>         >>>>>
>         >>>>>> It is safe to assume that a previously undefined attributes
> bit
>         >> will
>         >>> be
>         >>>>>> unset in protocol requests from existing clients, if not,
> such a
>         >>> client
>         >>>>> is
>         >>>>>> already violating the protocol and needs to be fixed.
>         >>>>>>
>         >>>>>> So I don't see a need for a MagicByte bump; both broker and
>         >>>>>> client have the information they need to construct or parse
>         >>>>>> the message according to request version.
>         >>>>>>
>         >>>>>>
>         >>>>>> 2016-10-25 18:48 GMT+02:00 Michael Pearce <
> michael.pea...@ig.com>:
>         >>>>>>
>         >>>>>>> Hi Magnus,
>         >>>>>>>
>         >>>>>>> I was wondering if I even needed to change those also, as
>         >>>>>>> technically we’re just making use of an unused attribute
>         >>>>>>> bit, but I'm not 100% sure that it will always be false
>         >>>>>>> currently.
>         >>>>>>>
>         >>>>>>> If someone can say 100% that it will already be set false
>         >>>>>>> with current and historic bitwise masking techniques used
>         >>>>>>> over time, we could do away with both, and simply start to
>         >>>>>>> use it. Unfortunately I don’t have that historic knowledge,
>         >>>>>>> so I was hoping it would be flagged up in this discussion
>         >>>>>>> thread ☺
>         >>>>>>>
>         >>>>>>> Cheers
>         >>>>>>> Mike
>         >>>>>>>
>         >>>>>>> On 10/25/16, 5:36 PM, "Magnus Edenhill" <
> mag...@edenhill.se>
>         >>> wrote:
>         >>>>>>>
>         >>>>>>>    Hi Michael,
>         >>>>>>>
>         >>>>>>>    With the version bumps for Produce and Fetch requests,
> do you
>         >>>>> really
>         >>>>>>> need
>         >>>>>>>    to bump MagicByte too?
>         >>>>>>>
>         >>>>>>>    Regards,
>         >>>>>>>    Magnus
>         >>>>>>>
>         >>>>>>>
>         >>>>>>>    2016-10-25 18:09 GMT+02:00 Michael Pearce <
>         >>> michael.pea...@ig.com
>         >>>>> :
>         >>>>>>>
>         >>>>>>>> Hi All,
>         >>>>>>>>
>         >>>>>>>> I would like to discuss the following KIP proposal:
>         >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
>         >>>>>>>>
>         >>>>>>>> This is off the back of the discussion on KIP-82 / KIP
>         >>>>>>>> meeting where it was agreed to separate this issue and
>         >>>>>>>> feature. See:
>         >>>>>>>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201610.mbox/%3cCAJS3ho8OcR==EcxsJ8OP99pD2hz=iiGecWsv-EZsBsNyDcKr=g...@mail.gmail.com%3e
>         >>>>>>>>
>         >>>>>>>> Thanks
>         >>>>>>>> Mike
>         >>>>>>>>
>         >>>>>>>> The information contained in this email is strictly
>         >>>> confidential
>         >>>>>> and
>         >>>>>>> for
>         >>>>>>>> the use of the addressee only, unless otherwise indicated.
>         >> If
>         >>>> you
>         >>>>>>> are not
>         >>>>>>>> the intended recipient, please do not read, copy, use or
>         >>>> disclose
>         >>>>>> to
>         >>>>>>> others
>         >>>>>>>> this message or any attachment. Please also notify the
>         >> sender
>         >>>> by
>         >>>>>>> replying
>         >>>>>>>> to this email or by telephone (+44(020 7896 0011) and then
>         >>>> delete
>         >>>>>>> the email
>         >>>>>>>> and any copies of it. Opinions, conclusion (etc) that do
>         >> not
>         >>>>> relate
>         >>>>>>> to the
>         >>>>>>>> official business of this company shall be understood as
>         >>>> neither
>         >>>>>>> given nor
>         >>>>>>>> endorsed by it. IG is a trading name of IG Markets Limited
>         >> (a
>         >>>>>> company
>         >>>>>>>> registered in England and Wales, company number 04008957)
>         >> and
>         >>>> IG
>         >>>>>>> Index
>         >>>>>>>> Limited (a company registered in England and Wales,
> company
>         >>>>> number
>         >>>>>>>> 01190902). Registered address at Cannon Bridge House, 25
>         >>>> Dowgate
>         >>>>>>> Hill,
>         >>>>>>>> London EC4R 2YA. Both IG Markets Limited (register number
>         >>>> 195355)
>         >>>>>>> and IG
>         >>>>>>>> Index Limited (register number 114059) are authorised and
>         >>>>> regulated
>         >>>>>>> by the
>         >>>>>>>> Financial Conduct Authority.
>         >>>>>>>>
>         >>>>>>>
>         >>>>>>>
>         >>>>>>>
>         >>>>>>
>         >>>>>
>         >>>>>
>         >>>>>
>         >>>>> --
>         >>>>> Nacho (Ignacio) Solis
>         >>>>> Kafka
>         >>>>> nso...@linkedin.com
>         >>>>>
>         >>>>
>         >>>
>         >>
>         >
>         >
>         >
>         > --
>         > -Regards,
>         > Mayuresh R. Gharat
>         > (862) 250-7125
>
>
>
