Hi Ted / Guozhang / Matthias,

@Ted: I've now added your argument to the "Rejected Alternatives" portion of the KIP. Please keep in mind that I would like to keep this as backwards compatible as possible, so a lot of decisions are inferred from that intent.

@Guozhang: IMHO, adding expression evaluation to configuration is an incorrect approach. If you absolutely insist on having this clear distinction between header/key, then I would suggest instead to have a dedicated property for the "key" part. Of course, this is your project so I'll just continue with whatever approach moves this KIP forward...

@Matthias: Sorry, but update the KIP according to what?

Cheers,
Luís

On Monday, June 18, 2018, 2:55:17 AM GMT+2, Matthias J. Sax <matth...@confluent.io> wrote:

Well, for "offset" and "timestamp" policy, no communication between both is required. Only if headers are used, user A should communicate the corresponding header key to user B.

@Luis: can you update the KIP accordingly?

-Matthias

On 6/17/18 5:36 PM, Ted Yu wrote:
> My previous reply was just an alternative for consideration.
>
> bq. then a second user B can add a header with key "offset" and thus break
> the intention of user A
>
> I didn't see such scenario after reading the KIP. Maybe add this as
> reasoning for the current approach ?
>
> I wonder how user B gets to know the intention of user A. Meaning, if user
> B doesn't follow the norm set by user A, there still would be issue, right ?
>
>
> On Sun, Jun 17, 2018 at 4:58 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
>> Let me rephrase your answer to make sure I understand what you suggest:
>>
>> If compaction strategy is configured to use "offset", and if there is a
>> header in the record with `key == offset`, then we should use the value
>> of the record header instead of the actual record offset?
>>
>> Do I understand this correctly? If yes, what is the advantage of doing
>> this? From my point of view, it might be problematic, because if user A
>> creates a topic and configures "offset" compaction (with the intent that
>> the record offset should be used), then a second user B can add a header
>> with key "offset" and thus break the intention of user A.
>>
>> Also, if existing topics might have data with record header key
>> "offset", the change would not be backward compatible either.
>>
>>
>> -Matthias
>>
>> On 6/16/18 6:59 PM, Ted Yu wrote:
>>> Pardon the brevity in my previous reply.
>>> I was talking about this bullet:
>>>
>>> bq. When this configuration is set to anything other than "*offset*" or
>>> "*timestamp*", then the record headers are scanned for a key matching this
>>> value.
>>>
>>> My point is that if matching key in the header is found, its value should
>>> take precedence over the value of the configuration.
>>> I understand that such interpretation may have slight performance cost.
>>>
>>> Cheers
>>>
>>> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <matth...@confluent.io>
>>> wrote:
>>>
>>>> Ted,
>>>>
>>>> I am also not sure what you mean by "Shouldn't the selection in header
>>>> have higher precedence over the configuration"? What selection do you
>>>> mean? And what configuration?
>>>>
>>>>
>>>> About the first point, I think this is actually a valid concern: To
>>>> address this issue, it seems that we would need to change the accepted
>>>> format of the config. Instead of "offset", "timestamp", "<header-key>",
>>>> we could replace the last one with "header=<header-key>".
>>>>
>>>> WDYT?
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 6/15/18 3:06 AM, Ted Yu wrote:
>>>>> If selection exists in header, the selection should override the config
>>>>> value.
>>>>> Cheers
>>>>> -------- Original message --------
>>>>> From: Luis Cabral <luis_cab...@yahoo.com.INVALID>
>>>>> Date: 6/15/18 1:40 AM (GMT-08:00)
>>>>> To: dev@kafka.apache.org
>>>>> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>>>>> Hi,
>>>>>
>>>>> bq. Can the value be determined now ? My thinking is that what if there
>>>>> is a third compaction strategy proposed in the future ? We should guard
>>>>> against user unknowingly choosing the 'future' strategy.
>>>>>
>>>>> The idea is that the header name to use is flexible, which protects
>>>>> current clients that may want to use this from having to adapt their
>>>>> already existing header names (they can just specify a new name).
>>>>>
>>>>> bq. Shouldn't the selection in header have higher precedence over the
>>>>> configuration ?
>>>>>
>>>>> Not sure what you mean here, could you clarify?
>>>>>
>>>>> bq. Please create JIRA if you haven't already.
>>>>>
>>>>> Done: https://issues.apache.org/jira/browse/KAFKA-7061
>>>>>
>>>>> Cheers,
>>>>> Luís
>>>>>
>>>>>> On 11 Jun 2018, at 01:50, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>
>>>>>> bq. When this configuration is set to anything other than "*offset*" or
>>>>>> "*timestamp*", then the record headers are scanned for a key matching
>>>>>> this value.
>>>>>>
>>>>>> Can the value be determined now ? My thinking is that what if there is a
>>>>>> third compaction strategy proposed in the future ? We should guard against
>>>>>> user unknowingly choosing the 'future' strategy.
>>>>>>
>>>>>> bq. If this header is found
>>>>>>
>>>>>> Shouldn't the selection in header have higher precedence over the
>>>>>> configuration ?
>>>>>>
>>>>>> Please create JIRA if you haven't already.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral <luis_cab...@yahoo.com.invalid>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Any takers on having a look at this KIP and voting on it?
>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Luis
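
The config format debated in this thread may be easier to follow with a small sketch. The Python below is not Kafka code and is not taken from KIP-280: `CompactionStrategy` and `parse_strategy` are hypothetical names, and the sketch assumes the "header=<header-key>" form that Matthias floated, rather than the bare "<header-key>" form the KIP described at the time. It only illustrates why an explicit prefix keeps a header that happens to be named "offset" from colliding with the built-in "offset" strategy.

```python
from typing import NamedTuple, Optional


class CompactionStrategy(NamedTuple):
    """Hypothetical parsed form of the compaction strategy config value."""
    kind: str                         # "offset", "timestamp", or "header"
    header_key: Optional[str] = None  # set only when kind == "header"


def parse_strategy(value: str) -> CompactionStrategy:
    """Parse a strategy string, assuming the 'header=<header-key>' proposal."""
    # The two built-in strategies are matched literally.
    if value in ("offset", "timestamp"):
        return CompactionStrategy(value)

    # Anything header-based must carry the explicit prefix, so a header key
    # named "offset" is written "header=offset" and cannot be confused with
    # the built-in "offset" strategy.
    prefix = "header="
    if value.startswith(prefix) and len(value) > len(prefix):
        return CompactionStrategy("header", value[len(prefix):])

    # Unknown bare names are rejected instead of being read as header keys,
    # which leaves room for a strategy name added in the future.
    raise ValueError(f"unrecognized compaction strategy: {value!r}")
```

Under this assumption, `parse_strategy("offset")` selects the record offset, while `parse_strategy("header=offset")` selects a header whose key is "offset". Rejecting unknown bare names is one possible answer to Ted's concern about a user unknowingly choosing a 'future' strategy; whether that trade-off against backwards compatibility is acceptable is exactly what the thread leaves open.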