Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-10-22 Thread Luís Cabral
Since this is not moving forward, how about I proceed with the currently documented changes, and any improvements (such as configuration changes) can be taken up afterwards by whoever wants it under a different KIP? On Thursday, October 11, 2018, 9:47:12 AM GMT+2, Luís Cabral wrote

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-10-11 Thread Luís Cabral
force" to special-handle this case and not delete it. > > > > Guozhang > > > > Guozhang > > > On Wed, Aug 29, 2018 at 9:25 AM, Luís Cabral > wrote: > >> Hi all, >> >> Since there has been a rejuvenated interest in this KIP,

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-08-29 Thread Luís Cabral
r many of our projects. > > > > One piece of feedback that I have is that log.cleaner.compaction. > strategy > > and log.cleaner.compaction.strategy.header needs to be per topic.  The > > text of the KIP makes it sound that the config is only available for all > >

RE: [VOTE] KIP-280: Enhanced log compaction

2018-08-16 Thread Luís Cabral
n the hands of users. Have you considered something like that? Thanks, Jason On Sat, Aug 11, 2018 at 2:04 AM, Luís Cabral wrote: > Hi Jason, > > The initial (and still only) requirement I wanted out of this KIP was to > have the header strategy. > This is because I want to be able to

RE: [VOTE] KIP-280: Enhanced log compaction

2018-08-11 Thread Luís Cabral
ost recent snapshot" use case. Thanks, Jason On Thu, Aug 9, 2018 at 4:36 AM, Luís Cabral wrote: > Hi, > > > So, after a "short" break, I've finally managed to find time to resume > this KIP. Sorry to all for the delay. > > Continuing the conversation of the co

Re: [VOTE] KIP-280: Enhanced log compaction

2018-08-09 Thread Luís Cabral
sumer. In order for the > consumer > > > not to take the outdated record, the consumer should cache the deletion > > > tombstone for some configured amount of time. We can probably piggyback > > this > > > on log.cleaner.delete.retention.ms, but we need to document t

RE: [VOTE] KIP-280: Enhanced log compaction

2018-07-04 Thread Luís Cabral
llet point will have to be added to the KIP for this (after I’ve found the time to review the portion of the code that enacts this behaviour). Kind Regards, Luís Cabral From: Jun Rao Sent: 03 July 2018 23:58 To: dev Subject: Re: [VOTE] KIP-280: Enhanced log compaction Hi, Luis, Thanks f

RE: [VOTE] KIP-280: Enhanced log compaction

2018-07-04 Thread Luís Cabral
Hi Jason, There’s a “Motivation” chapter in the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction#KIP-280:Enhancedlogcompaction-Motivation Is it still unclear after reading that? Kind Regards, Luís Cabral From: Jason Gustafson Sent: 03 July 2018 23:45

Re: [VOTE] KIP-280: Enhanced log compaction

2018-07-02 Thread Luís Cabral
P was updated to only include the global configuration properties, removing the per-topic configs. I'll soon update the PR according to the documentation, but I trust the KIP doesn't need that to close, right? Cheers, Luis On Monday, July 2, 2018, 2:00:08 PM GMT+2, Luís Cabral

Re: [VOTE] KIP-280: Enhanced log compaction

2018-07-02 Thread Luís Cabral
rding the KIP itself, both Matthias and myself can recast our votes to > the updated wiki, while we still need one more committer to vote according > to the bylaws. > > > Guozhang > > On Thu, Jun 28, 2018 at 5:38 AM, Luís Cabral > wrote: > >>  Hi, >> >> Th

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-28 Thread Luís Cabral
uring the final stage of this KIP, so are you all alright with me changing the status to [ACCEPTED]? Cheers, Luis On Thursday, June 28, 2018, 2:08:11 PM GMT+2, Ted Yu wrote: +1 On Thu, Jun 28, 2018 at 4:56 AM, Luís Cabral wrote: > Hi Ted, > Can I also get your input on thi

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-28 Thread Luís Cabral
Hi Ted, Can I also get your input on this? bq. +1 from my side for using `compaction.strategy` with values "offset","timestamp" and "header" and `compaction.strategy.header` -Matthias bq. +1 from me as well. -Guozhang  Cheers, Luis

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-25 Thread Luís Cabral
th > `header=`. Using `_timestamp_`, `_offset_`, and `` might be > good enough (even if this is the solution I like least)---for this case, > we should state explicitly, that the whole space of `_*_` is reserved > and users are not allowed to set those for header compaction. In fac

RE: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
rved as "offset" and "timestamp". Guozhang On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral wrote: > Hi Guozhang, > > Yes, that is what I meant (separate configs). > Though I would still prefer to keep it as it is, as it's a much simpler and > cleaner approach – I’m n

RE: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
d? Guozhang On Mon, Jun 18, 2018 at 12:20 AM, Luís Cabral wrote: > Hi Ted / Guozhang / Matthias, > > @Ted: I've now added your argument to the "Rejected Alternatives" portion > of the KIP. Please keep in mind that I would like to keep this as backwards >

Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-18 Thread Luís Cabral
ir >>>> already existing header names (they can just specify a new name). >>>>> >>>>> bq. Shouldn't the selection in header have higher precedence over the >>>> configuration ? >>>>> >>>>> Not sure what y

[VOTE] KIP-280: Enhanced log compaction

2018-06-09 Thread Luís Cabral
Hi all, Any takers on having a look at this KIP and voting on it? https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction Cheers, Luis

[VOTE] KIP-280: Enhanced log compaction

2018-06-04 Thread Luís Cabral
Hi all, After a thorough discussion, this KIP is now ready to go into a vote: KIP-280: Enhanced log compaction - Apache Kafka - Apache Software Foundation Kind Regards, Luís Cabral On

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-28 Thread Luís Cabral
'ed), and maybe you guys can chime in here as well. > > > Guozhang > > > On Tue, May 22, 2018 at 6:45 AM, Luís Cabral > wrote: > >> Hi Matthias / Guozhang, >> >> Were the questions clarified? >> Please feel free to add more feedback, otherwise it wou

RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-22 Thread Luís Cabral
Hi Matthias / Guozhang, Were the questions clarified? Please feel free to add more feedback, otherwise it would be nice to move this topic onwards 😊 Kind Regards, Luís Cabral From: Guozhang Wang Sent: 09 May 2018 20:00 To: dev@kafka.apache.org Subject: Re: [DISCUSS] KIP-280: Enhanced log

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-03 Thread Luís Cabral
pectation for 1) would be much common, and hence worth special handling it to be more effective in cleaning. WDYT? Guozhang On Wed, May 2, 2018 at 2:36 AM, Luís Cabral wrote: >  Hi Guozhang, > > Have you managed to have a look at my reply? > How do you feel about this? >

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-05-02 Thread Luís Cabral
Hi Guozhang, Have you managed to have a look at my reply? How do you feel about this? Kind Regards, Luís Cabral On Monday, April 30, 2018, 9:27:15 AM GMT+2, Luís Cabral wrote: Hi Guozhang, I understand the argument, but this is a hazardous compromise for using Kafka as an event

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-30 Thread Luís Cabral
versioned record is appended it will be deleted. Does that make sense? Guozhang On Fri, Apr 27, 2018 at 4:20 AM, Luís Cabral wrote: >  Hi, > > I was updating the PR to match the latest decisions and noticed (or > rather, the integration tests noticed) that without storing the offset, &

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-27 Thread Luís Cabral
t in the cache, unfortunately, and the binary footprint increases two-fold when "offset" is not used as a compaction strategy. Guozhang: Is it ok with you if we go back on this decision and leave the offset as a tie-breaker? Kind Regards, Luis On Friday, April 27, 2018, 11:11:55

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-27 Thread Luís Cabral
makes sense. I'm happy to take my suggestion back and enforce only long typed fields. Guozhang On Thu, Apr 26, 2018 at 1:44 AM, Luís Cabral wrote: >  Hi, > > bq. have an integer typed OffsetMap (for offset) > > Offset is an integer? I've only noticed it being resolv

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-26 Thread Luís Cabral
Hi, bq. have an integer typed OffsetMap (for offset) Offset is an integer? I've only noticed it being resolved as a long so far. bq. long typed OffsetMap (for timestamp) We would still need to store the offset, as it is functioning as a tie-breaker.  Not that this is a big deal, we can be easi

Re: RE: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-24 Thread Luís Cabral
it to be 8 bytes). Do you have any suggestions on how to handle this issue there? Kind Regards, Luis On Tuesday, April 24, 2018, 1:11:11 AM GMT+2, Luís Cabral wrote:

RE: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-23 Thread Luís Cabral
; and their corresponding byte arrays are considered equal, then the record with the highest offset will be kept;` Guozhang On Mon, Apr 23, 2018 at 1:54 PM, Luís Cabral wrote: > Hello Guozhang, > > The KIP is now updated to reflect this choice in strategy. > Please let me know

RE: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-23 Thread Luís Cabral
ain, as I said, we push this responsibility to users to define the right serde mechanism, but that seems to be more flexible). For example: -INF serialized to 0x, -INF+1 serialized to 0x0001, etc. Guozhang On Mon, Apr 23, 2018 at 10:19 AM, Luís Cabral wrote: > Hello Guozhang,

RE: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-23 Thread Luís Cabral
rding to > the byte array lexico-ordering to fulfill their ordering semantics. It is > more flexible to enforce users to encode their compaction field always as a > long type. Let me know WDYT. > > > > Also I have some minor comments on the wiki itself: > > 1) "Wh

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-23 Thread Luís Cabral
eld always as a long type. Let me know WDYT. Also I have some minor comments on the wiki itself: 1) "When both records being compared contain a matching "compaction value", then the record with the highest offset will be kept;" Should it be "the record with the highe
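
The tie-break rule quoted from the wiki above can be sketched in Java as follows. This is only an illustration, not the code in the pull request: the class `CompactionTieBreak`, the type `Rec`, and the method `keep` are hypothetical names, and the assumption that the larger "compaction value" wins when the values differ is inferred from the thread rather than stated in the quoted text.

```java
public class CompactionTieBreak {

    /** Hypothetical, minimal view of a record: just the two fields being compared. */
    static final class Rec {
        final long offset;          // position of the record in the log
        final long compactionValue; // e.g. a version decoded from a header (assumed to be a long)

        Rec(long offset, long compactionValue) {
            this.offset = offset;
            this.compactionValue = compactionValue;
        }
    }

    /**
     * Picks which of two records with the same key is retained.
     * Assumption: the larger compaction value wins; when the values match,
     * the record with the highest offset is kept (the tie-breaker).
     */
    static Rec keep(Rec a, Rec b) {
        if (a.compactionValue != b.compactionValue) {
            return a.compactionValue > b.compactionValue ? a : b;
        }
        return a.offset >= b.offset ? a : b;
    }
}
```

With this rule a record at a lower offset can survive a later one if it carries the larger compaction value, which is the behaviour discussed elsewhere in the thread.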

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-20 Thread Luís Cabral
Guozhang, is this reply ok with you? If you insist on the byte[] comparison directly, then I would need some suggestions on how to represent a "version" with it, and then the KIP could be changed to that. On Tuesday, April 17, 2018, 2:44:16 PM GMT+2, Luís Cabral wrote: Oo

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-17 Thread Luís Cabral
Oops, missed that email... bq. It is because when we compare the bytes we do not treat them as longs at all, so we just compare them based on bytes; I admit that if users' header types have some semantic meanings (e.g. it is encoded from a long) we are forcing them to choose the encoder that

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-17 Thread Luís Cabral
Hi all, There aren't that many discussions on this KIP, does that mean it should now move to voting? I'm not sure on the process here... Cheers

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-11 Thread Luís Cabral
Hi Guozhang, bq. I'm not sure I understand your statement that it is used to determine the "version" of the record I do not mean that it is "used", but if what you meant is that you would prefer to use that field instead of a header? This is in relation to a previous point of yours: >>> 1) I'm

Re: RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-11 Thread Luís Cabral
Hi all, On my own previous statement: bq. Not that I mind doing it directly (I intend to use a Java client), but please be aware that a String binary representation is based on the charset encoding, while the Long binary representation varies according to the language. I went back to double
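
As a point of reference for the encoding concern raised here, a fixed 8-byte big-endian representation of a long can be produced in Java as sketched below. This is an assumption about one reasonable client-side convention, not the format the KIP mandates; `LongHeaderCodec`, `encode`, and `decode` are hypothetical names.

```java
import java.nio.ByteBuffer;

public class LongHeaderCodec {

    /** Encodes a long as exactly 8 bytes, most significant byte first (ByteBuffer's default order). */
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Decodes an 8-byte big-endian array back into a long; rejects any other length. */
    static long decode(byte[] bytes) {
        if (bytes == null || bytes.length != Long.BYTES) {
            throw new IllegalArgumentException("expected exactly " + Long.BYTES + " bytes");
        }
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

A client in another language would need to write the same byte order explicitly for the header values to be comparable.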

RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-10 Thread Luís Cabral
on about this. > > > > It might also be good, to elaborate why you suggest "long" for the > > compaction value in the KIP itself. > > > > One more thought: the KIP basically allows, that a record with larger > > offset is deleted while a record with smaller

RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-09 Thread Luís Cabral
he value -- but I am not sure if we would need this flexibility. It also makes serializing the value on the client side more complex. Whatever the decision is, the KIP should explain what value format is expected in detail. -Matthias On 4/9/18 2:20 AM, Luís Cabral wrote: > Hi, > > >

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-09 Thread Luís Cabral
In the pull request used with >> the KIP you can see that it is indeed using the offset as a tie-breaker in >> case the header values are the same. >> I’ll make this clear by adding it as part of the proposed changes. Think you forgot to actually add this case. :) One more

RE: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-08 Thread Luís Cabral
at should the behavior be, if a message does not encode the "compaction key" in the header? -Matthias On 4/5/18 11:59 PM, Luís Cabral wrote: > > Thank you very much for taking the time to read it. > > bq. In the 'Proposed Changes' section, can you expa

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-06 Thread Luís Cabral
- it is better to avoid collision. Cheers On Thu, Apr 5, 2018 at 2:05 AM, Luís Cabral wrote: > > This is embarrassingly hard to fix... going again... > > KIP-280:  https://cwiki.apache.org/confluence/display/ > KAFKA/KIP-280%3A+Enhanced+log+compaction > - > Pull-4822: 

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-05 Thread Luís Cabral
This is embarrassingly hard to fix... going again... KIP-280:   https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction - Pull-4822:  https://github.com/apache/kafka/pull/4822 On Thursday, April 5, 2018, 11:03:22 AM GMT+2, Luís Cabral wrote

Re: [DISCUSS] KIP-280: Enhanced log compaction

2018-04-05 Thread Luís Cabral
Fixing the links: KIP-280: https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction Pull-4822: https://github.com/apache/kafka/pull/4822 On 2018/04/05 08:44:00, Luís Cabral wrote: > Hello all, > > Starting a discussion for this feature. > > KIP-2

[DISCUSS] KIP-280: Enhanced log compaction

2018-04-05 Thread Luís Cabral
Hello all, Starting a discussion for this feature. KIP-280: https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction Pull-4822: https://github.com/apache/kafka/pull/4822 Kind Regards, Luís

Permissions to create a KIP

2018-04-04 Thread Luís Cabral
Hi, As advised on the proposed feature submission referenced below, I seem to first need to create a KIP on your confluence page. Could you kindly grant me the required privileges to create a KIP? ID: blaghed Email: Thank you in advance! Kind Regards, Luís Cabral Ref.: https://github.com/apache

Permissions to create a KIP

2018-04-04 Thread Luís Cabral
As advised on the proposed feature submission "https://github.com/apache/kafka/pull/4822", I seem to first need to create a KIP on your confluence page. Could you kindly grant me the required privileges to create a KIP? https://cwiki.apache.org/confluence/users/viewuserprofile.action?username=

Permissions to create a KIP

2018-04-04 Thread Luís Cabral
As advised on the proposed feature submission referenced below, I seem to first need to create a KIP on your confluence page. Could you kindly grant me the required privileges to create a KIP? Wiki ID: blaghed Thank you in advance! Ref.: https://github.com/apache/kafka/pull/4822