Becket,

Since you submitted KIP-33, are you actively working on that? If so, it
would make sense to implement KIP-47 after KIP-33 so that it works for both
CreateTime and LogAppendTime.

Thanks,

Jun

On Fri, Feb 19, 2016 at 6:25 PM, Bill Warshaw <wdwars...@gmail.com> wrote:

> Hi Jun,
>
> 1. I thought more about Andrew's comment about LogAppendTime. The
> time-based index you are referring to is associated with KIP-33, correct?
> Currently my implementation is just checking the last message in a segment,
> so we're restricted to LogAppendTime. When the work for KIP-33 is
> completed, it sounds like CreateTime would also be valid. Do you happen to
> know if anyone is currently working on KIP-33?
>
> 2. I did update the wiki after reading your original comment, but reading
> over it again I realize I could word a couple things more clearly. I will
> do that tonight.
>
> Bill
>
> On Fri, Feb 19, 2016 at 7:02 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Bill,
> >
> > I replied with the following comments earlier to the thread. Did you see
> > that?
> >
> > Thanks for the proposal. A couple of comments.
> >
> > 1. It seems that this new policy should work for CreateTime as well. If a
> > topic is configured with CreateTime, messages may not be added in strict
> > order in the log. However, to build a time-based index, we will be
> > maintaining the largest timestamp for all messages in a log segment. We can
> > delete a segment if its largest timestamp is less than
> > log.retention.min.timestamp. This guarantees that no messages newer than
> > log.retention.min.timestamp will be deleted, which is probably what the
> > user wants.
> >
> > 2. Right now, the user can specify "delete" as the retention policy and a
> > log segment will be deleted either when the size of a partition exceeds a
> > threshold or the timestamp of a segment is older than a relative period of
> > time (say 7 days) from now. What you are proposing is not a new retention
> > policy, but an additional check that will cause a segment to be deleted
> > when the timestamp of a segment is older than an absolute timestamp? If so,
> > could you update the wiki accordingly?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
> >
> > > Hello all,
> > >
> > > What is the next step with this proposal? The work for KIP-32 that it was
> > > based off merged earlier today (https://github.com/apache/kafka/pull/764,
> > > thank you Becket). I have an implementation with tests, and I've confirmed
> > > that it actually works in a live system. Is there more discussion that
> > > needs to be had about this KIP, or should I start a VOTE thread?
> > >
> > > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > > > Bill,
> > > >
> > > > Thanks for the proposal. A couple of comments.
> > > >
> > > > 1. It seems that this new policy should work for CreateTime as well. If a
> > > > topic is configured with CreateTime, messages may not be added in strict
> > > > order in the log. However, to build a time-based index, we will be
> > > > maintaining the largest timestamp for all messages in a log segment. We can
> > > > delete a segment if its largest timestamp is less than
> > > > log.retention.min.timestamp. This guarantees that no messages newer than
> > > > log.retention.min.timestamp will be deleted, which is probably what the
> > > > user wants.
> > > >
> > > > 2. Right now, the user can specify "delete" as the retention policy and a
> > > > log segment will be deleted either when the size of a partition exceeds a
> > > > threshold or the timestamp of a segment is older than a relative period of
> > > > time (say 7 days) from now. What you are proposing is not a new retention
> > > > policy, but an additional check that will cause a segment to be deleted
> > > > when the timestamp of a segment is older than an absolute timestamp? If so,
> > > > could you update the wiki accordingly?
> > > >
> > > > Jun
> > > >
> > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > That is a good catch, thanks for pointing it out. If this KIP is accepted,
> > > > > we'd need to document this and make the log cleaner not run timestamp-based
> > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > >
> > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > >
> > > > > > This KIP is related to KIP-32, but it strikes me that it only makes sense
> > > > > > with one of the two proposed message timestamp types. If I understand
> > > > > > correctly, message timestamps are only certain to be monotonically
> > > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > > >
> > > > > > Does timestamp-based auto-expiration require use of
> > > > > > message.timestamp.type=LogAppendTime?
> > > > > >
> > > > > > I think this KIP is a good idea, but I think it relies on strict ordering
> > > > > > of timestamps to be workable.
> > > > > >
> > > > > > Andrew Schofield
> > > > > >
> > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > > > > > > From: n...@confluent.io
> > > > > > > To: dev@kafka.apache.org
> > > > > > >
> > > > > > > Adding a timestamp based auto-expiration is useful and this proposal makes
> > > > > > > sense. Thx!
> > > > > > >
> > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps wrote:
> > > > > > >
> > > > > > >> I think this makes a lot of sense and won't be hard to implement and
> > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > >>
> > > > > > >> -Jay
> > > > > > >>
> > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw wrote:
> > > > > > >>
> > > > > > >>> Hello,
> > > > > > >>>
> > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > >>> minimum timestamp of messages to retain.
> > > > > > >>>
> > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > >>>
> > > > > > >>> I'm open to any comments or suggestions.
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Bill Warshaw
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > > Neha
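
For readers following the thread, here is a minimal sketch of the two segment checks discussed above: deleting by a segment's largest timestamp (Jun's suggestion, which relies on the per-segment maximum that a time-based index would maintain) versus deleting by the last message's timestamp (what Bill describes as his current approach). This is not Kafka code. `Segment`, `deletable_by_largest`, and `deletable_by_last` are hypothetical names made up for illustration, segments are assumed to be non-empty, and the `cutoff_ms` argument stands in for the proposed log.retention.min.timestamp setting.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class).

    timestamps_ms holds message timestamps in log order; assumed non-empty.
    """
    base_offset: int
    timestamps_ms: List[int]

    @property
    def largest_timestamp_ms(self) -> int:
        return max(self.timestamps_ms)

    @property
    def last_timestamp_ms(self) -> int:
        return self.timestamps_ms[-1]


def deletable_by_largest(segments: List[Segment], cutoff_ms: int) -> List[Segment]:
    """Segments whose largest timestamp is below the cutoff.

    No message at or after the cutoff can be in a returned segment,
    even if timestamps are out of order (the CreateTime case).
    """
    return [s for s in segments if s.largest_timestamp_ms < cutoff_ms]


def deletable_by_last(segments: List[Segment], cutoff_ms: int) -> List[Segment]:
    """Segments whose last message is below the cutoff.

    Only safe when timestamps never decrease within the log
    (the LogAppendTime case).
    """
    return [s for s in segments if s.last_timestamp_ms < cutoff_ms]


segments = [
    Segment(0, [10, 20, 30]),
    Segment(3, [40, 95, 45]),     # out-of-order timestamps, as CreateTime allows
    Segment(6, [100, 110, 120]),
]
cutoff_ms = 50

print([s.base_offset for s in deletable_by_largest(segments, cutoff_ms)])
print([s.base_offset for s in deletable_by_last(segments, cutoff_ms)])
```

With the out-of-order segment at base offset 3, the last-message check would select it for deletion even though it holds a message with timestamp 95, which is newer than the cutoff. The largest-timestamp check keeps that segment, which matches the guarantee described in the thread.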