Bumping this thread so Wes can reply to it. Ignore this mail.

2016-02-24 0:36 GMT+01:00 Joel Koshy <jjkosh...@gmail.com>:

> Great - thanks for clarifying.
>
> Joel
>
> On Tue, Feb 23, 2016 at 1:47 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
>
> > Sorry that I didn't see this comment before the meeting, Joel.  I'll try to
> > clarify what I said at the meeting:
> >
> > - The KIP currently states that timestamp-based log deletion will only work
> > with LogAppendTime.  I need to update the KIP to reflect that, after the
> > work is done for KIP-33, it will work with both LogAppendTime and
> > CreateTime.
> > - To use the existing time-based retention mechanism to delete a precise
> > range of messages, a client application would need to do the following:
> >   - by default, turn off these retention mechanisms
> >   - when the application wishes to delete a range of messages which were
> > sent before a certain time, compute an approximate value to set
> > "log.retention.minutes" to, to create a window of messages based on that
> > timestamp that are ok to delete.  There is some degree of imprecision
> > implied here.
> >   - wait until we are confident that the log retention mechanism has been
> > run and deleted any stale segments
> >   - reset "log.retention.minutes" to turn off time-based log retention
> > until the next time the client application wants to delete something
> >
> > - To use the proposed timestamp-based retention mechanism, there is only
> > one step: the application just has to set "log.retention.min.timestamp" to
> > whatever time boundary it deems fit.  It doesn't need to compute any fuzzy
> > windows, try to wait until asynchronous processes have been completed or
> > continually flip settings between enabled and disabled.
> >
> > I will update the KIP to reflect the discussion around LogAppendTime vs
> > CreateTime and the work being done in KIP-33.
> >
> > Thanks,
> > Bill
> >
> >
> > On Tue, Feb 23, 2016 at 1:22 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> >
> > > I'm having some trouble reconciling the current proposal with your original
> > > requirement, which was essentially being able to purge log data up to a
> > > precise point (an offset). The KIP currently suggests that timestamp-based
> > > deletion would only work with LogAppendTime, so it does not seem
> > > significantly different from time-based retention (after KIP-32/33) - IOW
> > > to me it appears that you would need to use CreateTime and not
> > > LogAppendTime. Also, one of the rejected alternatives observes that changing
> > > the existing configuration settings to try to flush ranges of a given
> > > partition's log is problematic, but it seems to me you would have to do
> > > this with timestamp-based deletion as well, right? I think it would be
> > > useful for me if you or anyone else can go over the exact
> > > mechanics/workflow for accomplishing precise purges at today's KIP meeting.
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Monday, February 22, 2016, Bill Warshaw <wdwars...@gmail.com> wrote:
> > >
> > > > Sounds good.  I'll hold off on sending out a VOTE thread until after the
> > > > KIP meeting tomorrow.
> > > >
> > > > On Mon, Feb 22, 2016 at 12:56 PM, Becket Qin <becket....@gmail.com> wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > I think it makes sense to implement KIP-47 after KIP-33 so we can make it
> > > > > work for both LogAppendTime and CreateTime.
> > > > >
> > > > > And yes, I'm actively working on KIP-33. I had a voting thread on KIP-33
> > > > > before and I'll bump it up.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Feb 22, 2016 at 9:11 AM, Jun Rao <j...@confluent.io> wrote:
> > > > >
> > > > > > Becket,
> > > > > >
> > > > > > Since you submitted KIP-33, are you actively working on that? If so, it
> > > > > > would make sense to implement KIP-47 after KIP-33 so that it works for both
> > > > > > CreateTime and LogAppendTime.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 19, 2016 at 6:25 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > >
> > > > > > > 1.  I thought more about Andrew's comment about LogAppendTime.  The
> > > > > > > time-based index you are referring to is associated with KIP-33, correct?
> > > > > > > Currently my implementation is just checking the last message in a segment,
> > > > > > > so we're restricted to LogAppendTime.  When the work for KIP-33 is
> > > > > > > completed, it sounds like CreateTime would also be valid.  Do you happen to
> > > > > > > know if anyone is currently working on KIP-33?
> > > > > > >
> > > > > > > 2. I did update the wiki after reading your original comment, but reading
> > > > > > > over it again I realize I could word a couple things more clearly.  I will
> > > > > > > do that tonight.
> > > > > > >
> > > > > > > Bill
> > > > > > >
> > > > > > > On Fri, Feb 19, 2016 at 7:02 PM, Jun Rao <j...@confluent.io> wrote:
> > > > > > >
> > > > > > > > Hi, Bill,
> > > > > > > >
> > > > > > > > I replied with the following comments earlier to the thread. Did you see
> > > > > > > > that?
> > > > > > > >
> > > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > > >
> > > > > > > > 1. It seems that this new policy should work for CreateTime as well. If a
> > > > > > > > topic is configured with CreateTime, messages may not be added in strict
> > > > > > > > order in the log. However, to build a time-based index, we will be
> > > > > > > > maintaining the largest timestamp for all messages in a log segment. We can
> > > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > > log.retention.min.timestamp. This guarantees that no messages newer than
> > > > > > > > log.retention.min.timestamp will be deleted, which is probably what the
> > > > > > > > user wants.
> > > > > > > >
> > > > > > > > 2. Right now, the user can specify "delete" as the retention policy and a
> > > > > > > > log segment will be deleted either when the size of a partition exceeds a
> > > > > > > > threshold or the timestamp of a segment is older than a relative period of
> > > > > > > > time (say 7 days) from now. What you are proposing is not a new retention
> > > > > > > > policy, but an additional check that will cause a segment to be deleted
> > > > > > > > when the timestamp of a segment is older than an absolute timestamp? If so,
> > > > > > > > could you update the wiki accordingly?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Feb 19, 2016 at 2:57 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hello all,
> > > > > > > > >
> > > > > > > > > What is the next step with this proposal?  The work for KIP-32 that it was
> > > > > > > > > based off merged earlier today (https://github.com/apache/kafka/pull/764,
> > > > > > > > > thank you Becket).  I have an implementation with tests, and I've confirmed
> > > > > > > > > that it actually works in a live system.  Is there more discussion that
> > > > > > > > > needs to be had about this KIP, or should I start a VOTE thread?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Feb 16, 2016 at 5:06 PM, Jun Rao <j...@confluent.io> wrote:
> > > > > > > > >
> > > > > > > > > > Bill,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal. A couple of comments.
> > > > > > > > > >
> > > > > > > > > > 1. It seems that this new policy should work for CreateTime as well. If a
> > > > > > > > > > topic is configured with CreateTime, messages may not be added in strict
> > > > > > > > > > order in the log. However, to build a time-based index, we will be
> > > > > > > > > > maintaining the largest timestamp for all messages in a log segment. We can
> > > > > > > > > > delete a segment if its largest timestamp is less than
> > > > > > > > > > log.retention.min.timestamp. This guarantees that no messages newer than
> > > > > > > > > > log.retention.min.timestamp will be deleted, which is probably what the
> > > > > > > > > > user wants.
> > > > > > > > > >
> > > > > > > > > > 2. Right now, the user can specify "delete" as the retention policy and a
> > > > > > > > > > log segment will be deleted either when the size of a partition exceeds a
> > > > > > > > > > threshold or the timestamp of a segment is older than a relative period of
> > > > > > > > > > time (say 7 days) from now. What you are proposing is not a new retention
> > > > > > > > > > policy, but an additional check that will cause a segment to be deleted
> > > > > > > > > > when the timestamp of a segment is older than an absolute timestamp? If so,
> > > > > > > > > > could you update the wiki accordingly?
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sat, Feb 13, 2016 at 3:23 PM, Bill Warshaw <wdwars...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello,
> > > > > > > > > > >
> > > > > > > > > > > That is a good catch, thanks for pointing it out.  If this KIP is accepted,
> > > > > > > > > > > we'd need to document this and make the log cleaner not run timestamp-based
> > > > > > > > > > > deletion unless message.timestamp.type=LogAppendTime.
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Feb 13, 2016 at 5:38 AM, Andrew Schofield <
> > > > > > > > > > > andrew_schofield_j...@outlook.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > This KIP is related to KIP-32, but it strikes me that it only makes sense
> > > > > > > > > > > > with one of the two proposed message timestamp types. If I understand
> > > > > > > > > > > > correctly, message timestamps are only certain to be monotonically
> > > > > > > > > > > > increasing in the log if message.timestamp.type=LogAppendTime.
> > > > > > > > > > > >
> > > > > > > > > > > > Does timestamp-based auto-expiration require use of
> > > > > > > > > > > > message.timestamp.type=LogAppendTime?
> > > > > > > > > > > >
> > > > > > > > > > > > I think this KIP is a good idea, but I think it relies on strict ordering
> > > > > > > > > > > > of timestamps to be workable.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Andrew Schofield
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Date: Fri, 12 Feb 2016 10:38:46 -0800
> > > > > > > > > > > > > Subject: Re: [DISCUSS] KIP-47 - Add timestamp-based log deletion policy
> > > > > > > > > > > > > From: n...@confluent.io
> > > > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > > >
> > > > > > > > > > > > > Adding a timestamp based auto-expiration is useful and this proposal makes
> > > > > > > > > > > > > sense. Thx!
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Feb 10, 2016 at 3:35 PM, Jay Kreps  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> I think this makes a lot of sense and won't be hard to implement and
> > > > > > > > > > > > >> doesn't create too much in the way of new interfaces.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Tue, Feb 9, 2016 at 8:13 AM, Bill Warshaw wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>> Hello,
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> I just submitted KIP-47 for adding a new log deletion policy based on a
> > > > > > > > > > > > >>> minimum timestamp of messages to retain.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> I'm open to any comments or suggestions.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> Thanks,
> > > > > > > > > > > > >>> Bill Warshaw
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Neha
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
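
For anyone skimming the quoted thread: the segment-level check Jun describes
(a segment may be deleted once its largest timestamp is less than
log.retention.min.timestamp) can be sketched roughly as below. This is only
an illustration in Python, not Kafka's broker code. The Segment class, its
fields, and deletable_segments are hypothetical names made up for the sketch,
and it does not model whether a broker would restrict deletion to the oldest
contiguous segments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class)."""
    base_offset: int
    # Message timestamps in ms. Under CreateTime these may be out of order.
    timestamps: List[int]

    @property
    def largest_timestamp(self) -> int:
        # The thread assumes the time-based index tracks this per segment.
        return max(self.timestamps)


def deletable_segments(segments: List[Segment], min_timestamp: int) -> List[Segment]:
    """Return segments whose largest timestamp is strictly below min_timestamp.

    Because the comparison uses the largest timestamp in each segment, no
    message with a timestamp >= min_timestamp is ever selected for deletion,
    even when timestamps inside a segment are not monotonically increasing.
    """
    return [
        s for s in segments
        if s.timestamps and s.largest_timestamp < min_timestamp
    ]


if __name__ == "__main__":
    log = [
        Segment(0, [100, 90, 120]),   # out-of-order CreateTime-style timestamps
        Segment(3, [110, 200, 150]),
        Segment(6, [300, 310]),
    ]
    # min_timestamp plays the role of log.retention.min.timestamp.
    print([s.base_offset for s in deletable_segments(log, 250)])
```

With a boundary of 250, the first two segments qualify (largest timestamps
120 and 200) and the third is kept. A segment that mixes older and newer
messages is retained whole, which is the conservative behaviour discussed in
the thread.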
