Peter, Wesley, thanks for your use cases.

There is a KIP discussion about adding a timestamp-based log deletion
policy to Kafka alongside compaction, and I'm wondering whether it makes
sense to enable both log deletion and log compaction for the general
case of changelog data with expirations. Please take a look at the wiki
page and discussion thread, and feel free to leave comments on the email
thread if you think it could fit your needs.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy

https://www.mail-archive.com/dev@kafka.apache.org/msg49573.html
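
To illustrate what "enabling both" could look like at the topic level: a
hypothetical configuration sketch, assuming a cleanup.policy that accepts
both values as discussed in the KIP (the retention value is illustrative,
not a recommendation):

```properties
# Hypothetical topic config if deletion and compaction can be combined:
cleanup.policy=compact,delete
retention.ms=604800000    # e.g. drop segments older than 7 days
```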

Guozhang

On Fri, May 13, 2016 at 6:52 AM, Wesley Chow <w...@chartbeat.com> wrote:

> Yes, also classic caching, where you might use memcache with TTLs.
>
> But a different use case for us is sessionizing. We push a high rate of
> updates coming from a browser session to our Kafka cluster. If we don’t see
> an update for a particular session after some period of time, we say that
> session has expired and want to delete it. Compacted logs seem great for
> this, however without TTLs we’d have to consume these updates to figure out
> when to expire the session. I can go into more detail if that’s not clear.
>
> The general case here is that sometimes you want a kv store that doesn’t
> exceed some resource bound. In the case of caching, you may not want to
> exceed some time bound, but you may also not want to exceed some space
> bound. You can totally deal with these bounds with a consumer, but if the
> rate of updates to the keys is high then this could be an expensive
> proposition. In the case of my sessionizing problem, consuming that data to
> deal with expirations can easily add tens of thousands of dollars in
> inter-AZ costs per year (not to mention the servers to run the extra
> consumers), so having it taken care of in the brokers is actually very
> useful.
>
> Wes
>
>
> > On May 12, 2016, at 8:25 PM, Peter Davis <davi...@gmail.com> wrote:
> >
> > One use case is implementing a data retention policy.
> >
> > -Peter
> >
> >
> >> On May 12, 2016, at 17:11, Guozhang Wang <wangg...@gmail.com> wrote:
> >>
> >> Wesley,
> >>
> >> Could you describe your use case a bit more to motivate this? Is your
> >> data source expiring records, and hence you want to auto-"delete" the
> >> corresponding Kafka records as well?
> >>
> >> Guozhang
> >>
> >>> On Thu, May 12, 2016 at 2:35 PM, Wesley Chow <w...@chartbeat.com> wrote:
> >>>
> >>> Right, I’m trying to avoid explicitly managing TTLs. It’s nice being able
> >>> to just produce keys into Kafka without having an accompanying vacuum
> >>> consumer.
> >>>
> >>> Wes
> >>>
> >>>
> >>>> On May 12, 2016, at 5:15 PM, Benjamin Manns <benma...@gmail.com> wrote:
> >>>>
> >>>> If you send a NULL value to a compacted log, after the retention period
> >>>> it will be removed. You could run a process that reprocesses the log and
> >>>> sends a NULL to keys you want to purge based on some custom logic.
> >>>>
> >>>> On Thu, May 12, 2016 at 2:01 PM, Wesley Chow <w...@chartbeat.com> wrote:
> >>>>
> >>>>> Are there any thoughts on supporting TTLs on keys in compacted logs? In
> >>>>> other words, some way to set, on a per-key basis, a time to auto-delete.
> >>>>>
> >>>>> Wes
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Benjamin Manns
> >>>> benma...@gmail.com
> >>>> (434) 321-8324
> >>
> >>
> >> --
> >> -- Guozhang
>
>


-- 
-- Guozhang
