Re: Pluggable Log Compaction Policy

2016-02-04 Thread Bill Warshaw
Becket, I put together a commit with a timestamp-based deletion policy at https://github.com/apache/kafka/commit/2c51ae3cead99432ebf19f0303f8cc797723c939. Is this a small enough change that you'd be comfortable incorporating it into your work on KIP-32, or do I need to open a separate KIP? Thanks, B

Re: Pluggable Log Compaction Policy

2016-02-01 Thread Becket Qin
Hi Bill, The PR is still under review. It might take some more time because it touches a bunch of files. You can watch KAFKA-3025; once it gets closed you will get an email notification. Looking forward to your tool. Thanks, Jiangjie (Becket) Qin

Re: Pluggable Log Compaction Policy

2016-02-01 Thread Bill Warshaw
Becket, I took a look at KIP-32 and your PR for it. This looks like something that would be great to build off of; I'm envisioning a timestamp-based policy where the client application sets a minimum timestamp, before which everything can be deleted / compacted. How far along is this pull request?

Re: Pluggable Log Compaction Policy

2016-01-31 Thread Gwen Shapira
You should be good to go now.

Re: Pluggable Log Compaction Policy

2016-01-31 Thread Bill Warshaw
wdwars...@gmail.com Bill Warshaw Thanks!

Re: Pluggable Log Compaction Policy

2016-01-31 Thread Gwen Shapira
What is your wiki user name?

Re: Pluggable Log Compaction Policy

2016-01-30 Thread Bill Warshaw
Hello again, Is there anyone on this thread who has admin access to the Kafka Confluence wiki? I want to create a KIP but I don't have permissions to actually create a page. Bill

Re: Pluggable Log Compaction Policy

2016-01-22 Thread Guozhang Wang
Bill, Sounds good. If you want to drive pushing this feature, you can try to first submit a KIP proposal: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals This admin command may have some correlations with KIP-4: https://cwiki.apache.org/confluence/display/KAFKA/KIP

Re: Pluggable Log Compaction Policy

2016-01-22 Thread Bill Warshaw
A function such as "deleteUpToOffset(TopicPartition tp, long minOffsetToRetain)" exposed through AdminUtils would be perfect. I would agree that a one-time admin tool would be a good fit for our use case, as long as we can programmatically invoke it. I realize that isn't completely trivial, since
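The semantics Bill is asking for can be sketched in a few lines. This is a hypothetical model, not Kafka's actual AdminUtils API: the log is stood in for by an offset-to-record map, and `deleteUpToOffset` drops everything below the client-supplied watermark.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of the proposed admin operation. The method name
// deleteUpToOffset comes from the thread; the in-memory log model is
// illustrative only and is not Kafka's LogCleaner.
public class LogTruncationSketch {
    public static NavigableMap<Long, String> deleteUpToOffset(
            NavigableMap<Long, String> log, long minOffsetToRetain) {
        NavigableMap<Long, String> copy = new TreeMap<>(log);
        // headMap(key, false) views the strictly-lower offsets; clearing it
        // removes offsets < minOffsetToRetain and keeps the rest.
        copy.headMap(minOffsetToRetain, false).clear();
        return copy;
    }
}
```

Re-invoking it with the same watermark is a no-op, which is why Guozhang suggests a one-time admin tool rather than a resident policy.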

Re: Pluggable Log Compaction Policy

2016-01-21 Thread Becket Qin
I agree with Guozhang that this seems better suited to a separate tool. Also, I am wondering if KIP-32 can be used here. We can have a timestamp-based compaction policy if needed, for example, keep any message whose timestamp is greater than (MaxTimestamp - 24 hours). Jiangjie (Becket) Qin
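The retention rule Becket describes reduces to a single comparison. A minimal sketch, with class and method names invented for illustration:

```java
// Sketch of the timestamp-based policy from the thread: retain a message
// only if its timestamp falls within the last 24 hours of the partition's
// maximum observed timestamp. Names here are ours, not Kafka's.
public class TimestampRetentionSketch {
    static final long RETENTION_MS = 24L * 60 * 60 * 1000; // 24 hours

    public static boolean shouldRetain(long messageTimestampMs, long maxTimestampMs) {
        return messageTimestampMs > maxTimestampMs - RETENTION_MS;
    }
}
```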

Re: Pluggable Log Compaction Policy

2016-01-21 Thread Guozhang Wang
Bill, For your case, since once the log is cleaned up to the given offset watermark (or threshold, whatever the name is), future cleaning with the same watermark will effectively be a no-op, I feel your scenario is a better fit for a one-time admin tool to clean up the logs rather than a customized compaction policy.

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Bill Warshaw
For our particular use case, we would need to. This proposal is really two separate pieces: custom log compaction policy, and the ability to set arbitrary key-value pairs in a Topic configuration. I believe that Kafka's current behavior of throwing errors when it encounters configuration keys th

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Guozhang Wang
So do you need to periodically update the key-value pairs to "advance the threshold for each topic"? Guozhang

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Bill Warshaw
Compaction would be performed in the same manner as it is currently. There is a predicate applied in the "shouldRetainMessage" function in LogCleaner; ultimately we just want to be able to swap a custom implementation of that particular method in. Nothing else in the compaction codepath would need to change.
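The pluggability Bill describes could be factored behind a small interface. This is a sketch of the idea only; the interface and its parameters are illustrative and do not reflect Kafka's actual LogCleaner internals:

```java
// Hypothetical seam for a pluggable retention decision: LogCleaner would
// delegate its shouldRetainMessage check to a policy object, so a custom
// implementation can be swapped in without touching the rest of the
// compaction codepath.
public class RetentionPolicySketch {
    interface RetentionPolicy {
        boolean shouldRetainMessage(long offset, long minOffsetToRetain);
    }

    // Example policy for Bill's case: delete everything below a
    // client-supplied offset watermark.
    static final RetentionPolicy OFFSET_WATERMARK =
            (offset, minOffsetToRetain) -> offset >= minOffsetToRetain;
}
```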

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Guozhang Wang
Hello Bill, Just to clarify your use case: is your "log compaction" executed manually, or is it triggered periodically like the current by-key log cleaning? If it is the latter, how will you advance the "threshold transaction_id" each time it executes? Guozhang

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Bill Warshaw
Damian, I appreciate your quick response. Our transaction_id is incrementing for each transaction, so we will only ever have one message in Kafka with a given transaction_id. We thought about using a rolling counter that is incremented on each checkpoint as the key, and manually triggering compaction.

Re: Pluggable Log Compaction Policy

2016-01-20 Thread Damian Guy
Hi Bill, Have you looked at http://kafka.apache.org/documentation.html#compaction? It supports deletes, so if you keyed by transaction_id you could, in theory, delete them. Cheers, Damian On 20 January 2016 at 18:35, Bill Warshaw wrote: > Hello, > > I'm working on a team that is starting to us
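The compacted-topic delete semantics Damian points to can be modeled in memory: compaction keeps only the latest value per key, and a null value (a "tombstone") removes the key entirely. The map-based model below is our stand-in for a compacted log, not Kafka's implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Models log-compaction-with-deletes: replay records in order, keeping the
// latest value for each key; a null value acts as a tombstone that removes
// the key from the compacted view.
public class CompactionSketch {
    public static Map<String, String> compact(String[][] records) {
        Map<String, String> latest = new HashMap<>();
        for (String[] kv : records) {
            if (kv[1] == null) {
                latest.remove(kv[0]);   // tombstone: delete this key
            } else {
                latest.put(kv[0], kv[1]);
            }
        }
        return latest;
    }
}
```

Under this model, keying by transaction_id and later producing a tombstone for that id would let compaction delete the record, as Damian suggests.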