Jeff,

The KIP-58 discussion thread from a while back touches on 
"log.cleaner.min.cleanable.ratio".

KIP-58 page: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
Discussion thread (linked off that page): 
http://mail-archives.apache.org/mod_mbox/kafka-dev/201605.mbox/%3CCAAWiU2VzPXdK1fW3FacfDsVQc-1sphNMjEqtkRSHZEYaN1Wr-w%40mail.gmail.com%3E

The summary is that "log.cleaner.min.cleanable.ratio" seems like it was 
designed to limit how much disk I/O to spend on compaction. Your JIRA indicates 
you benchmarked CPU and memory, but did you look at disk I/O?
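
To make the I/O question concrete, here is a rough sketch of the trade-off. It is a simplified model, not Kafka's actual cleaner code: the function names are hypothetical, the strict ">" comparison is an assumption, and it assumes a cleaning pass has to read roughly the whole log (clean plus dirty) to reclaim the dirty portion.

```python
def dirty_ratio(clean_bytes: int, dirty_bytes: int) -> float:
    """Fraction of the log that has not been compacted yet."""
    total = clean_bytes + dirty_bytes
    return dirty_bytes / total if total else 0.0


def is_cleanable(clean_bytes: int, dirty_bytes: int,
                 min_cleanable_ratio: float = 0.5) -> bool:
    """True when the log is dirty enough to be picked for compaction.

    Assumption: the log qualifies when its dirty ratio exceeds the
    threshold; the exact comparison in Kafka may differ.
    """
    return dirty_ratio(clean_bytes, dirty_bytes) > min_cleanable_ratio


def read_amplification(clean_bytes: int, dirty_bytes: int) -> float:
    """Bytes read per dirty byte cleaned, under the rough model above.

    This works out to 1 / dirty_ratio, so a lower threshold means
    more disk I/O spent per byte of dirty data compacted.
    """
    if dirty_bytes == 0:
        raise ValueError("nothing to clean")
    return (clean_bytes + dirty_bytes) / dirty_bytes


if __name__ == "__main__":
    # A mostly clean log: 900 MB clean, 100 MB dirty.
    clean, dirty = 900, 100
    print(dirty_ratio(clean, dirty))
    print(is_cleanable(clean, dirty, min_cleanable_ratio=0.5))
    print(is_cleanable(clean, dirty, min_cleanable_ratio=0.05))
    print(read_amplification(clean, dirty))
```

Under this model, a log at a 0.1 dirty ratio costs about five times as much read I/O per dirty byte as one cleaned at 0.5, which is why a disk I/O measurement would be useful alongside the CPU and memory numbers.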

-James

> On Jul 7, 2017, at 1:24 PM, Jeff Chao <jc...@heroku.com> wrote:
> 
> Hi,
> 
> I filed a jira a few weeks ago around some log compaction ratio behavior we
> were seeing. Now that the 0.11 vote is done and the release is out, I wanted to
> follow up on it. Jira is here:
> https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-5452.
> 
> Details are in the jira, but long story short, after much testing, we were
> seeing that aggressive log compaction ratios were performing just as well
> as more conservative ratios. Fundamentally I would expect there to be some
> sort of hit, but seeing that the data shows there wasn't, we wanted to
> raise this to the rest of the community and see if anyone else has observed
> similar behavior. The motivation behind this is to see if we might consider
> changing the default from 0.5. This could help in preventing confusion
> around duplicate keys in low-volume, log-compacted topic use cases.
> 
> Thanks,
> 
> Jeff Chao
> Heroku Kafka
