Hi Xiongqi,

Thanks for this KIP.

The name seems a bit ambiguous.  Our compaction policies are already 
time-based, after all.  It seems like this change is focused around adding a 
“max.compaction.lag.ms."  Perhaps the KIP title should be something like "add 
maximum compaction lag time"?

> The active segment is forced to roll when either "max.compaction.lag.ms"
> or "segment.ms" (log.roll.ms and log.roll.hours) has reached.

If the max.compaction.lag.ms is low, it seems like segments will be rolled very 
frequently.  This can be a source of problems in the cluster, since creating 
many different small log segments consumes a huge amount of cluster resources.  
Therefore, I would suggest adding a broker-level configuration which allows us 
to set a minimum value for max.compaction.lag.ms.  If we let users set it on a 
per-topic basis, someone could set a value of 1 ms or something, and cause 
chaos.

 > -- Note that an alternative configuration is to use -1 as "disabled" and 0 
 > as "immediate compaction". Because compaction lag is still determined 
 > based on min.compaction.lag and how long to roll an active segment,  the 
 > actual lag for compaction is undetermined if we use "0".  On the other 
 > hand, we can already set "min.cleanable.dirty.ratio" to achieve the same 
 > goal.  So here we choose "0" as "disabled".

I would prefer -1 to be the invalid setting.  Treating 0 differently than 1 
seems strange to me.

best,
Colin


On Tue, Sep 4, 2018, at 15:04, xiongqi wu wrote:
> Let's VOTE for this KIP.
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-354
> %3A+Time-based+log+compaction+policy
> 
> Implementation:
> 
> https://github.com/apache/kafka/pull/5611
> 
> 
> 
> Xiongqi (Wesley) Wu

Reply via email to