[ https://issues.apache.org/jira/browse/KAFKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang updated KAFKA-3224:
------------------------------
    Component/s: log

> Add timestamp-based log deletion policy
> ---------------------------------------
>
>                 Key: KAFKA-3224
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3224
>             Project: Kafka
>          Issue Type: Improvement
>          Components: log
>            Reporter: Bill Warshaw
>            Priority: Major
>              Labels: kafka
>
> One of Kafka's officially described use cases is a distributed commit log
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case,
> a distributed service that needs a commit log would use a topic with a
> single partition to guarantee log order, and would use the commit log to
> re-sync failed nodes. Kafka is generally an excellent fit for such a system,
> but it does not expose an adequate mechanism for log cleanup in such a case.
> With a distributed commit log, data can only be deleted when the client
> application determines that it is no longer needed; this creates completely
> arbitrary ranges of time and size for messages, which the existing cleanup
> mechanisms can't handle smoothly.
> A new deletion policy based on the absolute timestamp of a message would work
> perfectly for this case. The client application will periodically update the
> minimum timestamp of messages to retain, and Kafka will delete all messages
> earlier than that timestamp using the existing log cleaner thread mechanism.
> This is based on the work being done in KIP-32 - Add timestamps to Kafka
> message.
> h3. Initial Approach
> https://github.com/apache/kafka/compare/trunk...bill-warshaw:KAFKA-3224

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
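
As an illustration of the policy proposed in the issue description, the following is a minimal sketch of how a cleaner might pick what to delete once the client has published a minimum timestamp to retain. It is not taken from the linked branch and does not use Kafka's actual classes: Segment, max_timestamp_ms, and deletable_prefix are hypothetical names. The sketch also assumes that deletion happens per segment, that only a contiguous prefix of the log is removed so the remaining log has no gaps, and that the last (active) segment is never deleted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment; not Kafka's real class."""
    base_offset: int
    max_timestamp_ms: int  # largest message timestamp in this segment


def deletable_prefix(segments: List[Segment], min_retained_ms: int) -> List[Segment]:
    """Return the leading segments whose messages are all older than
    min_retained_ms. Segments are assumed to be ordered by base offset."""
    deletable = []
    # Assumption: the last segment is the active one and is always kept.
    for seg in segments[:-1]:
        # Stop at the first segment that still holds a message to retain,
        # so only a contiguous prefix of the log is ever removed.
        if seg.max_timestamp_ms >= min_retained_ms:
            break
        deletable.append(seg)
    return deletable


# Example: the client application sets the minimum retained timestamp to 2500.
log = [Segment(0, 1_000), Segment(100, 2_000), Segment(200, 3_000), Segment(300, 4_000)]
print([s.base_offset for s in deletable_prefix(log, 2_500)])  # [0, 100]
```

Comparing against each segment's maximum timestamp means a segment is dropped only when every message in it is older than the threshold, so some messages older than the minimum timestamp may survive until their whole segment qualifies.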