[ https://issues.apache.org/jira/browse/KAFKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Warshaw updated KAFKA-3224:
--------------------------------
    Description:
One of Kafka's officially-described use cases is a distributed commit log (http://kafka.apache.org/documentation.html#uses_commitlog). In this case, for a distributed service that needed a commit log, there would be a topic with a single partition to guarantee log order. This service would use the commit log to re-sync failed nodes. Kafka is generally an excellent fit for such a system, but it does not expose an adequate mechanism for log cleanup in such a case. With a distributed commit log, data can only be deleted when the client application determines that it is no longer needed; this creates completely arbitrary ranges of time and size for messages, which the existing cleanup mechanisms can't handle smoothly.
A new deletion policy based on the absolute timestamp of a message would work perfectly for this case. The client application will periodically update the minimum timestamp of messages to retain, and Kafka will delete all messages earlier than that timestamp using the existing log cleaner thread mechanism.
This is based off of the work being done in KIP-32 - Add timestamps to Kafka message.
h3. Initial Approach
https://github.com/apache/kafka/commit/2c51ae3cead99432ebf19f0303f8cc797723c939

  was:
One of Kafka's officially-described use cases is a distributed commit log (http://kafka.apache.org/documentation.html#uses_commitlog). In this case, for a distributed service that needed a commit log, there would be a topic with a single partition to guarantee log order. This service would use the commit log to re-sync failed nodes. Kafka is generally an excellent fit for such a system, but it does not expose an adequate mechanism for log cleanup in such a case.
With a distributed commit log, data can only be deleted when the client application determines that it is no longer needed; this creates completely arbitrary ranges of time and size for messages, which the existing cleanup mechanisms can't handle smoothly.
A new deletion policy based on the absolute timestamp of a message would work perfectly for this case. The client application will periodically update the minimum timestamp of messages to retain, and Kafka will delete all messages earlier than that timestamp using the existing log cleaner thread mechanism.
This is based off of the work being done in KIP-32 - Add timestamps to Kafka message.

> Add timestamp-based log deletion policy
> ---------------------------------------
>
>                 Key: KAFKA-3224
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3224
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Bill Warshaw
>              Labels: kafka
>
> One of Kafka's officially-described use cases is a distributed commit log
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case,
> for a distributed service that needed a commit log, there would be a topic
> with a single partition to guarantee log order. This service would use the
> commit log to re-sync failed nodes. Kafka is generally an excellent fit for
> such a system, but it does not expose an adequate mechanism for log cleanup
> in such a case. With a distributed commit log, data can only be deleted when
> the client application determines that it is no longer needed; this creates
> completely arbitrary ranges of time and size for messages, which the existing
> cleanup mechanisms can't handle smoothly.
> A new deletion policy based on the absolute timestamp of a message would work
> perfectly for this case. The client application will periodically update the
> minimum timestamp of messages to retain, and Kafka will delete all messages
> earlier than that timestamp using the existing log cleaner thread mechanism.
> This is based off of the work being done in KIP-32 - Add timestamps to Kafka
> message.
> h3. Initial Approach
> https://github.com/apache/kafka/commit/2c51ae3cead99432ebf19f0303f8cc797723c939
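
For readers who want a concrete picture of the proposed policy, the following is a minimal Python sketch of the idea described in the issue. It is not the code from the linked commit, and nothing in it is a Kafka API: the names `Message`, `Segment`, `delete_before` and `min_retain_ts_ms` are hypothetical. It also assumes that deletion happens at whole-segment granularity and that the active (last) segment is never removed, which the issue text itself does not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    offset: int
    timestamp_ms: int  # per-message timestamp, as proposed in KIP-32


@dataclass
class Segment:
    messages: List[Message] = field(default_factory=list)

    def max_timestamp_ms(self) -> int:
        # Newest timestamp in the segment; only called on non-empty segments.
        return max(m.timestamp_ms for m in self.messages)


def delete_before(segments: List[Segment], min_retain_ts_ms: int) -> List[Segment]:
    """Return the segments that survive a cleanup pass.

    Assumption (not from the issue): a closed segment is dropped only when it
    is empty or every message in it is older than the cutoff, and the last
    segment is treated as active and always kept.
    """
    if not segments:
        return []
    *closed, active = segments
    kept = [
        s for s in closed
        if s.messages and s.max_timestamp_ms() >= min_retain_ts_ms
    ]
    return kept + [active]


if __name__ == "__main__":
    log = [
        Segment([Message(0, 1000), Message(1, 2000)]),
        Segment([Message(2, 3000), Message(3, 4000)]),
        Segment([Message(4, 5000)]),  # active segment
    ]
    # The client application advances the cutoff as it finishes with data.
    remaining = delete_before(log, min_retain_ts_ms=3500)
    print([m.offset for s in remaining for m in s.messages])
```

Under the segment-granularity assumption, a message slightly older than the cutoff can survive if it shares a segment with newer messages, so the cutoff acts as a lower bound on what is retained rather than an exact boundary.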