[ https://issues.apache.org/jira/browse/KAFKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107844#comment-14107844 ]
Jay Kreps commented on KAFKA-1489:
----------------------------------

I agree; I think it generally makes most sense to consider dropping from the end rather than rejecting new messages. If you buy that, then you can think about this feature as being about how to choose which partition to drop the last segment from when you are over your space allocation. The two obvious ways would be (a) drop the oldest segment amongst all logs or (b) drop from the partition which is taking up the most space.

However, Jim points out a case that makes these slightly confusing: you can have different retention settings by space and time for each topic. So if you have one topic with a retention of 30 days and one topic with a retention of 1 day, then this emergency discard would always discard from the 30-day topic.

Jim's alternative actually makes some sense--assume all topics are in steady state (i.e. up against their maximum retention, be it size or time). Then you can just discard (say) 10% across the board.

So if that were the case, I think the only config you need is something like
   max.total.disk.space.bytes=12345
and we can probably just hard-code the 10% discard when you hit this limit.

> Global threshold on data retention size
> ---------------------------------------
>
>                 Key: KAFKA-1489
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1489
>             Project: Kafka
>          Issue Type: New Feature
>          Components: log
>    Affects Versions: 0.8.1.1
>            Reporter: Andras Sereny
>            Assignee: Jay Kreps
>              Labels: newbie
>
> Currently, Kafka has per topic settings to control the size of one single log
> (log.retention.bytes). With lots of topics of different volume and as they
> grow in number, it could become tedious to maintain topic level settings
> applying to a single log.
> Often, a chunk of disk space is dedicated to Kafka that hosts all logs
> stored, so it'd make sense to have a configurable threshold to control how
> much space *all* data in one Kafka log data directory can take up.
> See also:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201406.mbox/browser
> http://mail-archives.apache.org/mod_mbox/kafka-users/201311.mbox/%3c20131107015125.gc9...@jkoshy-ld.linkedin.biz%3E

--
This message was sent by Atlassian JIRA
(v6.2#6252)
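
For illustration only, here is a rough sketch of the "discard 10% across the board" idea from the comment above. It is not Kafka code. The function name, the data layout (a dict of segment sizes per partition, oldest first), and the rule of never dropping the newest segment are all assumptions made for the example. The comment does not say whether the 10% applies per partition or to the total, so the sketch assumes per partition. Because whole segments are dropped, a partition may lose somewhat more than 10%.

```python
from typing import Dict, List

# The hard-coded 10% floated in the comment (an illustrative constant).
DISCARD_PERCENT = 10


def emergency_discard(
    logs: Dict[str, List[int]],
    max_total_bytes: int,
    percent: int = DISCARD_PERCENT,
) -> Dict[str, List[int]]:
    """Pick the segments to delete per partition once the global limit is hit.

    `logs` maps a (hypothetical) partition name to its segment sizes in
    bytes, ordered oldest to newest. Returns the sizes of the segments to
    drop for each partition, oldest first. Returns an empty dict while the
    total stays within `max_total_bytes`.
    """
    total = sum(sum(segments) for segments in logs.values())
    if total <= max_total_bytes:
        return {}

    to_delete: Dict[str, List[int]] = {}
    for partition, segments in logs.items():
        # Bytes to free in this partition: `percent` of its current size.
        target = sum(segments) * percent // 100
        freed = 0
        dropped: List[int] = []
        # Assumption: the newest segment is still being written, so skip it.
        for size in segments[:-1]:
            if freed >= target:
                break
            dropped.append(size)
            freed += size
        if dropped:
            to_delete[partition] = dropped
    return to_delete
```

With two partitions of ten equal segments each and a limit below their combined size, this drops the single oldest segment from each, regardless of which topic has the longer retention.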