[ https://issues.apache.org/jira/browse/KAFKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106996#comment-14106996 ]
Jim Hoagland commented on KAFKA-1489:
-------------------------------------

Steven, you can list more than one directory under log.dirs; I've used as many as 10 (10 different volumes). Each directory is essentially a different partition. I think the setting should be per-directory and set at the broker level (in server.properties).

We do need to think through what actions to take when approaching the limit (a proposed disk.full.discard.policy setting). Reasonable choices to me seem to be:

* don't do anything (user doesn't want a limit to be enforced)
* scale back the retention policy for each topic by as high a percent as needed to free up a noticeable amount of space (may require trying multiple percentages)
* discard least recently used topic (for cases where the topics change over time)
* discard least recently used topic, but only if the topic follows a user-specified naming pattern
* start rejecting new messages (for cases where the limit is a hard limit but where it is not acceptable to discard data early without a human in the loop); this is a fallback case as well, and it is what we should do when a volume being written to is nearly full

It would be more work to set up, so it may not be worth it, but perhaps this could be pluggable (the user can select what class to use when hitting the limit).

> Global threshold on data retention size
> ---------------------------------------
>
>                 Key: KAFKA-1489
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1489
>             Project: Kafka
>          Issue Type: New Feature
>          Components: log
>    Affects Versions: 0.8.1.1
>            Reporter: Andras Sereny
>            Assignee: Jay Kreps
>              Labels: newbie
>
> Currently, Kafka has per topic settings to control the size of one single log
> (log.retention.bytes). With lots of topics of different volume and as they
> grow in number, it could become tedious to maintain topic level settings
> applying to a single log.
> Often, a chunk of disk space is dedicated to Kafka that hosts all logs
> stored, so it'd make sense to have a configurable threshold to control how
> much space *all* data in one Kafka log data directory can take up.
> See also:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201406.mbox/browser
> http://mail-archives.apache.org/mod_mbox/kafka-users/201311.mbox/%3c20131107015125.gc9...@jkoshy-ld.linkedin.biz%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
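
To make the pluggable discard policy from the comment more concrete, here is a rough Python sketch of how the listed choices might sit behind one selectable policy name. Nothing here is Kafka code: TopicUsage, Action, POLICIES, choose_action and the policy names are all hypothetical, and the sketch assumes (as a simplification) that shrinking retention by N percent frees N percent of the bytes in a directory.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TopicUsage:
    """Hypothetical bookkeeping for one topic in one log directory."""
    name: str
    size_bytes: int
    last_used: float  # timestamp of the most recent produce or fetch


@dataclass
class Action:
    """What the broker would be asked to do; kinds are illustrative only."""
    kind: str  # "none", "shrink_retention", "discard_topic", "reject_writes"
    topic: Optional[str] = None
    percent: Optional[int] = None


def do_nothing(topics, limit_bytes, **_):
    # The user does not want a limit enforced.
    return Action("none")


def scale_back_retention(topics, limit_bytes, step=10, **_):
    # Try increasing percentages until the directory would fit.
    # Assumption: cutting retention by p% frees p% of the stored bytes.
    total = sum(t.size_bytes for t in topics)
    for percent in range(step, 101, step):
        if total * (100 - percent) <= limit_bytes * 100:
            return Action("shrink_retention", percent=percent)
    return Action("reject_writes")  # fallback if no percentage is enough


def discard_lru_topic(topics, limit_bytes, pattern=None, **_):
    # Optionally restrict candidates to a user-specified naming pattern.
    candidates = [
        t for t in topics
        if pattern is None or re.fullmatch(pattern, t.name)
    ]
    if not candidates:
        return Action("reject_writes")  # nothing may be discarded
    victim = min(candidates, key=lambda t: t.last_used)
    return Action("discard_topic", topic=victim.name)


def reject_new_messages(topics, limit_bytes, **_):
    # Hard limit: never discard data without a human in the loop.
    return Action("reject_writes")


# Hypothetical registry keyed by the value of the proposed policy setting.
POLICIES = {
    "none": do_nothing,
    "scale_back_retention": scale_back_retention,
    "discard_lru_topic": discard_lru_topic,
    "reject_new_messages": reject_new_messages,
}


def choose_action(policy: str, topics: List[TopicUsage],
                  limit_bytes: int, **options) -> Action:
    """Apply the selected policy only when the directory exceeds its limit."""
    if sum(t.size_bytes for t in topics) <= limit_bytes:
        return Action("none")
    return POLICIES[policy](topics, limit_bytes, **options)
```

A call such as choose_action("discard_lru_topic", topics, limit_bytes, pattern=r"tmp\..*") would cover the naming-pattern variant; keeping reject_writes as the fallback inside the other policies mirrors the comment's point that rejecting new messages is the last resort when a volume is nearly full.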