[ 
https://issues.apache.org/jira/browse/KAFKA-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106996#comment-14106996
 ] 

Jim Hoagland commented on KAFKA-1489:
-------------------------------------

Steven, you can list more than one directory under log.dirs; I've used as many 
as 10 (10 different volumes).  Each directory is essentially a different 
disk partition.

I think the setting should be per-directory and set at the broker level (in 
server.properties).

We do need to think through what actions to take when approaching the limit 
(disk.full.discard.policy).  Reasonable choices to me seem to be:
* don't do anything (user doesn't want a limit to be enforced)
* scale back the retention policy for each topic by as high a percent as needed 
to free up a noticeable amount of space (may require trying multiple 
percentages)
* discard least recently used topic (for cases where the topics change over 
time)
* discard least recently used topic, but only if the topic follows a 
user-specified naming pattern
* start rejecting new messages (for cases where the limit is a hard limit but 
where it is not acceptable to discard data early without a human in the loop); 
this is a fallback case as well and what we should do when a volume being 
written to is nearly full
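
To make the "scale back the retention policy" option concrete, here is a minimal sketch of trying successive percentages until enough space would be freed. The types Segment and TopicLog, the step size, and the idea of judging segments by age are assumptions for illustration only; none of this is existing Kafka code.

```java
import java.util.List;

// Hypothetical sketch: find the smallest retention cut that frees enough space.
final class RetentionScaleBack {

    /** One log segment: its size on disk and its age. Hypothetical type. */
    static final class Segment {
        final long sizeBytes;
        final long ageMs;
        Segment(long sizeBytes, long ageMs) {
            this.sizeBytes = sizeBytes;
            this.ageMs = ageMs;
        }
    }

    /** One topic's retention window and its segments. Hypothetical type. */
    static final class TopicLog {
        final long retentionMs;
        final List<Segment> segments;
        TopicLog(long retentionMs, List<Segment> segments) {
            this.retentionMs = retentionMs;
            this.segments = segments;
        }
    }

    /** Bytes that would be freed if every topic's retention were cut by percent. */
    static long bytesFreed(List<TopicLog> logs, int percent) {
        long freed = 0;
        for (TopicLog log : logs) {
            long scaledRetentionMs = log.retentionMs * (100 - percent) / 100;
            for (Segment s : log.segments) {
                // A segment older than the scaled retention window would be deleted.
                if (s.ageMs > scaledRetentionMs) {
                    freed += s.sizeBytes;
                }
            }
        }
        return freed;
    }

    /**
     * Tries stepPercent, 2*stepPercent, ... up to 100 and returns the first
     * percent that frees at least targetBytes, or -1 if none does.
     */
    static int smallestSufficientPercent(List<TopicLog> logs, long targetBytes, int stepPercent) {
        for (int p = stepPercent; p <= 100; p += stepPercent) {
            if (bytesFreed(logs, p) >= targetBytes) {
                return p;
            }
        }
        return -1;
    }
}
```

A result of -1 would be the point at which to fall back to rejecting new messages.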

This would be more work to set up, so it may not be worth it, but perhaps this 
could be pluggable (the user selects which class to use when the limit is hit).
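
As a rough sketch of what a pluggable version might look like: the broker could read a class name from its configuration and instantiate it reflectively. The names DiskFullPolicies, Policy, Action, NoOp, RejectWrites and load are all hypothetical, invented for this illustration; they are not part of Kafka.

```java
// Hypothetical sketch of a pluggable disk-full policy; not an existing Kafka API.
final class DiskFullPolicies {

    /** What the broker should do for a data directory. */
    enum Action { NONE, REJECT_WRITES }

    /** Implemented by each policy; called when usage of a directory is checked. */
    interface Policy {
        Action onLimitApproaching(long usedBytes, long limitBytes);
    }

    /** "Don't do anything": the user does not want a limit enforced. */
    static final class NoOp implements Policy {
        public Action onLimitApproaching(long usedBytes, long limitBytes) {
            return Action.NONE;
        }
    }

    /** Hard limit: reject new messages once usage reaches the limit. */
    static final class RejectWrites implements Policy {
        public Action onLimitApproaching(long usedBytes, long limitBytes) {
            return usedBytes >= limitBytes ? Action.REJECT_WRITES : Action.NONE;
        }
    }

    /** Instantiates the policy class named in the configuration. */
    static Policy load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(Policy.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot load policy class " + className, e);
        }
    }
}
```

The other choices in the list (scaling back retention, discarding least recently used topics) would be further implementations of the same interface, with additional Action values as needed.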

> Global threshold on data retention size
> ---------------------------------------
>
>                 Key: KAFKA-1489
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1489
>             Project: Kafka
>          Issue Type: New Feature
>          Components: log
>    Affects Versions: 0.8.1.1
>            Reporter: Andras Sereny
>            Assignee: Jay Kreps
>              Labels: newbie
>
> Currently, Kafka has per topic settings to control the size of one single log 
> (log.retention.bytes). With lots of topics of different volume and as they 
> grow in number, it could become tedious to maintain topic level settings 
> applying to a single log. 
> Often, a chunk of disk space is dedicated to Kafka that hosts all logs 
> stored, so it'd make sense to have a configurable threshold to control how 
> much space *all* data in one Kafka log data directory can take up.
> See also:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201406.mbox/browser
> http://mail-archives.apache.org/mod_mbox/kafka-users/201311.mbox/%3c20131107015125.gc9...@jkoshy-ld.linkedin.biz%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
