log.retention.size controls the maximum total size of a log in the log dir (enforced per partition). log.file.size controls the maximum size of each individual log segment within that log.
Thanks,

Jun

On Thu, May 1, 2014 at 9:31 PM, vinh <v...@loggly.com> wrote:

> In the 0.7 docs, the descriptions for log.retention.size and log.file.size
> sound very much the same. In particular, that they apply to a single log
> file (or log segment file).
>
> http://kafka.apache.org/07/configuration.html
>
> I'm beginning to think there is no setting to control the max aggregate
> size of all logs. If this is correct, what would be a good approach to
> enforce this requirement? In my particular scenario, I have a lot of data
> being written to Kafka at a very high rate, so a 1TB disk can easily be
> filled up in 24hrs or so. One option is to add more Kafka brokers to add
> more disk space to the pool, but I'd like to avoid that and see if I can
> simply configure Kafka to not write more than 1TB aggregate. Otherwise,
> Kafka will OOM and kill itself, and possibly crash the node itself because
> the disk is full.
>
> On May 1, 2014, at 9:21 PM, vinh <v...@loggly.com> wrote:
>
> > Using Kafka 0.7.2, I have the following in server.properties:
> >
> > log.retention.hours=48
> > log.retention.size=107374182400
> > log.file.size=536870912
> >
> > My interpretation of this is:
> > a) a single log segment file over 48hrs old will be deleted
> > b) the total combined size of *all* logs is 100GB
> > c) a single log segment file is limited to 500MB in size before a new
> >    segment file is spawned
> > d) a "log file" can be composed of many "log segment files"
> >
> > But even after setting the above, I find that the total combined size
> > of all Kafka logs on disk is 200GB right now. Isn't log.retention.size
> > supposed to limit it to 100GB? Am I missing something? The docs are not
> > really clear, especially when it comes to distinguishing between a "log
> > file" and a "log segment file".
> >
> > I have disk monitoring. But like anything else in software, even
> > monitoring can fail. Via configuration, I'd like to make sure that Kafka
> > does not write more than the available disk space. Or something like
> > log4j, where I can set a max number of log files and the max size per
> > file, which essentially allows me to set a max aggregate size limit
> > across all logs.
> >
> > Thanks,
> > -Vinh
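[Editor's note] The per-partition semantics in Jun's answer suggest why the observed usage can exceed log.retention.size: the limit applies to each partition's log, not to the broker as a whole, and deletion happens at whole-segment granularity. A minimal back-of-the-envelope sketch, assuming these semantics and a hypothetical partition count (the thread does not state how many partitions the broker hosts):

```python
def max_aggregate_bytes(partitions, retention_size, segment_size):
    """Estimate worst-case total disk usage across all partition logs.

    Assumes log.retention.size is enforced per partition, and that a
    partition may briefly exceed it by up to one segment before the
    oldest segment is deleted.
    """
    return partitions * (retention_size + segment_size)

# Settings from the thread; 2 partitions is a hypothetical example.
# 100GB retention + one 512MB segment, per partition.
print(max_aggregate_bytes(2, 107374182400, 536870912))  # ~216 GB
```

Under this model, two partitions with the quoted settings would already account for roughly the 200GB Vinh observed, without any misbehavior by the broker.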