[ 
https://issues.apache.org/jira/browse/KAFKA-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937017#comment-14937017
 ] 

Grant Henke commented on KAFKA-2580:
------------------------------------

A few notes/questions from my initial look at the LogManager:
- All logs are loaded and (if needed) recovered at startup. When a log is 
loaded, all of its segments are loaded and any corrupted indexes are rebuilt. 
If we didn't load all logs and segments at startup, recovery and index 
rebuilds would no longer happen eagerly. Is it okay to do this lazily? 
Otherwise we may need to "roll" through the segments iteratively to keep the 
open file count down.
- Does it make sense to have a configuration that caps the number of open 
segments at a hard value? We could then use an LRU-like file handle cache, as 
Joel mentioned. However, there may be scenarios where a hard limit causes a 
lot of churn from closing and reopening files. Perhaps a defined timeout 
based on last access/use could work too?
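
To make the two options above concrete, here is a rough sketch of how a hard 
cap and an idle timeout could be combined in one handle cache. This is not 
Kafka code: the class name SegmentHandleCache, its methods, and the maxOpen 
and idleTimeoutMs parameters are all hypothetical, and the caller passes in 
the current time so the behaviour is easy to reason about. Handles are opened 
lazily on first access, which also touches on the lazy-loading question in the 
first note.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an LRU file handle cache; not part of Kafka. */
public class SegmentHandleCache {
    private static final class Entry {
        final FileChannel channel;
        long lastAccessMs;

        Entry(FileChannel channel, long nowMs) {
            this.channel = channel;
            this.lastAccessMs = nowMs;
        }
    }

    private final int maxOpen;        // hard cap on open handles (assumed config)
    private final long idleTimeoutMs; // idle time before a handle may be closed (assumed config)

    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<Path, Entry> open = new LinkedHashMap<>(16, 0.75f, true);

    public SegmentHandleCache(int maxOpen, long idleTimeoutMs) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be at least 1");
        }
        this.maxOpen = maxOpen;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    /** Returns a channel for the segment, opening it lazily on first use. */
    public synchronized FileChannel get(Path segment, long nowMs) throws IOException {
        Entry e = open.get(segment); // a hit moves the entry to the most-recent end
        if (e == null) {
            // Enforce the hard cap by closing least recently used handles first.
            while (open.size() >= maxOpen) {
                Iterator<Map.Entry<Path, Entry>> it = open.entrySet().iterator();
                Entry eldest = it.next().getValue();
                it.remove();
                eldest.channel.close();
            }
            e = new Entry(FileChannel.open(segment, StandardOpenOption.READ), nowMs);
            open.put(segment, e);
        }
        e.lastAccessMs = nowMs;
        return e.channel;
    }

    /** Closes every handle idle for longer than idleTimeoutMs; returns how many were closed. */
    public synchronized int closeIdle(long nowMs) throws IOException {
        int closed = 0;
        Iterator<Map.Entry<Path, Entry>> it = open.entrySet().iterator();
        while (it.hasNext()) {
            Entry e = it.next().getValue();
            if (nowMs - e.lastAccessMs > idleTimeoutMs) {
                it.remove();
                e.channel.close();
                closed++;
            }
        }
        return closed;
    }

    public synchronized int openCount() {
        return open.size();
    }

    public synchronized boolean isOpen(Path segment) {
        return open.containsKey(segment);
    }
}
```

The churn concern shows up directly here: with a small maxOpen and many 
segments being read in rotation, every get() closes one file and opens 
another. The sketch also ignores a problem a real implementation would have to 
solve, namely that a channel returned by get() can be closed by a later 
eviction while the caller is still using it, so some form of reference 
counting or pinning would likely be needed.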

> Kafka Broker keeps file handles open for all log files (even if its not 
> written to/read from)
> ---------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2580
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2580
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.1
>            Reporter: Vinoth Chandar
>
> We noticed this in one of our clusters where we stage logs for a longer 
> amount of time. It appears that the Kafka broker keeps file handles open even 
> for non-active (not written to or read from) files. (In fact, there are some 
> threads going back to 2013: 
> http://grokbase.com/t/kafka/users/132p65qwcn/keeping-logs-forever) 
> Needless to say, this is a problem and forces us to either artificially bump 
> up ulimit (it's already at 100K) or expand the cluster (even if we have 
> sufficient IO and everything). 
> Filing this ticket, since I couldn't find anything similar. Very interested 
> to know if there are plans to address this (given how Samza's changelog topic 
> is meant to be a persistent large state use case).  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
