[ https://issues.apache.org/jira/browse/KAFKA-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552653#comment-14552653 ]
Jay Kreps commented on KAFKA-2201:
----------------------------------

The comment that you see a lot of .deleted files in lsof is a little worrying. Can you confirm that you are first running ls and then running lsof, so this isn't just a timing thing (i.e. the files are deleted after the lsof)? What is the expected number of files given the number of partitions you have? Since you have configured an extremely small segment size of 1MB and a retention of 1GB, you should have ~2000 files per partition (not counting pending deletes), so if you have more than ~32 partitions (a 65k fd limit over ~2k files each) we would expect it to fail. How many partitions do you have? It's worth noting that the fd count includes connections too, so just counting files won't get you quite there.
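If it helps, one way to settle the timing question is to take a single lsof snapshot and derive every count from the same output. This is only a sketch; the pgrep pattern and the temp-file path are assumptions about this particular setup.

    # Find the broker pid (assumes the usual kafka.Kafka main class).
    PID=$(pgrep -f kafka.Kafka)

    # Exact fd count: /proc lists only real descriptors, whereas lsof output
    # also carries cwd/txt/mem rows, so piping lsof through wc overstates it.
    ls /proc/$PID/fd | wc -l

    # One snapshot, then split it, so ls-vs-lsof timing cannot skew the counts.
    lsof -p $PID > /tmp/broker-lsof.txt
    grep -c '\.deleted' /tmp/broker-lsof.txt   # renamed, awaiting async delete
    grep -c '(deleted)' /tmp/broker-lsof.txt   # already unlinked but still open
    grep -c TCP /tmp/broker-lsof.txt           # the connections mentioned above

Counting descriptors from /proc and segments from one lsof snapshot also separates file growth from connection growth, which is the distinction the partition arithmetic above depends on.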
> Open file handle leak
> ---------------------
>
>                 Key: KAFKA-2201
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2201
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.1
>         Environment: Debian Linux 7, 64 bit
>                      Oracle JDK 1.7.0u40, 64-bit
>            Reporter: Albert Visagie
>
> The Kafka broker crashes with the following stack trace in server.log roughly every 18 hours:
>
> [2015-05-19 07:39:22,924] FATAL [KafkaApi-0] Halting due to unrecoverable I/O error while handling produce request: (kafka.server.KafkaApis)
> kafka.common.KafkaStorageException: I/O exception in append to log 'nnnnnnn-1'
>         at kafka.log.Log.append(Log.scala:266)
>         at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
>         at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
>         at kafka.utils.Utils$.inLock(Utils.scala:535)
>         at kafka.utils.Utils$.inReadLock(Utils.scala:541)
>         at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)
>         at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)
>         at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>         at kafka.server.KafkaApis.appendToLocalLog(KafkaApis.scala:282)
>         at kafka.server.KafkaApis.handleProducerOrOffsetCommitRequest(KafkaApis.scala:204)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:59)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59)
>         at java.lang.Thread.run(Thread.java:724)
> Caused by: java.io.IOException: Map failed
>         at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
>         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:286)
>         at kafka.log.OffsetIndex$$anonfun$resize$1.apply(OffsetIndex.scala:276)
>         at kafka.utils.Utils$.inLock(Utils.scala:535)
>         at kafka.log.OffsetIndex.resize(OffsetIndex.scala:276)
>         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply$mcV$sp(OffsetIndex.scala:265)
>         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
>         at kafka.log.OffsetIndex$$anonfun$trimToValidSize$1.apply(OffsetIndex.scala:265)
>         at kafka.utils.Utils$.inLock(Utils.scala:535)
>         at kafka.log.OffsetIndex.trimToValidSize(OffsetIndex.scala:264)
>         at kafka.log.Log.roll(Log.scala:563)
>         at kafka.log.Log.maybeRoll(Log.scala:539)
>         at kafka.log.Log.append(Log.scala:306)
>         ... 21 more
> Caused by: java.lang.OutOfMemoryError: Map failed
>         at sun.nio.ch.FileChannelImpl.map0(Native Method)
>         at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:885)
>         ... 33 more
>
> The Kafka broker's open file handles, as seen by
>
> lsof | grep pid | wc -l
>
> grow steadily as it runs. Under our load it lasts about 18 hours before crashing with the stack trace above.
>
> We were experimenting with these settings under Log Retention Policy in server.properties:
>
> log.retention.hours=168
> log.retention.bytes=107374182
> log.segment.bytes=1073741
> log.retention.check.interval.ms=3000
>
> The result is that the broker rolls over segments quite rapidly. We don't have to run it that way, of course.
> We are running only one broker at the moment.
> lsof shows many open files with no size that carry the suffix ".deleted" and are absent from ls in the log directory.
> This is Kafka 0.8.2.1 with Scala 2.10.4 as downloaded from the website last week.
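Worth noting on the trace quoted above: the fatal error is not plain descriptor exhaustion. OutOfMemoryError: Map failed comes out of FileChannelImpl.map, i.e. an mmap call failing, and Kafka memory-maps every segment's offset index, so a rapidly rolling log consumes memory mappings as well as fds, and on Linux mappings have their own per-process ceiling. A quick check, sketched under the same assumptions as the snippet above:

    PID=$(pgrep -f kafka.Kafka)

    # Roughly one line per mapping; compare against the kernel's ceiling.
    wc -l < /proc/$PID/maps
    sysctl vm.max_map_count      # 65530 by default on most Linux kernels

    # Mappings that belong to offset indexes, i.e. scale with segment count.
    grep -c '\.index' /proc/$PID/maps

If the maps count is near vm.max_map_count when the broker dies, the leak is in mappings rather than (or in addition to) file descriptors.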