Tom Crayford created KAFKA-3933:
-----------------------------------

             Summary: Kafka OOM During Log Recovery Due to Leaked Native Memory
                 Key: KAFKA-3933
                 URL: https://issues.apache.org/jira/browse/KAFKA-3933
             Project: Kafka
          Issue Type: Bug
          Components: log
    Affects Versions: 0.10.0.0, 0.9.0.1, 0.8.2.2
         Environment: Linux, latest Oracle Java 8
            Reporter: Tom Crayford


Hi there. We've been tracking an issue where Kafka hits a 
java.lang.OutOfMemoryError during log recovery.
After a bunch of tracking work, we've realized we've hit an instance of a 
long-known issue: http://www.evanjones.ca/java-native-leak-bug.html

TLDR: Kafka breaks the rule "Always close GZIPInputStream and GZIPOutputStream 
since they use native memory via zlib" from that article.

As such, during broker startup, when recovering log segments that have been 
compressed with gzip, Kafka leaks `GZIPInputStream` instances all over the 
place. Our crashes during startup have this profile: the JVM heap is nearly 
empty (a few hundred MB used), but the off-heap memory is full of allocations 
caused by `Java_java_util_zip_Deflater_init` and `deflateInit2`.
This leads to broker crashes during startup. The only real mitigation is having 
*far* more memory than the broker needs to boot (which I'd guess is why folks 
haven't noticed this in production that much yet).
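
For reference, the underlying leak is easy to reproduce outside Kafka. Here's a 
minimal, self-contained sketch (the object name and payload size are ours, 
purely illustrative): each `GZIPInputStream` allocates native zlib state in its 
constructor, and if the stream is only partially read and never closed, that 
state is released only when a GC eventually runs the finalizer, which under low 
heap pressure may effectively be never.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

object GzipNativeLeakDemo extends App {
  // Build a small gzip-compressed payload once.
  val baos = new ByteArrayOutputStream()
  val gzOut = new GZIPOutputStream(baos)
  gzOut.write(Array.fill[Byte](1024)(1.toByte))
  gzOut.close()
  val compressed = baos.toByteArray

  // Repeatedly open a GZIPInputStream, read a little, and drop the reference
  // without calling close(). The heap barely grows, so GC (and hence the
  // finalizer that frees the native zlib buffers) rarely runs, while the
  // process's resident memory keeps climbing.
  while (true) {
    val in = new GZIPInputStream(new ByteArrayInputStream(compressed))
    in.read() // partial read only, mirroring what log recovery does
    // no in.close() -- native memory is leaked here
  }
}
```

With a small heap, a loop like this should show resident memory growing well 
beyond the Java heap, which is the profile described above.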

To dig into the code more (this is based on trunk): log recovery on unflushed 
segments eventually calls `LogSegment.recover`: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L172

On compressed segments, that leads to a call to `deepIterator`: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L189

That leads to a call to `CompressionFactory`: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L95
which creates a `GZIPInputStream`: 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/CompressionFactory.scala#L46

That `GZIPInputStream` is never closed anywhere, and the low *heap* pressure 
means that the finalizer on `GZIPInputStream` that deallocates the native 
buffers is never called, because GC is never triggered. Instead, we just 
exhaust the off-heap memory and then Kafka dies from an OutOfMemoryError.

Kafka *does* trigger an `inputstream.close()` call, but only when *fully* 
reading the whole input stream (see 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/message/ByteBufferMessageSet.scala#L156).
When it's performing log recovery, in 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogSegment.scala#L189
it doesn't read to the end of the stream; it reads the first offset and then 
stops iterating, so the close path is never reached.
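
To make that failure mode concrete, here's an illustrative reduction of the 
pattern (our own names and types, not the actual Kafka classes): an iterator 
that owns a `GZIPInputStream` and only closes it once the stream is exhausted, 
paired with a recovery-style caller that only looks at the first element.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.util.zip.GZIPInputStream

// Illustrative stand-in for the deep iterator: it closes its underlying
// stream only when iteration reaches the end of the compressed data.
class DeepishIterator(compressed: Array[Byte]) extends Iterator[Int] {
  private val in: InputStream =
    new GZIPInputStream(new ByteArrayInputStream(compressed))
  private var nextByte = in.read()

  def hasNext: Boolean = nextByte != -1

  def next(): Int = {
    val current = nextByte
    nextByte = in.read()
    if (nextByte == -1) in.close() // only closed on full consumption
    current
  }
}

object RecoveryStyleCaller {
  // Mirrors recovery: only the first element is read, so the close() above
  // is never reached and the native zlib state stays allocated.
  def firstElementOnly(it: Iterator[Int]): Option[Int] =
    if (it.hasNext) Some(it.next()) else None
}
```

Once `firstElementOnly` returns, nothing will close the stream until a GC 
happens to finalize it.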

This issue likely impacts `lz4` and `snappy` compressed topics in exactly the 
same way. I think (but haven't 100% verified) that it impacts all versions of 
Kafka that are supported (0.8 -> 0.10).

Fixing this seems relatively annoying, but only because of some "small matters 
of coding", nothing hugely problematic.

The main issue is that `deepIterator` only returns an `Iterator`, which doesn't 
have a `close()` method of any kind. We could create a new `ClosableIterator` 
trait and have it extend Java's `AutoCloseable` 
(https://docs.oracle.com/javase/7/docs/api/java/lang/AutoCloseable.html), then 
explicitly call `close()` everywhere we use a `deepIterator()` and don't read 
all the way to the end. Scala unfortunately doesn't seem to have a built-in 
version of Java's `try-with-resources` statement, but we can explicitly call 
`close()` everywhere perfectly happily.
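
A rough sketch of what that could look like (the names and signatures here are 
just a suggestion, not a settled design):

```scala
// A deep iterator that can release its native resources (e.g. the
// underlying GZIPInputStream) even when iteration stops early.
trait ClosableIterator[+A] extends Iterator[A] with AutoCloseable

object RecoveryUsageSketch {
  // Call sites that may stop early would then close explicitly,
  // e.g. something shaped like this during recovery:
  def firstEntry[A](iter: ClosableIterator[A]): Option[A] =
    try {
      if (iter.hasNext) Some(iter.next()) else None
    } finally {
      iter.close() // frees the native zlib state even on a partial read
    }
}
```

For non-compressed message sets, `close()` could presumably just be a no-op.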

Another (but much more hacky) solution would be to always read to the end of 
the iterator in `LogSegment.recover`, but that seems pretty bad, using far more 
resources than are needed during recovery.
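
For completeness, that hack would amount to something like this (hedged sketch; 
`deepIter` is a hypothetical local for the deep iterator in 
`LogSegment.recover`), fully decompressing every deep message set just to reach 
the existing close-on-exhaustion path:

```scala
// Hypothetical: after reading the first deep entry, keep iterating so the
// stream is fully consumed and the existing close() call finally fires.
while (deepIter.hasNext) deepIter.next()
```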

I can't think of any other reasonable solutions for now, but would love to hear 
input from the community.

We're happy doing the work of developing a patch, but thought we'd report the 
issue before starting down that path.


