Lucas Brutschy created KAFKA-16086:
--------------------------------------

             Summary: Kafka Streams has RocksDB native memory leak
                 Key: KAFKA-16086
                 URL: https://issues.apache.org/jira/browse/KAFKA-16086
             Project: Kafka
          Issue Type: Bug
          Components: streams
            Reporter: Lucas Brutschy
         Attachments: image.png

The current 3.7 and trunk versions leak native memory when Kafka Streams 
runs for several hours. This will likely kill any real workload over time, 
so it should be treated as a blocker for 3.7.

This was discovered in a long-running soak test. The attached image shows 
memory consumption, which steadily approaches 100% until the JVM is killed.

Rerunning the same test with jemalloc native memory profiling shows the 
following allocations after a few hours:
 
{noformat}
(jeprof) top
Total: 13283138973 B
10296829713 77.5% 77.5% 10296829713 77.5% rocksdb::port::cacheline_aligned_alloc
2487325671 18.7% 96.2% 2487325671 18.7% rocksdb::BlockFetcher::ReadBlockContents
150937547 1.1% 97.4% 150937547 1.1% rocksdb::lru_cache::LRUHandleTable::LRUHandleTable
119591613 0.9% 98.3% 119591613 0.9% prof_backtrace_impl
47331433 0.4% 98.6% 105040933 0.8% rocksdb::BlockBasedTable::PutDataBlockToCache
32516797 0.2% 98.9% 32516797 0.2% rocksdb::Arena::AllocateNewBlock
29796095 0.2% 99.1% 30451535 0.2% Java_org_rocksdb_Options_newOptions
18172716 0.1% 99.2% 20008397 0.2% rocksdb::InternalStats::InternalStats
16032145 0.1% 99.4% 16032145 0.1% rocksdb::ColumnFamilyDescriptorJni::construct
12454120 0.1% 99.5% 12454120 0.1% std::_Rb_tree::_M_insert_unique
{noformat}
 

 

The first hypothesis is that the leak is caused by an `Options` object that 
is not released, introduced at this line:

 

[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java#L312]

 

Introduced in this PR: [https://github.com/apache/kafka/pull/14852]
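
For context on why a single unreleased `Options` object could matter: RocksDB's Java classes such as `org.rocksdb.Options` wrap memory allocated through JNI, and that memory is released reliably only when `close()` is called. The sketch below is a stand-in that does not use RocksDB at all. `FakeNativeOptions`, `outstandingAfter`, and the 1024-byte size are hypothetical names and values invented for illustration, and this is not the code in `RocksDBStore`. It only shows how a handle that is created repeatedly and never closed accumulates, while try-with-resources keeps the outstanding total at zero.

```java
import java.util.concurrent.atomic.AtomicLong;

public class NativeLeakSketch {

    // Hypothetical stand-in for a JNI-backed object such as org.rocksdb.Options.
    // The counter plays the role of memory allocated outside the Java heap.
    static final class FakeNativeOptions implements AutoCloseable {
        private final AtomicLong outstandingBytes;
        private final long size;
        private boolean closed;

        FakeNativeOptions(AtomicLong outstandingBytes, long size) {
            this.outstandingBytes = outstandingBytes;
            this.size = size;
            outstandingBytes.addAndGet(size); // simulated native allocation
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                outstandingBytes.addAndGet(-size); // simulated native free
            }
        }
    }

    // Creates `times` handles of 1024 simulated bytes each and returns how many
    // simulated bytes are still outstanding afterwards.
    static long outstandingAfter(int times, boolean closeHandles) {
        AtomicLong outstandingBytes = new AtomicLong();
        for (int i = 0; i < times; i++) {
            if (closeHandles) {
                // try-with-resources releases the handle on every iteration.
                try (FakeNativeOptions options = new FakeNativeOptions(outstandingBytes, 1024)) {
                    // ... the handle would be used here ...
                }
            } else {
                // Leaky pattern: the Java reference is dropped, close() is never called.
                FakeNativeOptions options = new FakeNativeOptions(outstandingBytes, 1024);
            }
        }
        return outstandingBytes.get();
    }

    public static void main(String[] args) {
        System.out.println("not closed: " + outstandingAfter(100, false) + " bytes outstanding");
        System.out.println("closed:     " + outstandingAfter(100, true) + " bytes outstanding");
    }
}
```

In this toy model the unclosed variant grows linearly with the number of handles created, which matches the kind of steady growth seen in a soak test; whether the `Options` object is actually the cause here still needs to be confirmed.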



