[ 
https://issues.apache.org/jira/browse/FLINK-39923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089081#comment-18089081
 ] 

Keith Lee edited comment on FLINK-39923 at 6/15/26 10:31 AM:
-------------------------------------------------------------

The root causes of this leak are as follows:
 * A latent Flink-side bug was introduced in FLINK-24786, when a Statistics 
object was added without an explicit close() call on it. The bug stayed latent 
because cleanup relied on finalize() running to call dispose() and close().
 * The RocksDB-side finalizer was removed in 
[https://github.com/facebook/rocksdb/commit/99d86252b6514d0fe3b848bd39bda94642c14faf]
 * Flink 2.0+ uses frocksdb 8.10.0, which includes that change. The leak 
started occurring because close() is no longer called.
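
To illustrate the mechanism described above, here is a minimal, self-contained sketch. It does not use the real RocksDB classes: FakeStatistics is a hypothetical stand-in for org.rocksdb.Statistics, and the "native memory" is only a counter. It assumes each backend rebuild creates one new object, and compares dropping the reference with closing it explicitly.

```java
import java.util.concurrent.atomic.AtomicLong;

public class StatisticsCloseSketch {

    /** Simulated number of bytes currently held in native memory. */
    static final AtomicLong NATIVE_BYTES = new AtomicLong();

    /** Hypothetical stand-in for org.rocksdb.Statistics: holds native memory until close(). */
    static final class FakeStatistics implements AutoCloseable {
        static final long SIZE = 1024;
        private boolean closed;

        FakeStatistics() {
            NATIVE_BYTES.addAndGet(SIZE); // simulated native allocation
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                NATIVE_BYTES.addAndGet(-SIZE); // simulated native free
            }
        }
    }

    /** Simulates `times` backend rebuilds and returns the bytes still held afterwards. */
    static long rebuild(int times, boolean closeExplicitly) {
        NATIVE_BYTES.set(0);
        for (int i = 0; i < times; i++) {
            FakeStatistics stats = new FakeStatistics();
            // ... the backend would use stats here ...
            if (closeExplicitly) {
                stats.close();
            }
            // Otherwise the reference is simply dropped. With no finalizer,
            // nothing ever releases the simulated native allocation.
        }
        return NATIVE_BYTES.get();
    }

    public static void main(String[] args) {
        System.out.println("held without close(): " + rebuild(100, false));
        System.out.println("held with close():    " + rebuild(100, true));
    }
}
```

In this sketch the bytes held grow linearly with the number of rebuilds when close() is skipped, and stay at zero when it is called. This matches the shape of the leak reported here, though the actual fix in Flink may be structured differently.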


was (Author: JIRAUSER312715):
The root causes of this leak are as follows:

1. A latent Flink-side bug was introduced in FLINK-24786, when a Statistics 
object was added without an explicit close() call on it. The bug stayed latent 
because cleanup relied on finalize() running to call dispose() and close().
2. The RocksDB-side finalizer was removed in 
[https://github.com/facebook/rocksdb/pull/9523]
3. Flink 2.0+ uses frocksdb 8.10.0. The leak started occurring because close() 
is no longer called.

> RocksDB Statistics native memory leaks on state backend rebuild when ticker 
> metrics are enabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39923
>                 URL: https://issues.apache.org/jira/browse/FLINK-39923
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 2.0.2, 2.2.1, 2.1.3
>            Reporter: Keith Lee
>            Priority: Major
>
> When any of the 11 RocksDB ticker-type metric options is enabled, the 
> TaskManager leaks native memory in proportion to the number of keyed state 
> backend rebuilds (job restarts, rescaling, recovery cascades).
> Ticker-type metrics:
> {quote}state.backend.rocksdb.metrics.block-cache-hit
> state.backend.rocksdb.metrics.block-cache-miss
> state.backend.rocksdb.metrics.bloom-filter-useful
> state.backend.rocksdb.metrics.bloom-filter-full-positive
> state.backend.rocksdb.metrics.bloom-filter-full-true-positive
> state.backend.rocksdb.metrics.bytes-read
> state.backend.rocksdb.metrics.iter-bytes-read
> state.backend.rocksdb.metrics.bytes-written
> state.backend.rocksdb.metrics.compaction-read-bytes
> state.backend.rocksdb.metrics.compaction-write-bytes
> state.backend.rocksdb.metrics.stall-micros
> {quote}
> The issue was reproduced and confirmed: an OOMKill was observed within 80 
> seconds of submitting a continuously failing job to a Flink cluster configured 
> with a low restart delay and ticker-type metrics enabled. See here for 
> reproduction instructions and scripts: 
> [https://github.com/leekeiabstraction/flink/tree/reproduce-rocksdb-statistics-leak/reproduce-rocksdb-statistics-leak]
> The dotfile output of jeprof (jemalloc profiling needs to be enabled) points 
> to 770MB of memory allocated in RocksDB StatisticsJni.
>  
> {quote}Legend 
> [shape=box,fontsize=24,shape=plaintext,label="/proc/307/exe\lTotal B: 2855914662\lFocusing on: 2855914662\lDropped nodes with <= 14279573 abs(B)\lDropped edges with <= 2855914 B\l"];
> N1 [label="je_prof_backtrace\n0 (0.0%)\rof 2040910591 (71.5%)\r",shape=box,fontsize=8.0];
> N2 [label="je_prof_tctx_create\n0 (0.0%)\rof 2040910591 (71.5%)\r",shape=box,fontsize=8.0];
> N3 [label="prof_backtrace_impl\n2040910591 (71.5%)\r",shape=box,fontsize=50.3];
> N4 [label="je_malloc_default\n0 (0.0%)\rof 2032208910 (71.2%)\r",shape=box,fontsize=8.0];
> N5 [label="Unsafe_AllocateMemory0\n0 (0.0%)\rof 1874666648 (65.6%)\r",shape=box,fontsize=8.0];
> N6 [label="os\nmalloc@d01a60\n0 (0.0%)\rof 1874666648 (65.6%)\r",shape=box,fontsize=8.0];
> N7 [label="0x00007fb705ffd460\n0 (0.0%)\rof 1874578289 (65.6%)\r",shape=box,fontsize=8.0];
> N8 [label="Java_org_rocksdb_Statistics_newStatistics___3BJ\n0 (0.0%)\rof 807469136 (28.3%)\r",shape=box,fontsize=8.0];
> N9 [label="rocksdb\nCoreLocalArray\nCoreLocalArray\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
> N10 [label="rocksdb\nStatisticsImpl\nStatisticsImpl\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
> N11 [label="rocksdb\nStatisticsJni\nStatisticsJni\n0 (0.0%)\rof 807403520 (28.3%)\r",shape=box,fontsize=8.0];
> N12 [label="rocksdb\nport\ncacheline_aligned_alloc\n807403520 (28.3%)\r",shape=box,fontsize=34.6];
> N13 [label="0x00007fb7068b4ceb\n0 (0.0%)\rof 578879568 (20.3%)\r",shape=box,fontsize=8.0];
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
