[ https://issues.apache.org/jira/browse/FLINK-31089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690329#comment-17690329 ]
xiaogang zhou edited comment on FLINK-31089 at 2/17/23 10:34 AM:
-----------------------------------------------------------------

[~yunta] Thanks. Some background info:

jemalloc version: I updated jemalloc from 3.6.0-11 to 5.0.1.

For the first set of data I collected, setPinL0FilterAndIndexBlocksInCache was set to false and the Flink Kafka offset was set to 2 days ago. I saw:

Total: 2632053083 B
1693036063  64.3%  64.3% 1693036063  64.3% rocksdb::UncompressBlockContentsForCompressionType
 684855869  26.0%  90.3%  684855869  26.0% os::malloc@90ca90
 122444085   4.7%  95.0%  122444085   4.7% os::malloc@90cc30
  50331648   1.9%  96.9%   50331648   1.9% init
  41957115   1.6%  98.5%   41957115   1.6% rocksdb::Arena::AllocateNewBlock
  15729360   0.6%  99.1%   18924496   0.7% rocksdb::LRUCacheShard::Insert
   8388928   0.3%  99.4% 1701424991  64.6% rocksdb::BlockBasedTable::ReadFilter
   4194432   0.2%  99.6%    4194432   0.2% std::string::_Rep::_S_create
   3704419   0.1%  99.7%    3704419   0.1% readCEN
   3195135   0.1%  99.8%    3195135   0.1% rocksdb::LRUHandleTable::Resize
   2098816   0.1%  99.9%    2098816   0.1% std::vector::vector
   1065045   0.0% 100.0%    1065045   0.0% updatewindow
   1052164   0.0% 100.0%    1052164   0.0% inflateInit2_
         0   0.0% 100.0%   87035806   3.3% 0x00007fa8f4f8b366
         0   0.0% 100.0%    1053704   0.0% 0x00007fa8f4f97f59
         0   0.0% 100.0%    1053704   0.0% 0x00007fa8f4f97f67

rocksdb::UncompressBlockContentsForCompressionType is the major memory consumer.

was (Author: zhoujira86):

[~yunta] Thanks. Some background info:

jemalloc version: I updated jemalloc from 3.6.0-11 to 5.0.1.

For the first set of data I collected, setPinL0FilterAndIndexBlocksInCache was set to false and the Flink Kafka offset was set to 2 days ago.
I saw that

N4 [label="rocksdb\nUncompressBlockContentsForCompressionType\n1724255408 (63.3%)\r",shape=box,fontsize=47.8];

is the major memory consumer.

> pin L0 index in memory can lead to slow memory grow finally lead to memory beyond limit
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-31089
>                 URL: https://issues.apache.org/jira/browse/FLINK-31089
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.16.1
>            Reporter: xiaogang zhou
>            Priority: Major
>         Attachments: image-2023-02-15-20-26-58-604.png, image-2023-02-15-20-32-17-993.png, image-2023-02-17-16-48-59-535.png
>
>
> With setPinL0FilterAndIndexBlocksInCache set to true, we can see the pinned memory keep growing (in the picture below, from 48G to 50G in about 5 hours). If we switch it to false, the pinned memory stays relatively static. In our environment, a lot of tasks restart because k8s kills them for exceeding the memory limit.
> !image-2023-02-15-20-26-58-604.png|width=899,height=447!
>
> !image-2023-02-15-20-32-17-993.png|width=853,height=464!
> The two graphs were recorded yesterday and today, which means the amount of data per second in the stream should not differ a lot.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
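
As a rough cross-check of the heap profile quoted in the edited comment, the following minimal Python sketch parses a few of its rows and reports which symbol holds the largest share of the profiled bytes. The column meaning (flat bytes, flat %, running sum %, cumulative bytes, cumulative %, symbol) is an assumption based on the usual pprof/jeprof text layout, and the names `parse_profile` and `top_consumer` are hypothetical helpers, not part of any tool mentioned in the issue.

```python
import re

# A few rows copied from the profile in the comment. The column meaning
# (flat bytes, flat %, running sum %, cumulative bytes, cumulative %, symbol)
# is assumed from the usual pprof/jeprof text layout.
PROFILE = """\
Total: 2632053083 B
1693036063 64.3% 64.3% 1693036063 64.3% rocksdb::UncompressBlockContentsForCompressionType
684855869 26.0% 90.3% 684855869 26.0% os::malloc@90ca90
122444085 4.7% 95.0% 122444085 4.7% os::malloc@90cc30
50331648 1.9% 96.9% 50331648 1.9% init
41957115 1.6% 98.5% 41957115 1.6% rocksdb::Arena::AllocateNewBlock
"""

# One data row: flat bytes, two percentages, cumulative bytes, one percentage, symbol.
ROW = re.compile(r"^(\d+)\s+[\d.]+%\s+[\d.]+%\s+(\d+)\s+[\d.]+%\s+(\S+)$")


def parse_profile(text):
    """Return (total_bytes, [(symbol, flat_bytes, cum_bytes), ...])."""
    total = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Total:"):
            total = int(line.split()[1])
            continue
        match = ROW.match(line)
        if match:
            flat, cum, symbol = match.groups()
            rows.append((symbol, int(flat), int(cum)))
    return total, rows


def top_consumer(text):
    """Return (symbol, share_percent) for the row with the most flat bytes."""
    total, rows = parse_profile(text)
    symbol, flat, _ = max(rows, key=lambda row: row[1])
    return symbol, round(100.0 * flat / total, 1)


if __name__ == "__main__":
    name, share = top_consumer(PROFILE)
    print(f"{name}: {share}% of profiled bytes")
```

Recomputing the share from the raw byte counts, rather than trusting the printed percentage column, makes it easier to compare two profiles taken with different settings of setPinL0FilterAndIndexBlocksInCache.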