[ https://issues.apache.org/jira/browse/FLINK-31089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690329#comment-17690329 ]
xiaogang zhou commented on FLINK-31089:
---------------------------------------

[~yunta] Thanks. Some background info:

jemalloc version: I updated jemalloc from 3.6.0-11 to 5.0.1.

I collected the first set of data with setPinL0FilterAndIndexBlocksInCache set to false and the Flink Kafka offset set to 2 days ago. I saw that

N4 [label="rocksdb\nUncompressBlockContentsForCompressionType\n1724255408 (63.3%)\r",shape=box,fontsize=47.8];

is the major memory consumer.

> pin L0 index in memory can lead to slow memory grow finally lead to memory beyond limit
> ---------------------------------------------------------------------------------------
>
>                 Key: FLINK-31089
>                 URL: https://issues.apache.org/jira/browse/FLINK-31089
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.16.1
>            Reporter: xiaogang zhou
>            Priority: Major
>         Attachments: image-2023-02-15-20-26-58-604.png, image-2023-02-15-20-32-17-993.png, image-2023-02-17-16-48-59-535.png
>
>
> With setPinL0FilterAndIndexBlocksInCache set to true, we can see the pinned memory keep growing (in the picture below, from 48G to 50G in about 5 hours). If we switch it to false, the pinned memory stays relatively static. In our environment, a lot of tasks restart because they exceed the memory limit and are killed by k8s.
> !image-2023-02-15-20-26-58-604.png|width=899,height=447!
>
> !image-2023-02-15-20-32-17-993.png|width=853,height=464!
> The two graphs were recorded yesterday and today, so the stream's data volume per second should not differ a lot.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)