Qian Chao created FLINK-21437:
---------------------------------

             Summary: Memory leak when using filesystem state backend on 
Alibaba Cloud OSS
                 Key: FLINK-21437
                 URL: https://issues.apache.org/jira/browse/FLINK-21437
             Project: Flink
          Issue Type: Bug
          Components: Runtime / State Backends
            Reporter: Qian Chao


This occurs when using the filesystem state backend and storing checkpoints on 
Alibaba Cloud OSS.

flink-conf.yaml:
{code:java}
state.backend: filesystem
state.checkpoints.dir: oss://yourBucket/checkpoints
fs.oss.endpoint: xxxxx
fs.oss.accessKeyId: xxxxx
fs.oss.accessKeySecret: xxxxx{code}
 

After the job has run for a while, memory leaks on both the JobManager and the 
TaskManager. The objects retained in the JVM heap look like this:

 
{code:java}
The class "java.io.DeleteOnExitHook", loaded by "<system class loader>", 
occupies 1,018,323,960 (96.47%) bytes. The memory is accumulated in one 
instance of "java.util.LinkedHashMap", loaded by "<system class loader>", which 
occupies 1,018,323,832 (96.47%) bytes.
{code}
 

 

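The root cause appears to be that when flink-oss-fs-hadoop uploads a file to 
OSS, AliyunOSSFileSystem creates a temporary file and calls deleteOnExit() on 
it. Each call adds the file's path to the LinkedHashSet<String> files in 
DeleteOnExitHook, and the entries are only released when the JVM exits, so the 
set keeps growing in a long-running process. (LinkedHashSet is backed by a 
LinkedHashMap, which is why the heap report above names that class.)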
{code:java}
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem::create
-> 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream::new 
-> 
dirAlloc.createTmpFileForWrite("output-", -1L, conf) 
-> 
org.apache.hadoop.fs.LocalDirAllocator::createTmpFileForWrite 
-> 
result.deleteOnExit()
{code}
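
To illustrate the pattern described above, here is a minimal sketch in plain Java. It is not the Hadoop or Flink code; the class and method names (TempFileCleanup, createLeaky, writeAndCleanUp) are hypothetical, and the upload step is only a placeholder comment. It contrasts registering every temporary file with deleteOnExit() against deleting the file explicitly once it is no longer needed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempFileCleanup {

    // Leaky pattern (hypothetical helper): every call registers the path
    // with java.io.DeleteOnExitHook, which keeps it until the JVM exits,
    // even if the file itself is deleted earlier.
    static File createLeaky() throws IOException {
        File tmp = File.createTempFile("output-", ".tmp");
        tmp.deleteOnExit();
        return tmp;
    }

    // Alternative (hypothetical helper): delete the file explicitly in a
    // finally block and never register it with deleteOnExit().
    // Returns true if the temporary file is gone afterwards.
    static boolean writeAndCleanUp(byte[] data) throws IOException {
        Path tmp = Files.createTempFile("output-", ".tmp");
        try {
            Files.write(tmp, data);
            // Placeholder: the upload of tmp would happen here.
        } finally {
            Files.deleteIfExists(tmp);
        }
        return Files.notExists(tmp);
    }

    public static void main(String[] args) throws IOException {
        File leaky = createLeaky();
        // The file is removed, but its path string stays registered.
        leaky.delete();

        boolean cleaned = writeAndCleanUp(new byte[] {1, 2, 3});
        System.out.println("explicit cleanup succeeded: " + cleaned);
    }
}
```

In a long-running JobManager or TaskManager, the first pattern accumulates one path string per temporary file, while the second leaves nothing behind. Whether a fix of this shape fits the AliyunOSSOutputStream code path is an assumption that would need to be checked against the Hadoop sources.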
 



