Steven Zhen Wu created FLINK-9693:
-------------------------------------

             Summary: possible memory leak in jobmanager retaining archived checkpoints
                 Key: FLINK-9693
                 URL: https://issues.apache.org/jira/browse/FLINK-9693
             Project: Flink
          Issue Type: Bug
          Components: JobManager, State Backends, Checkpointing
         Environment: !image.png!!image (1).png!
            Reporter: Steven Zhen Wu
         Attachments: image (1).png, image.png

First, some context about the job
* Flink 1.4.1
* embarrassingly parallel: all operators are chained together
* parallelism is over 1,000
* stateless except for the Kafka source operators. Checkpoint size is 8.4 MB.
* set "state.backend.fs.memory-threshold" so that only the jobmanager writes to S3 to checkpoint
* internal checkpoint with 10 checkpoints retained in history

Summary of the observations
* 41,567 ExecutionVertex objects retained 9+ GB of memory
* Expanded one ExecutionVertex: it seems to be storing the Kafka offsets for the source operator

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)