Heap Problem with Checkpoints

Fabian Wollert Fri, 08 Jun 2018 09:32:51 -0700

Hi,

In this email thread
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-AWS-S3-integration-java-lang-NullPointerException-null-uri-host-td20413.html>
I tried to set up S3 as a filesystem backend for checkpoints. Everything is
now working (Flink 1.5.0), but the JobMaster keeps accumulating heap space
and eventually kills itself with a heap space OOM after several hours. If I
don't enable checkpointing, everything is fine. I'm using the Flink S3
shaded libs from the tutorial (I tried both the Hadoop and the Presto lib,
with no difference in this regard). My checkpoint settings (at job level)
are:


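// Trigger a checkpoint every 1000 ms.
env.enableCheckpointing(1000);
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
// Wait at least 5000 ms after a checkpoint completes before starting the next.
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5000);
// Abort a checkpoint that takes longer than 60000 ms.
env.getCheckpointConfig().setCheckpointTimeout(60000);
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);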
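
For reference, the interval (1000 ms) is shorter than the minimum pause (5000 ms), so the two settings interact. The following is only a simplified model of that interaction, not Flink's scheduler code: it assumes the next checkpoint starts at the later of "last start + interval" and "last completion + minimum pause". The class name, the helper `nextStart`, and the 200 ms checkpoint duration are hypothetical and chosen purely for illustration.

```java
// Simplified model (an assumption, not Flink's actual scheduler) of when the
// next checkpoint may start, given a checkpoint interval and a minimum pause.
final class CheckpointTimingSketch {

    /**
     * Earliest start time of the next checkpoint, in milliseconds.
     * lastStart and lastEnd are the start and completion times of the
     * previous checkpoint.
     */
    static long nextStart(long lastStart, long lastEnd, long intervalMs, long minPauseMs) {
        long byInterval = lastStart + intervalMs; // periodic trigger
        long byMinPause = lastEnd + minPauseMs;   // enforced quiet period
        return Math.max(byInterval, byMinPause);
    }

    public static void main(String[] args) {
        long intervalMs = 1000; // value passed to enableCheckpointing
        long minPauseMs = 5000; // value passed to setMinPauseBetweenCheckpoints
        long durationMs = 200;  // hypothetical checkpoint duration

        long start = 0;
        for (int i = 1; i <= 3; i++) {
            long end = start + durationMs;
            System.out.println("checkpoint " + i + ": start=" + start + " ms, end=" + end + " ms");
            start = nextStart(start, end, intervalMs, minPauseMs);
        }
    }
}
```

Under this model the minimum pause dominates, so checkpoints would start about every 5.2 s rather than every second.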

Another reason I suspect the S3 checkpointing is that the heap dump
contains a lot of char[] objects holding log output about S3 operations.
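
One way to watch the heap grow between dumps is to sample the JVM's own heap usage through the standard `java.lang.management` API. This is a minimal sketch, not something taken from the job: the class name and the helper `usedHeapMiB` are hypothetical, and it only reports on the JVM it runs in, so it would have to run inside the JobMaster process (or be replaced by a remote JMX connection) to be useful here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Minimal sketch: sample the used heap of the current JVM.
final class HeapUsageSketch {

    /** Returns the currently used heap of this JVM in mebibytes. */
    static long usedHeapMiB() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed() / (1024L * 1024L);
    }

    public static void main(String[] args) throws InterruptedException {
        // Three samples for illustration; a real check would run for hours
        // alongside the job and log at a much longer interval.
        for (int i = 0; i < 3; i++) {
            System.out.println("used heap: " + usedHeapMiB() + " MiB");
            Thread.sleep(1000);
        }
    }
}
```

A steadily rising baseline after garbage collections, seen only with checkpointing enabled, would be consistent with the behaviour described above.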

Does anyone have an idea where to look further on this?

Cheers

--


*Fabian Wollert*
*Zalando SE*

E-Mail: fabian.woll...@zalando.de

