Hi,
I have a streaming job that is written in Apache Beam and uses Flink as its
runner. The job worked as expected for about 15 hours and then started to
hit checkpointing errors. The error message looks like this:
java.lang.Exception: Could not perform checkpoint 910 for operator
Source:
@Kye, thanks for your suggestions. We are using the one-YARN-app-per-job mode,
and your point is still valid in Flink 1.10 as per the docs: it does make sense
to avoid dynamic classloading for such jobs. Also, we seemed to have enough
off-heap memory for the resources mentioned, and what turned out to be the issue was