Hi Farouk,

This issue is not related to checkpoints. The JM fails to launch because
the job's user jar blob is missing on HDFS.
Does this issue happen every time? If it occurs only rarely, the file might
have been unexpectedly deleted by someone else.
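
One way to verify this (a sketch, assuming the `hdfs` CLI is available and the
blob path from the stack trace; the audit log location is an assumption and
varies by Hadoop distribution):

```shell
# Check whether the blob directory from the stack trace still exists on HDFS
hdfs dfs -ls /projects/dev/flink-recovery/default/blob/job_c9642e4287d7075b53922fba162665d0/

# If HDFS audit logging is enabled, look for delete operations on the
# recovery directory (log path is an assumption; adjust for your setup)
grep 'cmd=delete' /var/log/hadoop-hdfs/hdfs-audit.log | grep flink-recovery
```

If the blob is gone, the audit log (or a cleanup job's schedule) may show who
deleted it.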

Thanks,
Zhu Zhu

Farouk <farouk.za...@gmail.com> wrote on Thu, Aug 1, 2019 at 5:22 PM:

> Hi
>
> We have Flink running on Kubernetes with HDFS. The JM crashed for some
> reason.
>
> Has anybody already encountered an error like the one in the attached logfile?
>
> Caused by: java.lang.Exception: Cannot set up the user code libraries:
> File does not exist:
> /projects/dev/flink-recovery/default/blob/job_c9642e4287d7075b53922fba162665d0/blob_p-0debc7cbf567a71ea6c8fc3efb5855aa6617fdea-0e5d3178b26aef6112fed55559d41634
> at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
> at
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
> at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
>
> Thanks
> Farouk
>
