Hi Andrey,
Yes. Setting setFailOnCheckpointingErrors(false) solved the problem.
But in the meantime I am getting this error:
2019-01-16 21:07:26,979 ERROR
org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler
- Implementation error: Unhandled exception.
org.apache.flink.runti
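For anyone landing on this thread later, here is a minimal sketch of the setting Sohi describes. The setFailOnCheckpointingErrors call is the one named in the thread; the class wrapper, the one-minute interval, and the placeholder pipeline are illustrative assumptions for a Flink 1.5.x DataStream job.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TolerantCheckpointJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Illustrative: checkpoint once per minute, as in the job discussed here.
        env.enableCheckpointing(60_000);

        // The fix described above: do not fail the task when a checkpoint
        // cannot complete (for example because its files were already removed).
        env.getCheckpointConfig().setFailOnCheckpointingErrors(false);

        // Placeholder pipeline; the real sources, transformations and sinks go here.
        env.fromElements(1, 2, 3).print();

        env.execute("tolerant-checkpoint-job");
    }
}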
Hi Sohi,
Could it be that you configured your job tasks to fail if a checkpoint fails
(streamExecutionEnvironment.getCheckpointConfig().setFailOnCheckpointingErrors(true))?
Could you send the complete job master log?
If checkpoint 470 has been subsumed by 471, it could be that its directory
is remo
Hi Sohimankotia,
You can control Flink's failure behaviour in case of a checkpoint failure
via `ExecutionConfig#setFailTaskOnCheckpointError(boolean)`. By default it
is set to true, which means that a Flink task will fail if a checkpoint
error occurs. If you set it to false, then the job won't
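For reference, a short sketch of the flag described above; the method name is the one given in this message, everything around it is illustrative. As far as I understand, it controls the same behaviour that `CheckpointConfig#setFailOnCheckpointingErrors(boolean)`, mentioned elsewhere in the thread, exposes on the checkpoint configuration.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointErrorFlagExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);

        // ExecutionConfig-level flag: defaults to true, i.e. a task fails when a
        // checkpoint error occurs; false keeps the tasks running despite the error.
        env.getConfig().setFailTaskOnCheckpointError(false);

        // Placeholder pipeline so the sketch is self-contained.
        env.fromElements("a", "b", "c").print();

        env.execute("checkpoint-error-flag-example");
    }
}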
Hi, Sohi
You can check out docs [1] and [2] to find the answer.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/restart_strategies.html
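As a concrete starting point for doc [2], here is a restart-strategy sketch; the attempt count and delay are illustrative values, not recommendations, and doc [1]'s checkpoint enabling is the same call shown in the sketches further up the thread.

import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartStrategyExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);

        // Doc [2]: restart the job up to 3 times, waiting 10 seconds between attempts.
        env.setRestartStrategy(
                RestartStrategies.fixedDelayRestart(3, Time.of(10, TimeUnit.SECONDS)));

        // Placeholder pipeline so the sketch is self-contained.
        env.fromElements(1, 2, 3).print();

        env.execute("restart-strategy-example");
    }
}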
sohim
Yes. The file got deleted.
2019-01-15 10:40:41,360 INFO FSNamesystem.audit: allowed=true ugi=hdfs
(auth:SIMPLE) ip=/192.168.3.184 cmd=delete
src=/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19
dst=null perm=null
Hi, Sohi
Seems like the checkpoint file
`hdfs:/pipeline/job/checkpoints/e9a08c0661a6c31b5af540cf352e1265/chk-470/5fb3a899-8c0f-45f6-a847-42cbb71e6d19`
did not exist for some reason. You can check the life cycle of this file
in the HDFS audit log to find out why it did not exist. Maybe the
chec
Hi,
Flink - 1.5.5
My streaming job checkpoints every minute. I am getting the following
exception.
2019-01-15 01:59:04,680 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed
checkpoint 469 for job e9a08c0661a6c31b5af540cf352e1265 (2736 bytes in 124
ms).
2019-01-15 0
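For context, the setup described in the original question would look roughly like the following. The checkpoint directory is the one visible in the logs above; the choice of FsStateBackend and the rest of the snippet are assumptions, since the question does not include the job code.

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MinutelyCheckpointJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every minute, as described in the question.
        env.enableCheckpointing(60_000);

        // Assumption: checkpoints are written to the HDFS directory seen in the logs.
        env.setStateBackend(new FsStateBackend("hdfs:///pipeline/job/checkpoints"));

        // Placeholder pipeline standing in for the real job.
        env.fromElements(1, 2, 3).print();

        env.execute("minutely-checkpoint-job");
    }
}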