rkhachatryan commented on PR #20091:
URL: https://github.com/apache/flink/pull/20091#issuecomment-1175451454

   Regarding `TRIGGER_CHECKPOINT_FAILURE`,
   
   I **had** the following concerns, but after checking the code they turned 
out to be wrong:
   If counted as a failure:
   - "not all tasks are running" could cause it - wrong: there is a separate 
`NOT_ALL_REQUIRED_TASKS_RUNNING` constant
   - timing issues or existing concurrent checkpoints can produce it - wrong: 
they are separate constants as well
   
   If not counted as a failure (left as is):
   - if we check the `mkdirs` return value, a failed `mkdirs` would cause 
`TRIGGER_CHECKPOINT_FAILURE`, again breaking the counter - wrong: `mkdirs` is 
called during `CheckpointCoordinator` initialization
   
   Still, I'm leaning towards leaving it as is, because there are probably other 
edge cases where it occurs but shouldn't be counted as a failure, e.g. shutdown.
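
To make the trade-off concrete, here is a minimal sketch of what "leaving it as is" means for the counter. It is not the actual Flink code: `FailureCounterSketch`, its `Reason` enum, the `COUNTED` set, and the choice of `CHECKPOINT_EXPIRED` as the counted example are all assumptions made for illustration. Only `TRIGGER_CHECKPOINT_FAILURE` and `NOT_ALL_REQUIRED_TASKS_RUNNING` are names taken from the discussion above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for a failure counter; not Flink's real implementation.
public class FailureCounterSketch {

    // Illustrative subset of failure reasons (assumption, not the full Flink enum).
    public enum Reason {
        CHECKPOINT_EXPIRED,
        TRIGGER_CHECKPOINT_FAILURE,
        NOT_ALL_REQUIRED_TASKS_RUNNING
    }

    // Reasons that increment the counter. TRIGGER_CHECKPOINT_FAILURE is
    // deliberately absent, i.e. "left as is" (assumed membership).
    private static final Set<Reason> COUNTED = EnumSet.of(Reason.CHECKPOINT_EXPIRED);

    private final int tolerableFailures;
    private int count;

    public FailureCounterSketch(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    // Records a failure; returns true once the tolerated number is exceeded.
    public boolean onFailure(Reason reason) {
        if (COUNTED.contains(reason)) {
            count++;
        }
        return count > tolerableFailures;
    }

    public int count() {
        return count;
    }

    public static void main(String[] args) {
        FailureCounterSketch counter = new FailureCounterSketch(0);
        // Ignored: e.g. a trigger failure during shutdown does not fail the job.
        System.out.println(counter.onFailure(Reason.TRIGGER_CHECKPOINT_FAILURE));
        // Counted: exceeds the tolerated number of failures.
        System.out.println(counter.onFailure(Reason.CHECKPOINT_EXPIRED));
    }
}
```

Adding `TRIGGER_CHECKPOINT_FAILURE` to `COUNTED` would be the "counted as a failure" option; any edge case that maps to that constant, such as shutdown, would then also increment the counter.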
   


