[ https://issues.apache.org/jira/browse/FLINK-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475149#comment-16475149 ]
godfrey johnson commented on FLINK-9043:
----------------------------------------

[~sihuazhou] Thanks for your commits. As for {{/user_define_path/cluster_id/job_id/}}, does the user need to know the cluster_id and job_id?

> Introduce a friendly way to resume the job from externalized checkpoints automatically
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-9043
>                 URL: https://issues.apache.org/jira/browse/FLINK-9043
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: godfrey johnson
>            Assignee: Sihua Zhou
>            Priority: Major
>
> I know a Flink job can recover from a checkpoint via its restart strategy,
> but it cannot recover from a checkpoint at startup the way Spark Streaming
> jobs do.
> Every submitted Flink job is treated as a new job, whereas a Spark Streaming
> job first checks the checkpoint directory and then recovers from the latest
> successful checkpoint. Flink, however, can only recover after the job has
> failed, when it retries according to the restart strategy.
>
> So, could Flink support recovering directly from a checkpoint in a new job?
> h2. New description by [~sihuazhou]
> Currently, recovering a job from an externalized checkpoint is not very
> user-friendly: the user has to find the dedicated directory for the job,
> which is not easy when there are many jobs. This ticket intends to introduce
> a friendlier way for users to recover from an externalized checkpoint.
> The implementation steps are copied from the comments of [~StephanEwen]:
> - We could make this an option where you pass a flag (-r) to automatically
> look for the latest checkpoint in a given directory.
> - If more than one job has checkpointed there before, this operation would
> fail.
> - We might also need a way to have jobs not create the UUID subdirectory,
> otherwise the scanning for the latest checkpoint would not easily work.
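
The scanning step in the implementation notes above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's implementation: the class {{LatestCheckpointFinder}} and its method are hypothetical, and the layout {{<base-dir>/<job-dir>/chk-<n>/}} on a local file system is assumed for the example. It picks the checkpoint with the highest number and fails if more than one job has checkpointed into the directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Flink's API. */
public class LatestCheckpointFinder {

    // Assumed naming scheme for checkpoint directories: chk-<number>.
    private static final String CHECKPOINT_PREFIX = "chk-";

    /**
     * Returns the chk-<n> directory with the highest n below baseDir.
     * Fails if baseDir does not contain exactly one job directory.
     */
    public static Path findLatest(Path baseDir) throws IOException {
        // Collect the per-job subdirectories of the base directory.
        List<Path> jobDirs = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(baseDir, p -> Files.isDirectory(p))) {
            for (Path p : stream) {
                jobDirs.add(p);
            }
        }
        // More than one job checkpointed here: the choice is ambiguous, so fail.
        if (jobDirs.size() != 1) {
            throw new IllegalStateException(
                "Expected exactly one job directory under " + baseDir
                    + " but found " + jobDirs.size());
        }

        Path latest = null;
        long latestId = -1L;
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(jobDirs.get(0), CHECKPOINT_PREFIX + "*")) {
            for (Path p : stream) {
                String suffix =
                    p.getFileName().toString().substring(CHECKPOINT_PREFIX.length());
                long id;
                try {
                    id = Long.parseLong(suffix);
                } catch (NumberFormatException e) {
                    continue; // not a numbered checkpoint directory, skip it
                }
                if (Files.isDirectory(p) && id > latestId) {
                    latestId = id;
                    latest = p;
                }
            }
        }
        if (latest == null) {
            throw new IllegalStateException(
                "No checkpoint found under " + jobDirs.get(0));
        }
        return latest;
    }
}
```

The numeric comparison matters here: sorting the directory names as strings would rank {{chk-2}} above {{chk-10}}. If jobs did not create the per-job subdirectory, as the last bullet suggests, the first loop would go away and the scan would run directly on the given directory.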