Not that I am aware of. This is most probably a bug. Looking at the code of the ExecutionGraph:
A job can only be cancelled when the job status is CREATED or RUNNING. If the job failed during execution, it is in state FAILED until it is RESTARTING. After the ExecutionGraph state is reset, the state is CREATED (now it's cancellable) until the job is scheduled for execution, which then fails it again. It should work if the cancelling happens right before trying to schedule it. :D

– Ufuk

> On 13 Nov 2015, at 15:07, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> Hey,
>
> Is there any other way to cancel a job besides ./bin/flink cancel jobId?
> This doesn't seem to work when a job cannot be scheduled and is retrying
> over and over again.
>
> The exception I get:
>
> 13:58:11,240 INFO org.apache.flink.runtime.jobmanager.JobManager - Status of job 0c895d22c632de5dfe16c42a9ba818d5 (player-id) changed to RESTARTING.
> 13:58:25,234 INFO org.apache.flink.runtime.jobmanager.JobManager - Trying to cancel job with ID 0c895d22c632de5dfe16c42a9ba818d5.
> 13:58:25,561 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@127.0.0.1:42012] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>
> I will open a JIRA for this, in the meantime it would still be good to
> kill it somehow.
>
> Cheers,
> Gyula
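
For illustration, the state check described at the top of this reply can be sketched roughly as follows. This is not the actual ExecutionGraph code: the class CancelSketch, its JobStatus enum, and the CANCELLING target state are hypothetical names used only to show why a cancel request that arrives while the job is RESTARTING has no effect, assuming cancel is accepted only in CREATED or RUNNING.

```java
import java.util.EnumSet;
import java.util.Set;

public class CancelSketch {
    // Hypothetical job states, named after the ones mentioned in this thread.
    enum JobStatus { CREATED, RUNNING, FAILED, RESTARTING, CANCELLING }

    // Assumption: a cancel request is accepted only in these two states.
    static final Set<JobStatus> CANCELLABLE =
            EnumSet.of(JobStatus.CREATED, JobStatus.RUNNING);

    private JobStatus state = JobStatus.CREATED;

    JobStatus getState() { return state; }

    void setState(JobStatus newState) { state = newState; }

    // Returns true if the cancel request took effect, false if it was ignored.
    boolean cancel() {
        if (CANCELLABLE.contains(state)) {
            state = JobStatus.CANCELLING;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        CancelSketch job = new CancelSketch();

        // The job failed and is waiting to be restarted: cancel is ignored.
        job.setState(JobStatus.RESTARTING);
        System.out.println("cancel while RESTARTING: " + job.cancel());

        // After the reset the job is briefly CREATED: cancel is accepted.
        job.setState(JobStatus.CREATED);
        System.out.println("cancel while CREATED: " + job.cancel());
    }
}
```

In this sketch the only window in which cancel succeeds for a job that keeps failing is the short period between the reset to CREATED and the next scheduling attempt, which matches the behaviour reported in the quoted message.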