Hi Yang,

Thanks for the response. I will collect the JobManager logs and share them.

Is the stop command applicable only to streaming jobs? The documentation mentions it for streaming jobs only. If so, how can I handle batch jobs?

* Cancel a job with a savepoint (deprecated; use “stop” instead):
  ./bin/flink cancel -s [targetDirectory] <jobID>
* Gracefully stop a job with a savepoint (streaming jobs only):
  ./bin/flink stop [-p targetDirectory] [-d] <jobID>

Thanks,
Suchithra

From: Yang Wang <danrtsey...@gmail.com>
Sent: Thursday, December 10, 2020 11:16 AM
To: Yun Tang <myas...@live.com>
Cc: V N, Suchithra (Nokia - IN/Bangalore) <suchithra....@nokia.com>; user@flink.apache.org
Subject: Re: Flink cli Stop command exception

Maybe FLINK-16626 [1] is related. It is fixed in 1.10.1 and 1.11.

[1] https://issues.apache.org/jira/browse/FLINK-16626

Best,
Yang

Yun Tang <myas...@live.com> wrote on Thursday, December 10, 2020 at 11:06 AM:

Hi Suchithra,

Have you checked the JobManager log to see whether the savepoint was triggered and why it failed to complete?

Best,
Yun Tang

________________________________
From: V N, Suchithra (Nokia - IN/Bangalore) <suchithra....@nokia.com>
Sent: Wednesday, December 9, 2020 23:45
To: user@flink.apache.org
Subject: Flink cli Stop command exception

Hello,

I am running a streaming Flink job and was using the cancel command with a savepoint to cancel the job. From Flink 1.10 onward, the stop command should be used instead of the cancel command, but I sometimes get the error below. Please let me know what might be the issue.

{"host":"cancel1-flinkcli-jobsubmission-55tgq","level":"info","log":{"message":"The flink command to be executed is /opt/flink/bin/flink stop -p /opt/flink/share/cflkt-flink/external_pvc -d ec416bf906915e570ef53b242d3d0bb0 "},"time":"2020-09-02T12:32:19.979Z","type":"log"}
{"host":"cancel1-flinkcli-jobsubmission-55tgq","level":"info","log":{"message":"===== Submitting the Flink job "},"time":"2020-09-02T12:32:19.983Z","type":"log"}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/opt/flink/lib/flink-shaded-hadoop-2-uber-2.6.5-7.0.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.util.FlinkException: Could not stop with a savepoint job "ec416bf906915e570ef53b242d3d0bb0".
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:458)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:841)
    at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:450)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:905)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:966)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:966)
Caused by: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:456)
    ... 9 more

Thanks,
Suchithra
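
For anyone scripting around this failure: the trace above ends in a java.util.concurrent.TimeoutException raised from CompletableFuture.get inside CliFrontend, which suggests the CLI's own wait timed out; it does not by itself say what happened to the savepoint on the cluster. The following is only a minimal Python sketch for telling that case apart from other failures in captured CLI output. The helper names (root_cause, is_client_timeout) are hypothetical and not part of Flink, and the sample text is a shortened copy of the trace in this thread.

```python
import re

# Hypothetical helpers, not part of Flink: they only inspect text captured
# from a failed `flink stop` invocation.

# Matches lines such as "Caused by: java.util.concurrent.TimeoutException"
# with an optional ": message" suffix.
CAUSED_BY = re.compile(
    r"^Caused by: (?P<cls>[\w.$]+)(?:: (?P<msg>.*))?$", re.MULTILINE
)


def root_cause(cli_output):
    """Return the exception class of the last 'Caused by:' line, or None."""
    matches = list(CAUSED_BY.finditer(cli_output))
    return matches[-1].group("cls") if matches else None


def is_client_timeout(cli_output):
    """True if the innermost reported cause is a TimeoutException."""
    return root_cause(cli_output) == "java.util.concurrent.TimeoutException"


# Shortened copy of the stack trace quoted in this thread.
sample = """org.apache.flink.util.FlinkException: Could not stop with a savepoint job "ec416bf906915e570ef53b242d3d0bb0".
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:458)
Caused by: java.util.concurrent.TimeoutException
    at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
"""

if __name__ == "__main__":
    print(root_cause(sample))
    print(is_client_timeout(sample))
```

When is_client_timeout returns True, the JobManager log is still the place to check whether the savepoint was triggered and whether it completed, as suggested earlier in the thread.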