Hi,

It's hard for me to guess what the problem could be. The same error was reported a couple of months ago [1], but frankly there is no additional information in that thread.
Can we start by looking at the full TaskManager and JobManager logs? Could you share them with us?

Best,
Piotrek

[1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-1-11-job-stop-with-save-point-timeout-error-td36895.html

On Fri, 11 Dec 2020 at 14:04, Folani <hamidreza.ark...@irisa.fr> wrote:

> I'm deploying a standalone Flink cluster on top of Kubernetes and using
> MinIO as a S3 backend. I mainly follow the instructions in flink's website.
> I use the following command to run my job in Flink:
>
> $flink run -d -m <IP>:<port> -j job.jar
>
> I also have added to flink-configmap.yaml the followings:
>
> state.backend: filesystem
> state.checkpoints.dir: s3://state/checkpoints
> state.savepoints.dir: s3://state/savepoints
> s3.path-style-access: true
> s3.endpoint: http://minio-service:9000
> s3.access-key: *******
> s3.secret-key: *******
>
> It seems that everything is working well. The job is submitted correctly,
> the checkpoints are written in minio, but when I try to cancel the job or
> stop it with savepoints I get the following exception:
>
> org.apache.flink.util.FlinkException: Could not stop with a savepoint job "5ae191ca2b239ec7771e4c7a9a336537".
>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:495)
>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:864)
>     at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:487)
>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:931)
>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992)
>     at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992)
> Caused by: java.util.concurrent.TimeoutException
>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:493)
>     ... 6 more
>
> This is my command to stop with savepoints:
>
> $flink stop -p <JobID>
>
> And my Flink version is flink-1.11.2-bin-scala_2.11.
>
> What could be the reason of the exception? Any suggestion?