Hello all,

We are trying to run a Flink job in standalone mode using the official Docker image on Kubernetes. Following the documentation <https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#advanced-customization>, we created a custom Docker image that extends the official image and performs some pre-start actions. It then runs `exec /docker-entrypoint.sh standalone-job "$1"` to start the JobManager. We have verified that flink-conf.yaml is present at the expected path, i.e. `$FLINK_HOME/conf/flink-conf.yaml`, and we set JOB_MANAGER_RPC_ADDRESS from the pod IP.
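
For context, here is a minimal sketch of the kind of wrapper entrypoint described above. It is not our exact script: the `POD_IP` variable (assumed to be injected through the Kubernetes Downward API), the empty `pre_start` placeholder, and the `FLINK_ENTRYPOINT` override are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper entrypoint for an image that extends the official Flink image.

# Assumption: POD_IP is injected via the Downward API (status.podIP).
# Falls back to the first address reported by `hostname -i` when it is unset.
resolve_rpc_address() {
    if [ -n "${POD_IP:-}" ]; then
        printf '%s\n' "$POD_IP"
    else
        hostname -i | awk '{print $1}'
    fi
}

# Placeholder for the pre-start actions mentioned above (intentionally empty here).
pre_start() {
    :
}

pre_start

# The official entrypoint reads this variable when it configures the JobManager.
JOB_MANAGER_RPC_ADDRESS="$(resolve_rpc_address)"
export JOB_MANAGER_RPC_ADDRESS

# FLINK_ENTRYPOINT is a hypothetical override so the script can be tried outside the image.
ENTRYPOINT="${FLINK_ENTRYPOINT:-/docker-entrypoint.sh}"
if [ -x "$ENTRYPOINT" ]; then
    # Hand control to the official entrypoint, as in the command quoted above.
    exec "$ENTRYPOINT" standalone-job "$1"
else
    echo "entrypoint $ENTRYPOINT not found; skipping exec" >&2
fi
```

The `exec` replaces the wrapper shell with the official entrypoint, so the JobManager process receives container signals directly.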
We submit the job from the application's main thread using `StreamExecutionEnvironment#executeAsync`. During submission we consistently get an AskTimeoutException from `DispatcherGateway#submitJob` (see logs below). Based on earlier answers on the mailing lists and in issues, we tried increasing "web.timeout" and "akka.ask.timeout", but neither helped. It looks as if the timeout used for this particular future is hardcoded somewhere in the code. It would be great if someone could give us some pointers on what we are missing or what we should check.

Error logs:

Caused by: java.util.concurrent.TimeoutException: Invocation of public abstract java.util.concurrent.CompletableFuture org.apache.flink.runtime.dispatcher.DispatcherGateway.submitJob(org.apache.flink.runtime.jobgraph.JobGraph,org.apache.flink.api.common.time.Time) timed out.
    at org.apache.flink.runtime.rpc.akka.$Proxy31.submitJob(Unknown Source) ~[?:1.13.2]
    at org.apache.flink.client.deployment.application.executors.EmbeddedExecutor.lambda$submitJob$6(EmbeddedExecutor.java:183) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(Unknown Source) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(Unknown Source) ~[?:?]
    at java.util.concurrent.CompletableFuture.complete(Unknown Source) ~[?:?]
    at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown Source) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown Source) ~[?
. . . . .
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/rpc/dispatcher_1#2019478781]] after [60000 ms]. Message of type [org.apache.flink.runtime.rpc.messages.LocalFencedMessage]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
    at akka.pattern.PromiseActorRef$.$anonfun$defaultOnTimeout$1(AskSupport.scala:635) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:650) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868) ~[flink-dist_2.12-1.13.2.jar:1.13.2]
    at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328) ~[flink.jar:?]
    at akka.actor.LightArrayRevolverScheduler$$anon$3.executeBucket$1(LightArrayRevolverScheduler.scala:279) ~[flink.jar:?]
    at akka.actor.LightArrayRevolverScheduler$$anon$3.nextTick(LightArrayRevolverScheduler.scala:283) ~[flink.jar:?]

- Dhanesh Arole