Hi,
I am having issue where after deploying few jobs, it starts failing with below 
errors. I don't have such issue in other environments. What should I check 
first in such scenario?
My environment is
Azure Kubernetes 1.15.7
Flink 1.6.0
Zookeeper 3.4.10

The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Could not submit 
job (JobID: e83db2da358db355ccdcf6740c6bb134)
        at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:249)
        at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:486)
        at 
org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
        at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:432)
        at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:804)
        at 
org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:280)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215)
        at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1044)
        at 
org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1120)
        at 
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1120)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
        at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:379)
        at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
        at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
        at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at 
java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
        at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
        at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not 
complete the operation. Exception is not retryable.
        at 
java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
        at 
java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
        at 
java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
        at 
java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
        ... 12 more
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: 
Could not complete the operation. Exception is not retryable.
        ... 10 more
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.rest.util.RestClientException: [Job submission failed.]
        at 
java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
        at 
java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
        at 
java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
        at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
        at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
        ... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job 
submission failed.]
        at 
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:310)
        at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:294)
        at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
        ... 5 more


More errors
at java.lang.Thread.run(Thread.java:748)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
[Actor[akka://flink/user/dispatcher#-1725880087]] after [10000 ms]. 
Sender[null] sent message of type 
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
        at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
        ... 9 more
2020-03-04 08:39:06,675 ERROR 
org.apache.flink.runtime.rest.handler.cluster.ClusterOverviewHandler  - Could 
not retrieve the redirect address.
java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask 
timed out on [Actor[akka://flink/user/dispatcher#-1725880087]] after [10000 
ms]. Sender[null] sent message of type 
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:770)
        at akka.dispatch.OnComplete.internal(Future.scala:258)
        at akka.dispatch.OnComplete.internal(Future.scala:256)
        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
        at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
        at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
        at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:603)
        at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
        at 
scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
        at 
scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
        at 
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
        at 
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
        at java.lang.Thread.run(Thread.java:748)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
[Actor[akka://flink/user/dispatcher#-1725880087]] after [10000 ms]. 
Sender[null] sent message of type 
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
        at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
        ... 9 more
2020-03-04 08:39:07,676 ERROR 
org.apache.flink.runtime.rest.handler.job.JobsOverviewHandler  - Could not 
retrieve the redirect address.
java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask 
timed out on [Actor[akka://flink/user/dispatcher#-1725880087]] after [10000 
ms]. Sender[null] sent message of type 
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:770)
        at akka.dispatch.OnComplete.internal(Future.scala:258)
        at akka.dispatch.OnComplete.internal(Future.scala:256)
        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
        at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
        at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
        at 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
        at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
        at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
        at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:603)
        at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
        at 
scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
        at 
scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
        at 
scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
        at 
akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
        at 
akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
        at java.lang.Thread.run(Thread.java:748)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on 
[Actor[akka://flink/user/dispatcher#-1725880087]] after [10000 ms]. 
Sender[null] sent message of type 
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
        at 
akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)



Warm Regards,
Samir Chauhan

Regional Infrastructure & Operations

[cid:image002.png@01D12B8E.C23F3E10]

Prudential Services Singapore Pte Ltd
1 Wallich Street #19-01, Guoco Tower Singapore 078881

Direct (65) 6704 7264 Mobile (65) 9721 7548
samir.tusharbhai.chau...@prudential.com.sg<mailto:samir.tusharbhai.chau...@prudential.com.sg>

www.prudential.com.sg<http://www.prudential.com.sg/>


There's a reason we support Fair Dealing. YOU.


This email and any files transmitted with it or attached to it (the [Email]) 
may contain confidential, proprietary or legally privileged information and is 
intended solely for the use of the individual or entity to whom it is 
addressed. If you are not the intended recipient of the Email, you must not, 
directly or indirectly, copy, use, print, distribute, disclose to any other 
party or take any action in reliance on any part of the Email. Please notify 
the system manager or sender of the error and delete all copies of the Email 
immediately.  

No statement in the Email should be construed as investment advice being given 
within or outside Singapore. Prudential Assurance Company Singapore (Pte) 
Limited (PACS)  and each of its related entities shall not be responsible for 
any losses, claims, penalties, costs or damages arising from or in connection 
with the use of the Email or the information therein, in whole or in part. You 
are solely responsible for conducting any virus checks prior to opening, 
accessing or disseminating the Email.

PACS (Company Registration No. 199002477Z) is a company incorporated under the 
laws of Singapore and has its registered office at 30 Cecil Street, #30-01, 
Prudential Tower, Singapore 049712.

PACS is an indirect wholly owned subsidiary of Prudential plc of the United 
Kingdom. PACS and Prudential plc are not affiliated in any manner with 
Prudential Financial, Inc., a company whose principal place of business is in 
the United States of America.

Reply via email to