I have upgraded from Flink 1.4.0 to Flink 1.5.3 on a three-node cluster 
configured with HA. Since the upgrade, flink command-line operations time 
out. The exception is not very helpful: it reports only a timeout, not the 
cause or the operation that was being attempted:
> ./flink list -f
Waiting for response...

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.util.FlinkException: Failed to retrieve job list.
        at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:433)
        at org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:416)
        at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:960)
        at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:413)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1028)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
        at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:793)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        ... 10 more
Caused by: java.util.concurrent.TimeoutException

The web interface shows the 2 job managers and 3 task managers, and they are 
communicating with one another.
I have checked the ZooKeeper data, and it is all present.
I have tried running the command on multiple nodes, and they all give the 
same error.
I looked for a verbose or debug option for the commands but found nothing.
Any suggestions?
Thanks,
Jason
