Re: flink list and flink run commands timeout

Chesnay Schepler Wed, 05 Sep 2018 00:41:52 -0700

Please enable DEBUG logging for the client and TRACE logging for the cluster.

For the client, look for log messages starting with "Sending request of"; these contain the host and port that the client sends requests to. Verify that these are correct and match the host/port that you use when accessing the web UI.

For the server, look for log messages starting with "Received request", so we can figure out whether the request at least arrives.
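
To make the two checks above less tedious, here is a minimal sketch (not part of Flink) that pulls the relevant lines out of a client log and a cluster log. The script name, the function names, and the idea of passing the two log paths on the command line are assumptions for illustration; the only things taken from this thread are the two message fragments. It matches on "contains" rather than "starts with", on the assumption that each log line carries a timestamp and logger prefix.

```python
import sys

# Message fragments named in the advice above.
CLIENT_MARKER = "Sending request of"
SERVER_MARKER = "Received request"


def find_marked_lines(lines, marker):
    """Return every line that contains `marker`, without the trailing newline."""
    return [line.rstrip("\n") for line in lines if marker in line]


def scan_log(path, marker):
    """Open a log file and return the lines that contain `marker`."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_marked_lines(handle, marker)


if __name__ == "__main__":
    # Hypothetical usage: python scan_flink_logs.py <client-log> <cluster-log>
    if len(sys.argv) == 3:
        client_log, cluster_log = sys.argv[1], sys.argv[2]
        print("Client requests sent:")
        for line in scan_log(client_log, CLIENT_MARKER):
            print("  " + line)
        print("Cluster requests received:")
        for line in scan_log(cluster_log, SERVER_MARKER):
            print("  " + line)
    else:
        print("usage: scan_flink_logs.py <client-log> <cluster-log>")
```

If the first list shows a host/port that differs from the web UI address, or the second list is empty, that narrows down where the request is getting lost.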


On 05.09.2018 00:53, Jason Kania wrote:

I have upgraded from Flink 1.4.0 to Flink 1.5.3 with a three-node cluster configured with HA. Now I am encountering an issue where the flink command line operations time out. The exception generated is very poor because it only indicates a timeout and not the reason or what it was trying to do:
>./flink list -f
Waiting for response...

------------------------------------------------------------
 The program finished with the following exception:
org.apache.flink.util.FlinkException: Failed to retrieve job list.
        at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:433)
        at org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:416)
        at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:960)
        at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:413)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1028)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Exception is not retryable.
        at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:213)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:793)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
        ... 10 more
Caused by: java.util.concurrent.TimeoutException
The web interface shows the 2 job managers and 3 task managers that are talking with one another.
I have looked at the ZooKeeper data and it is all present.
I have tried running the command on multiple nodes and they all give the same error.
I looked for a verbose or debug option for the commands but found nothing.

Suggestions on this?

Thanks,

Jason
