Thanks for the reply. Here is an updated exception with DEBUG on. It appears to be timing out:
2021-05-05 16:56:19,700 DEBUG org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] - Setting namespace of Kubernetes client to cmdaa
2021-05-05 16:56:19,700 DEBUG org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] - Setting max concurrent requests of Kubernetes client to 64
2021-05-05 16:56:20,176 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Retrieve flink cluster flink-jobmanager successfully, JobManager Web Interface: http://10.43.0.1:30081
2021-05-05 16:56:20,239 INFO org.apache.flink.client.cli.CliFrontend [] - Waiting for response...
2021-05-05 17:02:09,605 ERROR org.apache.flink.client.cli.CliFrontend [] - Error while running the command.
org.apache.flink.util.FlinkException: Failed to retrieve job list.
    at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:449) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:430) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1060) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) [flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132) [flink-dist_2.12-1.13.0.jar:1.13.0]
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not complete the operation. Number of retries has been exhausted.
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:386) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_292]
    at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:430) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:263) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: java.util.concurrent.CompletionException: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: /10.43.0.1:30081
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_292]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_292]
    at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:430) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:263) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException: connection timed out: /10.43.0.1:30081
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]

On Wed, May 5, 2021 at 6:59 AM Robert Metzger <rmetz...@apache.org> wrote:

> Hi,
> can you check the client log in the "log/" directory?
> The Flink client will try to access the K8s API server to retrieve the
> endpoint of the jobmanager. For that, the pod needs to have permissions
> (through a service account) to make such calls to K8s. My hope is that the
> logs or previous messages are giving an indication into what Flink is
> trying to do.
> Can you also try running on DEBUG log level? (should be the
> log4j-cli.properties file).
>
>
>
> On Tue, May 4, 2021 at 3:17 PM Robert Cullen <cinquate...@gmail.com>
> wrote:
>
>> I have a flink cluster running in kubernetes, just the basic installation
>> with one JobManager and two TaskManagers. I want to interact with it via
>> command line from a separate container ie:
>>
>> root@flink-client:/opt/flink# ./bin/flink list --target
>> kubernetes-application -Dkubernetes.cluster-id=job-manager
>>
>> How do you interact in the same kubernetes instance via CLI (Not from the
>> desktop)?
>> This is the exception:
>>
>> ------------------------------------------------------------
>> The program finished with the following exception:
>>
>> java.lang.RuntimeException:
>> org.apache.flink.client.deployment.ClusterRetrieveException: Could not get
>> the rest endpoint of job-manager
>>     at org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:103)
>>     at org.apache.flink.kubernetes.KubernetesClusterDescriptor.retrieve(KubernetesClusterDescriptor.java:145)
>>     at org.apache.flink.kubernetes.KubernetesClusterDescriptor.retrieve(KubernetesClusterDescriptor.java:67)
>>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1001)
>>     at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)
>>     at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1060)
>>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
>>     at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
>> Caused by: org.apache.flink.client.deployment.ClusterRetrieveException:
>> Could not get the rest endpoint of job-manager
>> ... 9 more
>> root@flink-client:/opt/flink#
>>
>> --
>> Robert Cullen
>> 240-475-4490
>>
>

--
Robert Cullen
240-475-4490
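
P.S. To tell a network problem from a Flink problem, a plain TCP probe run from inside the client container can help. The following is only a minimal sketch: probe_tcp is a hypothetical helper name, not part of Flink, and the address in the comment is simply the one reported in the log above. It reports whether the port accepts a connection, refuses it, or never answers.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try to open a TCP connection and describe the outcome.

    Returns "open", "timeout", "refused", or "error: <details>".
    """
    try:
        # create_connection resolves the host and connects within `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably dropped or not routed.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or unreachable network.
        return f"error: {exc}"


# Example call, using the address from the log above:
#   print(probe_tcp("10.43.0.1", 30081))
```

A "timeout" result would match the ConnectTimeoutException in the trace and would suggest the address is not reachable from the client pod, rather than a permissions issue.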