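Thanks for the reply. Here is the updated exception with DEBUG logging
enabled. The client retrieves the cluster, then appears to time out
connecting to the JobManager REST endpoint at 10.43.0.1:30081: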

2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] -
Setting namespace of Kubernetes client to cmdaa
2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] -
Setting max concurrent requests of Kubernetes client to 64
2021-05-05 16:56:20,176 INFO
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] -
Retrieve flink cluster flink-jobmanager successfully, JobManager Web
Interface: http://10.43.0.1:30081
2021-05-05 16:56:20,239 INFO  org.apache.flink.client.cli.CliFrontend
                    [] - Waiting for response...
2021-05-05 17:02:09,605 ERROR org.apache.flink.client.cli.CliFrontend
                    [] - Error while running the command.
org.apache.flink.util.FlinkException: Failed to retrieve job list.
        at 
org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:449)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:430)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1060)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
[flink-dist_2.12-1.13.0.jar:1.13.0]
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException:
Could not complete the operation. Number of retries has been
exhausted.
        at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:386)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
~[?:1.8.0_292]
        at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:430)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:263)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: java.util.concurrent.CompletionException:
org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException:
connection timed out: /10.43.0.1:30081
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
~[?:1.8.0_292]
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
~[?:1.8.0_292]
        at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:430)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:263)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: 
org.apache.flink.shaded.netty4.io.netty.channel.ConnectTimeoutException:
connection timed out: /10.43.0.1:30081
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at 
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
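
Since the innermost cause is a ConnectTimeoutException for 10.43.0.1:30081, a plain TCP check from the client container might help tell a network-level problem apart from a Flink-level one. This is only a minimal sketch, assuming Python 3 is available in that container; the `can_connect` helper is a hypothetical name for illustration and is not part of Flink:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("10.43.0.1", 30081)` from the flink-client container should return False if the address reported in the log is not reachable from there, which would point at service or network configuration rather than at the Flink client itself.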


On Wed, May 5, 2021 at 6:59 AM Robert Metzger <rmetz...@apache.org> wrote:

> Hi,
> can you check the client log in the "log/" directory?
> The Flink client will try to access the K8s API server to retrieve the
> endpoint of the jobmanager. For that, the pod needs to have permissions
> (through a service account) to make such calls to K8s. My hope is that the
> logs or previous messages give an indication of what Flink is
> trying to do.
> Can you also try running on DEBUG log level? (should be the
> log4j-cli.properties file).
>
>
>
> On Tue, May 4, 2021 at 3:17 PM Robert Cullen <cinquate...@gmail.com>
> wrote:
>
>> I have a Flink cluster running in Kubernetes, just the basic installation
>> with one JobManager and two TaskManagers. I want to interact with it via
>> the command line from a separate container, i.e.:
>>
>> root@flink-client:/opt/flink# ./bin/flink list --target kubernetes-application -Dkubernetes.cluster-id=job-manager
>>
>> How do you interact with the cluster via the CLI from within the same
>> Kubernetes instance (not from the desktop)? This is the exception:
>>
>> ------------------------------------------------------------
>>  The program finished with the following exception:
>>
>> java.lang.RuntimeException: 
>> org.apache.flink.client.deployment.ClusterRetrieveException: Could not get 
>> the rest endpoint of job-manager
>>         at 
>> org.apache.flink.kubernetes.KubernetesClusterDescriptor.lambda$createClusterClientProvider$0(KubernetesClusterDescriptor.java:103)
>>         at 
>> org.apache.flink.kubernetes.KubernetesClusterDescriptor.retrieve(KubernetesClusterDescriptor.java:145)
>>         at 
>> org.apache.flink.kubernetes.KubernetesClusterDescriptor.retrieve(KubernetesClusterDescriptor.java:67)
>>         at 
>> org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1001)
>>         at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)
>>         at 
>> org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1060)
>>         at 
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
>>         at 
>> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>         at 
>> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
>> Caused by: org.apache.flink.client.deployment.ClusterRetrieveException: 
>> Could not get the rest endpoint of job-manager
>>         ... 9 more
>> root@flink-client:/opt/flink#
>>
>> --
>> Robert Cullen
>> 240-475-4490
>>
>

-- 
Robert Cullen
240-475-4490
