Pulling in Yang Wang who may shed some light on the matter.

You could also have a look at http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Native-kubernetes-setup-failed-to-start-job-td39066.html; while the issue was not actually resolved it may give some hints.
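
If the `java.net.UnknownHostException: franz-01.default` in the TaskManager log quoted below is the root cause, it may be worth checking whether that service name resolves from inside the cluster at all. A minimal sketch, assuming a Python 3 interpreter is available in a pod in the same namespace (the stock Flink image may not ship one); the helper name `can_resolve` is made up for illustration:

```python
import socket
import sys


def can_resolve(host: str) -> bool:
    """Return True if the local resolver maps `host` to at least one address."""
    try:
        return len(socket.getaddrinfo(host, None)) > 0
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        return False


if __name__ == "__main__":
    # Default is the service name taken from the TaskManager log;
    # other names can be passed as command-line arguments.
    hosts = sys.argv[1:] or ["franz-01.default"]
    for host in hosts:
        status = "resolves" if can_resolve(host) else "does NOT resolve"
        print(f"{host}: {status}")
```

If the name does not resolve from a pod, the cluster DNS setup (for example the DNS add-on and the search domains in the pod's /etc/resolv.conf) would be the next thing to look at.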

On 5/10/2021 4:40 PM, Valentin Wallyn wrote:
Hi,

I'm trying to use Flink on native Kubernetes (https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/), but I get an error even with the example from the documentation.

The job gets submitted but stays in "created" status until it times out after 5 minutes. In the TaskManager log, I can see the error "Could not resolve ResourceManager address".

What could be the issue?


Here are the logs:

> ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=franz-01

2021-05-10 16:05:00,392 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.address, localhost
2021-05-10 16:05:00,395 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.port, 6123
2021-05-10 16:05:00,395 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.memory.process.size, 1600m
2021-05-10 16:05:00,395 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.memory.process.size, 1728m
2021-05-10 16:05:00,395 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2021-05-10 16:05:00,395 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: parallelism.default, 1
2021-05-10 16:05:00,396 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.execution.failover-strategy, region
2021-05-10 16:05:00,432 INFO  org.apache.flink.client.deployment.DefaultClusterClientServiceLoader [] - Could not load factory due to missing dependencies.
2021-05-10 16:05:02,680 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-05-10 16:05:02,690 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-05-10 16:05:02,699 INFO  org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124
2021-05-10 16:05:02,700 INFO  org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122
2021-05-10 16:05:02,760 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2021-05-10 16:05:05,440 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor [] - Create flink session cluster franz-01 successfully, JobManager Web Interface: http://xxx:8081


Task Manager logs:
2021-05-10 14:09:05,463 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] - taskmanager.memory.framework.off-heap.size=134217728b
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.network.max=134217730b
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.network.min=134217730b
2021-05-10 14:09:05,464 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.framework.heap.size=134217728b
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.managed.size=536870920b
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.cpu.cores=1.0
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,465 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.task.heap.size=402653174b
2021-05-10 14:09:05,466 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,466 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.task.off-heap.size=0b
2021-05-10 14:09:05,466 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,467 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.jvm-metaspace.size=268435456b
2021-05-10 14:09:05,467 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,467 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.jvm-overhead.max=201326592b
2021-05-10 14:09:05,470 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -D
2021-05-10 14:09:05,470 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     taskmanager.memory.jvm-overhead.min=201326592b
2021-05-10 14:09:05,470 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     --configDir
2021-05-10 14:09:05,470 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     /opt/flink/conf
2021-05-10 14:09:05,470 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] - -Dtaskmanager.resource-id=franz-01-taskmanager-1-1
2021-05-10 14:09:05,471 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -Djobmanager.memory.off-heap.size=134217728b
2021-05-10 14:09:05,471 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -Djobmanager.memory.jvm-overhead.min=201326592b
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] - -Dweb.tmpdir=/tmp/flink-web-e60a7b21-4e2b-4b6c-a0ac-5b08816edcee
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -Djobmanager.memory.jvm-metaspace.size=268435456b
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -Djobmanager.memory.heap.size=1073741824b
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -     -Djobmanager.memory.jvm-overhead.max=201326592b
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] -  Classpath: /opt/flink/lib/flink-csv-1.12.2.jar:/opt/flink/lib/flink-json-1.12.2.jar:/opt/flink/lib/flink-shaded-zookeeper-3.4.14.jar:/opt/flink/lib/flink-table-blink_2.12-1.12.2.jar:/opt/flink/lib/flink-table_2.12-1.12.2.jar:/opt/flink/lib/log4j-1.2-api-2.12.1.jar:/opt/flink/lib/log4j-api-2.12.1.jar:/opt/flink/lib/log4j-core-2.12.1.jar:/opt/flink/lib/log4j-slf4j-impl-2.12.1.jar:/opt/flink/lib/flink-dist_2.12-1.12.2.jar:::
2021-05-10 14:09:05,472 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] - --------------------------------------------------------------------------------
2021-05-10 14:09:05,475 INFO  org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner [] - Registered UNIX signal handlers for [TERM, HUP, INT]
2021-05-10 14:09:05,510 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: blob.server.port, 6124
2021-05-10 14:09:05,511 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.memory.process.size, 1728m
2021-05-10 14:09:05,511 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: kubernetes.internal.jobmanager.entrypoint.class, org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint
2021-05-10 14:09:05,513 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.execution.failover-strategy, region
2021-05-10 14:09:05,514 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.address, franz-01.default
2021-05-10 14:09:05,514 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: execution.target, kubernetes-session
2021-05-10 14:09:05,515 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.memory.process.size, 1600m
2021-05-10 14:09:05,516 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: jobmanager.rpc.port, 6123
2021-05-10 14:09:05,516 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: kubernetes.cluster-id, franz-01
2021-05-10 14:09:05,516 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.rpc.port, 6122
2021-05-10 14:09:05,517 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: internal.cluster.execution-mode, NORMAL
2021-05-10 14:09:05,517 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: parallelism.default, 1
2021-05-10 14:09:05,519 INFO  org.apache.flink.configuration.GlobalConfiguration [] - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2021-05-10 14:09:05,658 INFO  org.apache.flink.core.fs.FileSystem [] - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2021-05-10 14:09:05,733 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory [] - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2021-05-10 14:09:05,738 INFO  org.apache.flink.runtime.security.modules.JaasModule [] - Jaas file will be created as /tmp/jaas-3361029581556571704.conf.
2021-05-10 14:09:05,744 INFO  org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory [] - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2021-05-10 14:09:05,811 INFO  org.apache.flink.configuration.Configuration [] - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2021-05-10 14:09:05,855 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils [] - Trying to select the network interface and address to use by connecting to the leading JobManager.
2021-05-10 14:09:05,855 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils [] - TaskManager will try to connect for PT10S before falling back to heuristics
2021-05-10 14:09:26,116 WARN  org.apache.flink.runtime.net.ConnectionUtils [] - Could not connect to franz-01.default:6123. Selecting a local address using heuristics.
2021-05-10 14:09:26,116 WARN  org.apache.flink.runtime.net.ConnectionUtils [] - Could not find any IPv4 address that is not loopback or link-local. Using localhost address.
2021-05-10 14:09:26,117 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - TaskManager will use hostname/address 'franz-01-taskmanager-1-1' (10.2.2.37) for communication.
2021-05-10 14:09:26,136 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Trying to start actor system, external address 10.2.2.37:6122, bind address 0.0.0.0:6122.
2021-05-10 14:09:27,212 INFO  akka.event.slf4j.Slf4jLogger [] - Slf4jLogger started
2021-05-10 14:09:27,283 INFO  akka.remote.Remoting [] - Starting remoting
2021-05-10 14:09:27,586 INFO  akka.remote.Remoting [] - Remoting started; listening on addresses :[akka.tcp://flink@10.2.2.37:6122]
2021-05-10 14:09:27,730 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Actor system started at akka.tcp://flink@10.2.2.37:6122
2021-05-10 14:09:27,781 INFO  org.apache.flink.runtime.metrics.MetricRegistryImpl [] - No metrics reporter configured, no metrics will be exposed/reported.
2021-05-10 14:09:27,786 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Trying to start actor system, external address 10.2.2.37:0, bind address 0.0.0.0:0.
2021-05-10 14:09:27,814 INFO  akka.event.slf4j.Slf4jLogger [] - Slf4jLogger started
2021-05-10 14:09:27,819 INFO  akka.remote.Remoting [] - Starting remoting
2021-05-10 14:09:27,881 INFO  akka.remote.Remoting [] - Remoting started; listening on addresses :[akka.tcp://flink-metrics@10.2.2.37:39177]
2021-05-10 14:09:27,895 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Actor system started at akka.tcp://flink-metrics@10.2.2.37:39177
2021-05-10 14:09:27,916 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService at akka://flink-metrics/user/rpc/MetricQueryService_franz-01-taskmanager-1-1 .
2021-05-10 14:09:27,931 INFO  org.apache.flink.runtime.blob.PermanentBlobCache [] - Created BLOB cache storage directory /tmp/blobStore-16255e13-c39a-442f-853a-cd1e331e7325
2021-05-10 14:09:27,934 INFO  org.apache.flink.runtime.blob.TransientBlobCache [] - Created BLOB cache storage directory /tmp/blobStore-5ac02374-808a-4529-b80c-088dbeac2711
2021-05-10 14:09:27,955 INFO  org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled external resources: []
2021-05-10 14:09:27,955 INFO  org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled external resources: []
2021-05-10 14:09:27,955 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - Starting TaskManager with ResourceID: franz-01-taskmanager-1-1
2021-05-10 14:09:27,990 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices [] - Temporary file directory '/tmp': total 48 GB, usable 38 GB (79.17% usable)
2021-05-10 14:09:27,997 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] - FileChannelManager uses directory /tmp/flink-io-c08780a7-90bd-4259-8f51-8a24d95c21df for spill files.
2021-05-10 14:09:28,059 INFO  org.apache.flink.runtime.io.network.netty.NettyConfig [] - NettyConfig [server address: /0.0.0.0, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: AUTO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2021-05-10 14:09:28,063 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] - FileChannelManager uses directory /tmp/flink-netty-shuffle-209ae6cc-6fd5-4c9e-b6df-acc675a6995c for spill files.
2021-05-10 14:09:28,578 INFO  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] - Allocated 128 MB for network buffer pool (number of memory segments: 4096, bytes per segment: 32768).
2021-05-10 14:09:28,594 INFO  org.apache.flink.runtime.io.network.NettyShuffleEnvironment [] - Starting the network environment and its components.
2021-05-10 14:09:28,789 INFO  org.apache.flink.runtime.io.network.netty.NettyClient [] - Transport type 'auto': using EPOLL.
2021-05-10 14:09:28,791 INFO  org.apache.flink.runtime.io.network.netty.NettyClient [] - Successful initialization (took 196 ms).
2021-05-10 14:09:28,796 INFO  org.apache.flink.runtime.io.network.netty.NettyServer [] - Transport type 'auto': using EPOLL.
2021-05-10 14:09:28,892 INFO  org.apache.flink.runtime.io.network.netty.NettyServer [] - Successful initialization (took 99 ms). Listening on SocketAddress /0:0:0:0:0:0:0:0%0:40399.
2021-05-10 14:09:28,894 INFO  org.apache.flink.runtime.taskexecutor.KvStateService [] - Starting the kvState service and its components.
2021-05-10 14:09:28,979 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/rpc/taskmanager_0 .
2021-05-10 14:09:29,002 INFO  org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job leader service.
2021-05-10 14:09:29,005 INFO  org.apache.flink.runtime.filecache.FileCache [] - User file cache uses directory /tmp/flink-dist-cache-bc340200-15c9-4d0a-950a-f43469bdb58d
2021-05-10 14:09:29,055 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Connecting to ResourceManager akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000).
2021-05-10 14:09:29,276 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@franz-01.default:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@franz-01.default:6123]] Caused by: [java.net.UnknownHostException: franz-01.default]
2021-05-10 14:09:29,289 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:09:49,314 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:09:59,325 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@franz-01.default:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@franz-01.default:6123]] Caused by: [java.net.UnknownHostException: franz-01.default: Temporary failure in name resolution]
2021-05-10 14:09:59,327 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:19,365 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:29,363 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@franz-01.default:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@franz-01.default:6123]] Caused by: [java.net.UnknownHostException: franz-01.default: Temporary failure in name resolution]
2021-05-10 14:10:29,385 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:49,425 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:59,423 WARN  akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink@franz-01.default:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@franz-01.default:6123]] Caused by: [java.net.UnknownHostException: franz-01.default: Temporary failure in name resolution]
2021-05-10 14:10:59,445 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.


Job Manager logs:
2021-05-10 14:09:00,393 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Received JobGraph submission a63f806ba9a172b728395266a6dc41fe (Flink Streaming Job).
2021-05-10 14:09:00,395 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Submitting job a63f806ba9a172b728395266a6dc41fe (Flink Streaming Job).
2021-05-10 14:09:00,524 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/rpc/jobmanager_2 .
2021-05-10 14:09:00,537 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,612 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart back off time strategy NoRestartBackoffTimeStrategy for Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,665 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Running initialization on master for job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,666 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Successfully ran initialization on master in 0 ms.
2021-05-10 14:09:00,707 INFO  org.apache.flink.runtime.scheduler.adapter.DefaultExecutionTopology [] - Built 1 pipelined regions in 15 ms
2021-05-10 14:09:00,742 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - No state backend has been configured, using default (Memory / JobManager) MemoryStateBackend (data in heap memory / checkpoints to JobManager) (checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 5242880)
2021-05-10 14:09:00,823 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No checkpoint found during restore.
2021-05-10 14:09:00,830 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Using failover strategy org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy@43519311 for Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,844 INFO  org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl [] - JobManager runner for job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe) was granted leadership with session id 00000000-0000-0000-0000-000000000000 at akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2.
2021-05-10 14:09:00,848 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Starting execution of job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe) under job master id 00000000000000000000000000000000.
2021-05-10 14:09:00,851 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Starting scheduling with scheduling strategy [org.apache.flink.runtime.scheduler.strategy.PipelinedRegionSchedulingStrategy]
2021-05-10 14:09:00,852 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe) switched from state CREATED to RUNNING.
2021-05-10 14:09:00,912 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Filter -> Timestamps/Watermarks (1/1) (b10791bc97d1d772bd443abd92bf32c0) switched from CREATED to SCHEDULED.
2021-05-10 14:09:00,913 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(ProcessingTimeSessionWindows(60000), ProcessingTimeTrigger, SessionAggregate, PassThroughWindowFunction) -> Sink: Unnamed (1/1) (9ee57af7f96b318d16fb0784a693b481) switched from CREATED to SCHEDULED.
2021-05-10 14:09:00,928 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{045786d740e63f4a986dc2024be7b3fc}]
2021-05-10 14:09:00,939 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Connecting to ResourceManager akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000)
2021-05-10 14:09:00,947 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - Resolved ResourceManager address, beginning registration
2021-05-10 14:09:00,949 INFO  org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Registering job manager 00000000000000000000000000000...@akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2 for job a63f806ba9a172b728395266a6dc41fe.
2021-05-10 14:09:01,009 INFO  org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Registered job manager 00000000000000000000000000000...@akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2 for job a63f806ba9a172b728395266a6dc41fe.
2021-05-10 14:09:01,016 INFO  org.apache.flink.runtime.jobmaster.JobMaster [] - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
2021-05-10 14:09:01,018 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Requesting new slot [SlotRequestId{045786d740e63f4a986dc2024be7b3fc}] and profile ResourceProfile{UNKNOWN} with allocation id be6a056136c6dec065af876bda1f6dd5 from resource manager.
2021-05-10 14:09:01,020 INFO  org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Request slot with profile ResourceProfile{UNKNOWN} for job a63f806ba9a172b728395266a6dc41fe with allocation id be6a056136c6dec065af876bda1f6dd5.
2021-05-10 14:09:01,029 INFO  org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Requesting new worker with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=384.000mb (402653174 bytes), taskOffHeapSize=0 bytes, networkMemSize=128.000mb (134217730 bytes), managedMemSize=512.000mb (536870920 bytes)}, current pending count: 1.
2021-05-10 14:09:01,035 INFO  org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled external resources: []
2021-05-10 14:09:01,414 INFO  org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] - Creating new TaskManager pod with name franz-01-taskmanager-1-1 and resource <1728,1.0>.
2021-05-10 14:09:01,739 INFO  org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] - Pod franz-01-taskmanager-1-1 is created.
2021-05-10 14:09:01,807 INFO  org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] - Received new TaskManager pod: franz-01-taskmanager-1-1
2021-05-10 14:09:01,808 INFO  org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Requested worker franz-01-taskmanager-1-1 with resource spec WorkerResourceSpec {cpuCores=1.0, taskHeapSize=384.000mb (402653174 bytes), taskOffHeapSize=0 bytes, networkMemSize=128.000mb (134217730 bytes), managedMemSize=512.000mb (536870920 bytes)}.

Help appreciated. Thanks!

