Pulling in Yang Wang, who may be able to shed some light on the matter.
You could also have a look at
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Native-kubernetes-setup-failed-to-start-job-td39066.html
While the issue was not actually resolved there, it may give some hints.
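In the TaskManager log quoted further down, the recurring failure is a java.net.UnknownHostException for franz-01.default, so one thing worth checking is whether that name resolves at all from inside the TaskManager pod. Here is a minimal sketch of such a check in Python. It is not part of Flink: the helper name resolve is made up for illustration, and it assumes a Python interpreter is available wherever you run it.

```python
import socket


def resolve(host, port, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses `host` resolves to, or [] on failure.

    `resolve` is a hypothetical helper for illustration, not a Flink API.
    `resolver` can be swapped out so the logic can be exercised without DNS.
    """
    try:
        infos = resolver(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Python's counterpart of the UnknownHostException seen in the log.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("franz-01.default", 6123) (host and port taken from the log) from inside the TaskManager pod should return at least one address. If it returns an empty list there, that would point towards cluster DNS or the JobManager service rather than the job itself.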
On 5/10/2021 4:40 PM, Valentin Wallyn wrote:
Hi,
I'm trying to use Flink on native kubernetes
(https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/)
but I get an error even with the example from the documentation.
The job gets submitted but stays in "created" status until it times out
after 5 minutes. In the task manager log, I can see that the
error is "Could not resolve ResourceManager address".
What could be the issue?
Here are the logs:
> ./bin/kubernetes-session.sh -Dkubernetes.cluster-id=franz-01
2021-05-10 16:05:00,392 INFO
org.apache.flink.configuration.GlobalConfiguration [] -
Loading configuration property: jobmanager.rpc.address, localhost
2021-05-10 16:05:00,395 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.rpc.port, 6123
2021-05-10 16:05:00,395 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.memory.process.size, 1600m
2021-05-10 16:05:00,395 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: taskmanager.memory.process.size, 1728m
2021-05-10 16:05:00,395 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2021-05-10 16:05:00,395 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: parallelism.default, 1
2021-05-10 16:05:00,396 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.execution.failover-strategy, region
2021-05-10 16:05:00,432 INFO
org.apache.flink.client.deployment.DefaultClusterClientServiceLoader
[] - Could not load factory due to missing dependencies.
2021-05-10 16:05:02,680 INFO
org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] -
The derived from fraction jvm overhead memory (160.000mb (167772162
bytes)) is less than its min value 192.000mb (201326592 bytes), min
value will be used instead
2021-05-10 16:05:02,690 INFO
org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] -
The derived from fraction jvm overhead memory (172.800mb (181193935
bytes)) is less than its min value 192.000mb (201326592 bytes), min
value will be used instead
2021-05-10 16:05:02,699 INFO
org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes
deployment requires a fixed port. Configuration blob.server.port will
be set to 6124
2021-05-10 16:05:02,700 INFO
org.apache.flink.kubernetes.utils.KubernetesUtils [] - Kubernetes
deployment requires a fixed port. Configuration taskmanager.rpc.port
will be set to 6122
2021-05-10 16:05:02,760 INFO
org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] -
The derived from fraction jvm overhead memory (160.000mb (167772162
bytes)) is less than its min value 192.000mb (201326592 bytes), min
value will be used instead
2021-05-10 16:05:05,440 INFO
org.apache.flink.kubernetes.KubernetesClusterDescriptor [] -
Create flink session cluster franz-01 successfully, JobManager Web
Interface: http://xxx:8081
*Task Manager logs*
2021-05-10 14:09:05,463 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.framework.off-heap.size=134217728b
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.network.max=134217730b
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.network.min=134217730b
2021-05-10 14:09:05,464 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.framework.heap.size=134217728b
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.managed.size=536870920b
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.cpu.cores=1.0
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,465 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.task.heap.size=402653174b
2021-05-10 14:09:05,466 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,466 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.task.off-heap.size=0b
2021-05-10 14:09:05,466 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,467 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.jvm-metaspace.size=268435456b
2021-05-10 14:09:05,467 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,467 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.jvm-overhead.max=201326592b
2021-05-10 14:09:05,470 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -D
2021-05-10 14:09:05,470 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - taskmanager.memory.jvm-overhead.min=201326592b
2021-05-10 14:09:05,470 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - --configDir
2021-05-10 14:09:05,470 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - /opt/flink/conf
2021-05-10 14:09:05,470 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Dtaskmanager.resource-id=franz-01-taskmanager-1-1
2021-05-10 14:09:05,471 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Djobmanager.memory.off-heap.size=134217728b
2021-05-10 14:09:05,471 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Djobmanager.memory.jvm-overhead.min=201326592b
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Dweb.tmpdir=/tmp/flink-web-e60a7b21-4e2b-4b6c-a0ac-5b08816edcee
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Djobmanager.memory.jvm-metaspace.size=268435456b
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Djobmanager.memory.heap.size=1073741824b
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - -Djobmanager.memory.jvm-overhead.max=201326592b
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - Classpath:
/opt/flink/lib/flink-csv-1.12.2.jar:/opt/flink/lib/flink-json-1.12.2.jar:/opt/flink/lib/flink-shaded-zookeeper-3.4.14.jar:/opt/flink/lib/flink-table-blink_2.12-1.12.2.jar:/opt/flink/lib/flink-table_2.12-1.12.2.jar:/opt/flink/lib/log4j-1.2-api-2.12.1.jar:/opt/flink/lib/log4j-api-2.12.1.jar:/opt/flink/lib/log4j-core-2.12.1.jar:/opt/flink/lib/log4j-slf4j-impl-2.12.1.jar:/opt/flink/lib/flink-dist_2.12-1.12.2.jar:::
2021-05-10 14:09:05,472 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] -
--------------------------------------------------------------------------------
2021-05-10 14:09:05,475 INFO
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
[] - Registered UNIX signal handlers for [TERM, HUP, INT]
2021-05-10 14:09:05,510 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: blob.server.port, 6124
2021-05-10 14:09:05,511 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: taskmanager.memory.process.size, 1728m
2021-05-10 14:09:05,511 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property:
kubernetes.internal.jobmanager.entrypoint.class,
org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint
2021-05-10 14:09:05,513 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.execution.failover-strategy, region
2021-05-10 14:09:05,514 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.rpc.address, franz-01.default
2021-05-10 14:09:05,514 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: execution.target, kubernetes-session
2021-05-10 14:09:05,515 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.memory.process.size, 1600m
2021-05-10 14:09:05,516 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: jobmanager.rpc.port, 6123
2021-05-10 14:09:05,516 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: kubernetes.cluster-id, franz-01
2021-05-10 14:09:05,516 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: taskmanager.rpc.port, 6122
2021-05-10 14:09:05,517 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: internal.cluster.execution-mode, NORMAL
2021-05-10 14:09:05,517 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: parallelism.default, 1
2021-05-10 14:09:05,519 INFO
org.apache.flink.configuration.GlobalConfiguration [] - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2021-05-10 14:09:05,658 INFO org.apache.flink.core.fs.FileSystem
[] - Hadoop is not in the classpath/dependencies. The extended set of
supported File Systems via Hadoop is not available.
2021-05-10 14:09:05,733 INFO
org.apache.flink.runtime.security.modules.HadoopModuleFactory [] -
Cannot create Hadoop Security Module because Hadoop cannot be found in
the Classpath.
2021-05-10 14:09:05,738 INFO
org.apache.flink.runtime.security.modules.JaasModule [] - Jaas
file will be created as /tmp/jaas-3361029581556571704.conf.
2021-05-10 14:09:05,744 INFO
org.apache.flink.runtime.security.contexts.HadoopSecurityContextFactory
[] - Cannot install HadoopSecurityContext because Hadoop cannot be
found in the Classpath.
2021-05-10 14:09:05,811 INFO
org.apache.flink.configuration.Configuration [] - Config uses
fallback configuration key 'jobmanager.rpc.address' instead of key
'rest.address'
2021-05-10 14:09:05,855 INFO
org.apache.flink.runtime.util.LeaderRetrievalUtils [] - Trying to
select the network interface and address to use by connecting to the
leading JobManager.
2021-05-10 14:09:05,855 INFO
org.apache.flink.runtime.util.LeaderRetrievalUtils [] -
TaskManager will try to connect for PT10S before falling back to
heuristics
2021-05-10 14:09:26,116 WARN
org.apache.flink.runtime.net.ConnectionUtils [] - Could not
connect to franz-01.default:6123. Selecting a local address using
heuristics.
2021-05-10 14:09:26,116 WARN
org.apache.flink.runtime.net.ConnectionUtils [] - Could not find
any IPv4 address that is not loopback or link-local. Using localhost
address.
2021-05-10 14:09:26,117 INFO
org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] -
TaskManager will use hostname/address 'franz-01-taskmanager-1-1'
(10.2.2.37) for communication.
2021-05-10 14:09:26,136 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Trying
to start actor system, external address 10.2.2.37:6122, bind address 0.0.0.0:6122.
2021-05-10 14:09:27,212 INFO akka.event.slf4j.Slf4jLogger
[] - Slf4jLogger started
2021-05-10 14:09:27,283 INFO akka.remote.Remoting
[] - Starting remoting
2021-05-10 14:09:27,586 INFO akka.remote.Remoting
[] - Remoting started; listening on addresses
:[akka.tcp://flink@10.2.2.37:6122]
2021-05-10 14:09:27,730 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Actor
system started at akka.tcp://flink@10.2.2.37:6122
2021-05-10 14:09:27,781 INFO
org.apache.flink.runtime.metrics.MetricRegistryImpl [] - No
metrics reporter configured, no metrics will be exposed/reported.
2021-05-10 14:09:27,786 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Trying
to start actor system, external address 10.2.2.37:0, bind address 0.0.0.0:0.
2021-05-10 14:09:27,814 INFO akka.event.slf4j.Slf4jLogger
[] - Slf4jLogger started
2021-05-10 14:09:27,819 INFO akka.remote.Remoting
[] - Starting remoting
2021-05-10 14:09:27,881 INFO akka.remote.Remoting
[] - Remoting started; listening on addresses
:[akka.tcp://flink-metrics@10.2.2.37:39177]
2021-05-10 14:09:27,895 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils [] - Actor
system started at akka.tcp://flink-metrics@10.2.2.37:39177
2021-05-10 14:09:27,916 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting
RPC endpoint for
org.apache.flink.runtime.metrics.dump.MetricQueryService at
akka://flink-metrics/user/rpc/MetricQueryService_franz-01-taskmanager-1-1
.
2021-05-10 14:09:27,931 INFO
org.apache.flink.runtime.blob.PermanentBlobCache [] - Created
BLOB cache storage directory
/tmp/blobStore-16255e13-c39a-442f-853a-cd1e331e7325
2021-05-10 14:09:27,934 INFO
org.apache.flink.runtime.blob.TransientBlobCache [] - Created
BLOB cache storage directory
/tmp/blobStore-5ac02374-808a-4529-b80c-088dbeac2711
2021-05-10 14:09:27,955 INFO
org.apache.flink.runtime.externalresource.ExternalResourceUtils [] -
Enabled external resources: []
2021-05-10 14:09:27,955 INFO
org.apache.flink.runtime.externalresource.ExternalResourceUtils [] -
Enabled external resources: []
2021-05-10 14:09:27,955 INFO
org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] -
Starting TaskManager with ResourceID: franz-01-taskmanager-1-1
2021-05-10 14:09:27,990 INFO
org.apache.flink.runtime.taskexecutor.TaskManagerServices [] -
Temporary file directory '/tmp': total 48 GB, usable 38 GB (79.17% usable)
2021-05-10 14:09:27,997 INFO
org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] -
FileChannelManager uses directory
/tmp/flink-io-c08780a7-90bd-4259-8f51-8a24d95c21df for spill files.
2021-05-10 14:09:28,059 INFO
org.apache.flink.runtime.io.network.netty.NettyConfig [] -
NettyConfig [server address: /0.0.0.0, server port:
0, ssl enabled: false, memory segment size (bytes): 32768, transport
type: AUTO, number of server threads: 1 (manual), number of client
threads: 1 (manual), server connect backlog: 0 (use Netty's default),
client connect timeout (sec): 120, send/receive buffer size (bytes): 0
(use Netty's default)]
2021-05-10 14:09:28,063 INFO
org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] -
FileChannelManager uses directory
/tmp/flink-netty-shuffle-209ae6cc-6fd5-4c9e-b6df-acc675a6995c for
spill files.
2021-05-10 14:09:28,578 INFO
org.apache.flink.runtime.io.network.buffer.NetworkBufferPool [] -
Allocated 128 MB for network buffer pool (number of memory segments:
4096, bytes per segment: 32768).
2021-05-10 14:09:28,594 INFO
org.apache.flink.runtime.io.network.NettyShuffleEnvironment [] -
Starting the network environment and its components.
2021-05-10 14:09:28,789 INFO
org.apache.flink.runtime.io.network.netty.NettyClient [] -
Transport type 'auto': using EPOLL.
2021-05-10 14:09:28,791 INFO
org.apache.flink.runtime.io.network.netty.NettyClient [] -
Successful initialization (took 196 ms).
2021-05-10 14:09:28,796 INFO
org.apache.flink.runtime.io.network.netty.NettyServer [] -
Transport type 'auto': using EPOLL.
2021-05-10 14:09:28,892 INFO
org.apache.flink.runtime.io.network.netty.NettyServer [] -
Successful initialization (took 99 ms). Listening on SocketAddress
/0:0:0:0:0:0:0:0%0:40399.
2021-05-10 14:09:28,894 INFO
org.apache.flink.runtime.taskexecutor.KvStateService [] -
Starting the kvState service and its components.
2021-05-10 14:09:28,979 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting
RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at
akka://flink/user/rpc/taskmanager_0 .
2021-05-10 14:09:29,002 INFO
org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] -
Start job leader service.
2021-05-10 14:09:29,005 INFO
org.apache.flink.runtime.filecache.FileCache [] - User file cache
uses directory /tmp/flink-dist-cache-bc340200-15c9-4d0a-950a-f43469bdb58d
2021-05-10 14:09:29,055 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] -
Connecting to ResourceManager
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000).
2021-05-10 14:09:29,276 WARN akka.remote.ReliableDeliverySupervisor
[] - Association with remote system
[akka.tcp://flink@franz-01.default:6123] has failed, address is now
gated for [50] ms. Reason: [Association failed with
[akka.tcp://flink@franz-01.default:6123]] Caused by:
[java.net.UnknownHostException: franz-01.default]
2021-05-10 14:09:29,289 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:09:49,314 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:09:59,325 WARN akka.remote.ReliableDeliverySupervisor
[] - Association with remote system
[akka.tcp://flink@franz-01.default:6123] has failed, address is now
gated for [50] ms. Reason: [Association failed with
[akka.tcp://flink@franz-01.default:6123]] Caused by:
[java.net.UnknownHostException: franz-01.default: Temporary failure in
name resolution]
2021-05-10 14:09:59,327 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:19,365 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:29,363 WARN akka.remote.ReliableDeliverySupervisor
[] - Association with remote system
[akka.tcp://flink@franz-01.default:6123] has failed, address is now
gated for [50] ms. Reason: [Association failed with
[akka.tcp://flink@franz-01.default:6123]] Caused by:
[java.net.UnknownHostException: franz-01.default: Temporary failure in
name resolution]
2021-05-10 14:10:29,385 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:49,425 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
2021-05-10 14:10:59,423 WARN akka.remote.ReliableDeliverySupervisor
[] - Association with remote system
[akka.tcp://flink@franz-01.default:6123] has failed, address is now
gated for [50] ms. Reason: [Association failed with
[akka.tcp://flink@franz-01.default:6123]] Caused by:
[java.net.UnknownHostException: franz-01.default: Temporary failure in
name resolution]
2021-05-10 14:10:59,445 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
resolve ResourceManager address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*,
retrying in 10000 ms: Could not connect to rpc endpoint under address
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*.
*Job Manager logs*
2021-05-10 14:09:00,393 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] -
Received JobGraph submission a63f806ba9a172b728395266a6dc41fe (Flink
Streaming Job).
2021-05-10 14:09:00,395 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] -
Submitting job a63f806ba9a172b728395266a6dc41fe (Flink Streaming Job).
2021-05-10 14:09:00,524 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting
RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at
akka://flink/user/rpc/jobmanager_2 .
2021-05-10 14:09:00,537 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing
job Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,612 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart
back off time strategy NoRestartBackoffTimeStrategy for Flink
Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,665 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Running
initialization on master for job Flink Streaming Job
(a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,666 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Successfully
ran initialization on master in 0 ms.
2021-05-10 14:09:00,707 INFO
org.apache.flink.runtime.scheduler.adapter.DefaultExecutionTopology
[] - Built 1 pipelined regions in 15 ms
2021-05-10 14:09:00,742 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - No state
backend has been configured, using default (Memory / JobManager)
MemoryStateBackend (data in heap memory / checkpoints to JobManager)
(checkpoints: 'null', savepoints: 'null', asynchronous: TRUE,
maxStateSize: 5242880)
2021-05-10 14:09:00,823 INFO
org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No
checkpoint found during restore.
2021-05-10 14:09:00,830 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Using failover
strategy
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy@43519311
for Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe).
2021-05-10 14:09:00,844 INFO
org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl [] -
JobManager runner for job Flink Streaming Job
(a63f806ba9a172b728395266a6dc41fe) was granted leadership with session
id 00000000-0000-0000-0000-000000000000 at
akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2.
2021-05-10 14:09:00,848 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Starting
execution of job Flink Streaming Job
(a63f806ba9a172b728395266a6dc41fe) under job master id
00000000000000000000000000000000.
2021-05-10 14:09:00,851 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Starting
scheduling with scheduling strategy
[org.apache.flink.runtime.scheduler.strategy.PipelinedRegionSchedulingStrategy]
2021-05-10 14:09:00,852 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job
Flink Streaming Job (a63f806ba9a172b728395266a6dc41fe) switched from
state CREATED to RUNNING.
2021-05-10 14:09:00,912 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] -
Source: Custom Source -> Filter -> Timestamps/Watermarks (1/1)
(b10791bc97d1d772bd443abd92bf32c0) switched from CREATED to SCHEDULED.
2021-05-10 14:09:00,913 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph [] -
Window(ProcessingTimeSessionWindows(60000), ProcessingTimeTrigger,
SessionAggregate, PassThroughWindowFunction) -> Sink: Unnamed (1/1)
(9ee57af7f96b318d16fb0784a693b481) switched from CREATED to SCHEDULED.
2021-05-10 14:09:00,928 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] -
Cannot serve slot request, no ResourceManager connected. Adding as
pending request [SlotRequestId{045786d740e63f4a986dc2024be7b3fc}]
2021-05-10 14:09:00,939 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Connecting to
ResourceManager
akka.tcp://flink@franz-01.default:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000)
2021-05-10 14:09:00,947 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - Resolved
ResourceManager address, beginning registration
2021-05-10 14:09:00,949 INFO
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager
[] - Registering job manager
00000000000000000000000000000...@akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2
for job a63f806ba9a172b728395266a6dc41fe.
2021-05-10 14:09:01,009 INFO
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager
[] - Registered job manager
00000000000000000000000000000...@akka.tcp://flink@franz-01.default:6123/user/rpc/jobmanager_2
for job a63f806ba9a172b728395266a6dc41fe.
2021-05-10 14:09:01,016 INFO
org.apache.flink.runtime.jobmaster.JobMaster [] - JobManager
successfully registered at ResourceManager, leader id:
00000000000000000000000000000000.
2021-05-10 14:09:01,018 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] -
Requesting new slot [SlotRequestId{045786d740e63f4a986dc2024be7b3fc}]
and profile ResourceProfile{UNKNOWN} with allocation id
be6a056136c6dec065af876bda1f6dd5 from resource manager.
2021-05-10 14:09:01,020 INFO
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager
[] - Request slot with profile ResourceProfile{UNKNOWN} for job
a63f806ba9a172b728395266a6dc41fe with allocation id
be6a056136c6dec065af876bda1f6dd5.
2021-05-10 14:09:01,029 INFO
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager
[] - Requesting new worker with resource spec WorkerResourceSpec
{cpuCores=1.0, taskHeapSize=384.000mb (402653174 bytes),
taskOffHeapSize=0 bytes, networkMemSize=128.000mb (134217730 bytes),
managedMemSize=512.000mb (536870920 bytes)}, current pending count: 1.
2021-05-10 14:09:01,035 INFO
org.apache.flink.runtime.externalresource.ExternalResourceUtils [] -
Enabled external resources: []
2021-05-10 14:09:01,414 INFO
org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] -
Creating new TaskManager pod with name franz-01-taskmanager-1-1 and
resource <1728,1.0>.
2021-05-10 14:09:01,739 INFO
org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] - Pod
franz-01-taskmanager-1-1 is created.
2021-05-10 14:09:01,807 INFO
org.apache.flink.kubernetes.KubernetesResourceManagerDriver [] -
Received new TaskManager pod: franz-01-taskmanager-1-1
2021-05-10 14:09:01,808 INFO
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager
[] - Requested worker franz-01-taskmanager-1-1 with resource spec
WorkerResourceSpec {cpuCores=1.0, taskHeapSize=384.000mb (402653174
bytes), taskOffHeapSize=0 bytes, networkMemSize=128.000mb (134217730
bytes), managedMemSize=512.000mb (536870920 bytes)}.
Help appreciated. Thanks!