Hi Yang,

Thanks again for your help so far. I tried your suggestion, still with no luck.
Attached are the logs; please let me know if there is more I should send.

Best,
kevin

On 2020/06/08 03:02:40, Yang Wang <d...@gmail.com> wrote:
> Hi Kevin,
>
> It may be because of the character length limit of K8s (no more than 63) [1], so the pod name cannot be too long. I notice that you are using the cluster-id that the client generates automatically. It may cause problems; could you set a meaningful cluster-id for your Flink session? For example,
>
> kubernetes-session.sh ... -Dkubernetes.cluster-id=my-flink-k8s-session
>
> This behavior has been improved in Flink 1.11 to check the length on the client side before submission.
>
> If it still does not work, could you share your full command and jobmanager logs? It will help a lot to find the root cause.
>
> [1] https://stackoverflow.com/questions/50412837/kubernetes-label-name-63-character-limit
>
> Best,
> Yang
>
> kb <ke...@comcast.com> wrote on Sat, Jun 6, 2020 at 1:00 AM:
>
> > Thanks Yang for the suggestion, I have tried it and I'm still getting the same exception. Is it possible it's due to the null pod name? Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
> >
> > Best,
> > kevin
> >
> > --
> > Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Best,
kevin
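For reference, the 63-character limit mentioned in the quoted reply can be checked locally before submitting. The following is only a minimal sketch, not Flink's own validation: `fits_k8s_label` and `taskmanager_pod_name` are hypothetical helpers, the `<cluster-id>-taskmanager-<attempt>-<index>` pattern is inferred from the pod name `ledger-flink-session-taskmanager-1-1` in the jobmanager log, and the check applies the DNS-1123 label rule, which may be stricter than what every Kubernetes field requires.

```python
import re

# DNS-1123 labels are capped at 63 characters.
K8S_LABEL_MAX_LEN = 63

# DNS-1123 label: lowercase alphanumerics or '-', starting and ending
# with an alphanumeric character.
_DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def fits_k8s_label(name: str) -> bool:
    """Return True if `name` is a valid DNS-1123 label of at most 63 characters."""
    return (
        len(name) <= K8S_LABEL_MAX_LEN
        and _DNS_1123_LABEL.fullmatch(name) is not None
    )


def taskmanager_pod_name(cluster_id: str, attempt: int, index: int) -> str:
    """Hypothetical helper: the naming pattern is assumed from the log output,
    not taken from Flink's source code."""
    return f"{cluster_id}-taskmanager-{attempt}-{index}"


pod_name = taskmanager_pod_name("ledger-flink-session", 1, 1)
print(pod_name, len(pod_name), fits_k8s_label(pod_name))
```

With the cluster-id used in this session, the derived name is 36 characters long, well under the limit, so under these assumptions the name length does not look like the cause here.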
[REDACTED flink]$ kubectl create serviceaccount svc-flink
serviceaccount/svc-flink created
[REDACTED flink]$ kubectl create clusterrolebinding svc-flink-role-binding --clusterrole=cluster-admin --serviceaccount=default:svc-flink
clusterrolebinding.rbac.authorization.k8s.io/svc-flink-role-binding created
[REDACTED flink]$ ./flink-1.10.1/bin/kubernetes-session.sh -Dkubernetes.jobmanager.service-account=svc-flink -Dcontainerized.master.env.HTTP2_DISABLE=true -Dkubernetes.container.image=REDACTED/flink:1.10.1-scala_2.11-s3-3 -Dkubernetes.cluster-id=ledger-flink-session
2020-06-08 14:50:57,215 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2020-06-08 14:50:57,216 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2020-06-08 14:50:57,217 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2020-06-08 14:50:57,217 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 1728m
2020-06-08 14:50:57,217 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-08 14:50:57,217 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2020-06-08 14:50:57,218 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region
2020-06-08 14:50:58,542 INFO org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
2020-06-08 14:50:58,550 INFO org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes deployment requires a fixed port. Configuration blob.server.port will be set to 6124
2020-06-08 14:50:58,551 INFO org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes deployment requires a fixed port. Configuration taskmanager.rpc.port will be set to 6122
2020-06-08 14:50:59,532 INFO org.apache.flink.kubernetes.KubernetesClusterDescriptor - Create flink session cluster ledger-flink-session successfully, JobManager Web Interface: REDACTED
[REDACTED flink]$ kubectl get services | fgrep flink
ledger-flink-session ClusterIP REDACTED <none> 8081/TCP,6123/TCP,6124/TCP 24s
ledger-flink-session-rest LoadBalancer REDACTED <pending> 8081:32379/TCP 24s
[REDACTED flink]$ kubectl get pods | fgrep flink
ledger-flink-session-7bf95b68b5-tsfw4 1/1 Running 0 6s
[REDACTED flink]$ nohup kubectl port-forward service/ledger-flink-session 8081 &
[1] 16722
[REDACTED flink]$ nohup: ignoring input and appending output to ‘nohup.out’
[REDACTED flink]$ kubectl exec -it ledger-flink-session-7bf95b68b5-tsfw4 bash
root@ledger-flink-session-7bf95b68b5-tsfw4:/opt/flink# cat log/jobmanager.log
2020-06-08 14:51:31,391 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2020-06-08 14:51:31,393 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting KubernetesSessionClusterEntrypoint (Version: 1.10.1, Rev:c5915cf, Date:07.05.2020 @ 13:58:51 CST)
2020-06-08 14:51:31,393 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current user: root
2020-06-08 14:51:31,393 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current Hadoop/Kerberos user: <no hadoop dependency found>
2020-06-08 14:51:31,393 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.252-b09
2020-06-08 14:51:31,393 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum heap size: 409 MiBytes
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME: /usr/local/openjdk-8 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop Dependency available 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options: 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms424m 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx424m 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog.file=/opt/flink/log/jobmanager.log 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlogback.configurationFile=file:/opt/flink/conf/logback.xml 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog4j.configuration=file:/opt/flink/conf/log4j.properties 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program Arguments: (none) 2020-06-08 14:51:31,394 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath: /opt/flink/lib/flink-table-blink_2.11-1.10.1.jar:/opt/flink/lib/flink-table_2.11-1.10.1.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.15.jar:/opt/flink/lib/flink-dist_2.11-1.10.1.jar::: 2020-06-08 14:51:31,395 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -------------------------------------------------------------------------------- 2020-06-08 14:51:31,396 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX signal handlers for [TERM, HUP, INT] 2020-06-08 14:51:31,405 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: blob.server.port, 6124 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 1728m 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - 
Loading configuration property: kubernetes.internal.jobmanager.entrypoint.class, org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, ledger-flink-session.default 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.internal.service.id, 76db5ed3-a997-11ea-a79c-0238474ce95c 2020-06-08 14:51:31,406 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: execution.target, kubernetes-session 2020-06-08 14:51:31,407 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123 2020-06-08 14:51:31,407 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.cluster-id, ledger-flink-session 2020-06-08 14:51:31,407 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.rpc.port, 6122 2020-06-08 14:51:31,407 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: containerized.master.env.HTTP2_DISABLE, true 2020-06-08 14:51:31,407 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: internal.cluster.execution-mode, NORMAL 2020-06-08 14:51:31,408 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.container.image, REDACTED/flink:1.10.1-scala_2.11-s3-3 2020-06-08 14:51:31,408 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1 2020-06-08 14:51:31,408 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: 
taskmanager.numberOfTaskSlots, 1 2020-06-08 14:51:31,408 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.jobmanager.service-account, svc-flink 2020-06-08 14:51:31,408 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m 2020-06-08 14:51:31,611 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting KubernetesSessionClusterEntrypoint. 2020-06-08 14:51:31,611 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install default filesystem. 2020-06-08 14:51:31,711 INFO org.apache.flink.core.fs.FileSystem - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available. 2020-06-08 14:51:31,795 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install security context. 2020-06-08 14:51:31,804 INFO org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath. 2020-06-08 14:51:31,812 INFO org.apache.flink.runtime.security.modules.JaasModule - Jaas file will be created as /tmp/jaas-4193678875133969299.conf. 2020-06-08 14:51:31,815 INFO org.apache.flink.runtime.security.SecurityUtils - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath. 2020-06-08 14:51:31,816 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing cluster services. 
2020-06-08 14:51:31,886 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start actor system at ledger-flink-session.default:6123 2020-06-08 14:51:32,968 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started 2020-06-08 14:51:32,993 INFO akka.remote.Remoting - Starting remoting 2020-06-08 14:51:33,193 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@ledger-flink-session.default:6123] 2020-06-08 14:51:33,299 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink@ledger-flink-session.default:6123 2020-06-08 14:51:33,393 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address' 2020-06-08 14:51:33,469 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-a99afe14-0228-4ce4-908a-40916fed9148 2020-06-08 14:51:33,472 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:6124 - max concurrent requests: 50 - max backlog: 1000 2020-06-08 14:51:33,483 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported. 
2020-06-08 14:51:33,486 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start actor system at ledger-flink-session.default:0 2020-06-08 14:51:33,567 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started 2020-06-08 14:51:33,576 INFO akka.remote.Remoting - Starting remoting 2020-06-08 14:51:33,583 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink-metrics@ledger-flink-session.default:45069] 2020-06-08 14:51:33,595 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink-metrics@ledger-flink-session.default:45069 2020-06-08 14:51:33,604 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService at akka://flink-metrics/user/MetricQueryService . 2020-06-08 14:51:33,781 INFO org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore - Initializing FileArchivedExecutionGraphStore: Storage directory /tmp/executionGraphStore-b4cce8ce-d743-4aeb-9d26-93dcfd6da8b8, expiration time 3600000, maximum cache size 52428800 bytes. 2020-06-08 14:51:33,824 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address' 2020-06-08 14:51:33,825 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload directory /tmp/flink-web-a2fb24fe-f8c0-4d79-bbd7-db57b8dd3796/flink-web-upload does not exist. 2020-06-08 14:51:33,826 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created directory /tmp/flink-web-a2fb24fe-f8c0-4d79-bbd7-db57b8dd3796/flink-web-upload for file uploads. 2020-06-08 14:51:33,827 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting rest endpoint. 
2020-06-08 14:51:34,271 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component log file: /opt/flink/log/jobmanager.log 2020-06-08 14:51:34,271 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component stdout file: /opt/flink/log/jobmanager.out 2020-06-08 14:51:34,600 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest endpoint listening at ledger-flink-session.default:8081 2020-06-08 14:51:34,601 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://ledger-flink-session.default:8081 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000 2020-06-08 14:51:34,667 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend listening at http://ledger-flink-session.default:8081. 2020-06-08 14:51:35,284 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.kubernetes.KubernetesResourceManager at akka://flink/user/resourcemanager . 
2020-06-08 14:51:35,370 INFO org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: blob.server.port, 6124 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.process.size, 1728m 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.internal.jobmanager.entrypoint.class, org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, ledger-flink-session.default 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.internal.service.id, 76db5ed3-a997-11ea-a79c-0238474ce95c 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: execution.target, kubernetes-session 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.cluster-id, ledger-flink-session 2020-06-08 14:51:35,374 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.rpc.port, 6122 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration 
property: containerized.master.env.HTTP2_DISABLE, true 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: internal.cluster.execution-mode, NORMAL 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.container.image, REDACTED/flink:1.10.1-scala_2.11-s3-3 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: kubernetes.jobmanager.service-account, svc-flink 2020-06-08 14:51:35,375 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m 2020-06-08 14:51:35,396 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess - Start SessionDispatcherLeaderProcess. 2020-06-08 14:51:35,399 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess - Recover all persisted job graphs. 2020-06-08 14:51:35,400 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess - Successfully recovered 0 persisted job graphs. 2020-06-08 14:51:35,405 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher . 2020-06-08 14:51:37,071 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Recovered 0 pods from previous attempts, current attempt id is 1. 
2020-06-08 14:51:37,090 INFO org.apache.flink.kubernetes.KubernetesResourceManager - ResourceManager akka.tcp://flink@ledger-flink-session.default:6123/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000 2020-06-08 14:51:37,093 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl - Starting the SlotManager. 2020-06-08 15:00:28,596 WARN org.apache.flink.runtime.webmonitor.handlers.JarRunHandler - Configuring the job submission via query parameters is deprecated. Please migrate to submitting a JSON request instead. 2020-06-08 15:00:29,102 INFO REDACTED.ledger.Flow - Using parameter file classpath:/application.properties 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme zip 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme par 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme res 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tar 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme sar 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tgz 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme war 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tbz2 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme file 2020-06-08 15:00:29,211 INFO 
REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme gz 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tmp 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ear 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ejb3 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme jar 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme bz2 2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ram 2020-06-08 15:00:29,212 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.ClasspathVirtualFileSystem@540bc652 for scheme classpath 2020-06-08 15:00:29,212 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.UrlVirtualFileSystem@26312f62 for scheme http 2020-06-08 15:00:29,213 INFO REDACTED.VirtualFileSystemManager - Registering VFS REDACTED.UrlVirtualFileSystem@26312f62 for scheme https 2020-06-08 15:00:29,387 INFO REDACTED.ledger.Flow - Enabling externalized checkpointing 2020-06-08 15:00:29,389 INFO REDACTED.ledger.Flow - Configuring RocksDB Backend: s3://REDACTED/state 2020-06-08 15:00:29,484 INFO REDACTED.Redis - redis - (cluster=true), connecting to host=REDACTED, port=REDACTED 2020-06-08 15:00:30,187 INFO io.lettuce.core.EpollProvider - Starting with epoll library 2020-06-08 15:00:30,189 INFO io.lettuce.core.KqueueProvider - Starting without optional kqueue library 2020-06-08 15:00:32,687 INFO org.apache.flink.api.java.typeutils.TypeExtractor - class 
org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit does not contain a setter for field modificationTime 2020-06-08 15:00:32,687 INFO org.apache.flink.api.java.typeutils.TypeExtractor - Class class org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit cannot be used as a POJO type because not all fields are valid POJO fields, and must be processed as GenericType. Please read the Flink documentation on "Data Types & Serialization" for details of the effect on performance. 2020-06-08 15:00:33,414 INFO REDACTED.ledger.Flow - Job ID: 7b863a7272e6f37fc62b3300980d9db1 2020-06-08 15:00:37,519 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Received JobGraph submission ac0ba486c934fc663d01e347323c785a (REDACTED). 2020-06-08 15:00:37,520 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Submitting job ac0ba486c934fc663d01e347323c785a (REDACTED). 2020-06-08 15:00:37,570 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/jobmanager_0 . 2020-06-08 15:00:37,578 INFO org.apache.flink.runtime.jobmaster.JobMaster - Initializing job REDACTED (ac0ba486c934fc663d01e347323c785a). 2020-06-08 15:00:37,594 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using restart back off time strategy FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10, backoffTimeMS=60000) for REDACTED (ac0ba486c934fc663d01e347323c785a). 2020-06-08 15:00:37,675 INFO org.apache.flink.runtime.jobmaster.JobMaster - Running initialization on master for job REDACTED (ac0ba486c934fc663d01e347323c785a). 2020-06-08 15:00:37,675 INFO org.apache.flink.runtime.jobmaster.JobMaster - Successfully ran initialization on master in 0 ms. 
2020-06-08 15:00:37,701 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using application-defined state backend: RocksDBStateBackend{checkpointStreamBackend=File State Backend (checkpoints: 's3://REDACTED/state', savepoints: 'null', asynchronous: UNDEFINED, fileStateThreshold: -1), localRocksDbDirectories=null, enableIncrementalCheckpointing=TRUE, numberOfTransferThreads=-1, writeBatchSize=-1} 2020-06-08 15:00:37,701 INFO org.apache.flink.runtime.jobmaster.JobMaster - Configuring application-defined state backend with job/cluster config 2020-06-08 15:00:37,706 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using predefined options: FLASH_SSD_OPTIMIZED. 2020-06-08 15:00:37,707 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using default options factory: DefaultConfigurableOptionsFactory{configuredOptions={}}. 2020-06-08 15:00:38,703 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy - Start building failover regions. 2020-06-08 15:00:38,704 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy - Created 1 failover regions. 2020-06-08 15:00:38,704 INFO org.apache.flink.runtime.jobmaster.JobMaster - Using failover strategy org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy@601936a6 for REDACTED (ac0ba486c934fc663d01e347323c785a). 2020-06-08 15:00:38,706 INFO org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl - JobManager runner for job REDACTED (ac0ba486c934fc663d01e347323c785a) was granted leadership with session id 00000000-0000-0000-0000-000000000000 at akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0. 2020-06-08 15:00:38,710 INFO org.apache.flink.runtime.jobmaster.JobMaster - Starting execution of job REDACTED (ac0ba486c934fc663d01e347323c785a) under job master id 00000000000000000000000000000000. 
2020-06-08 15:00:38,771 INFO org.apache.flink.runtime.jobmaster.JobMaster - Starting scheduling with scheduling strategy [org.apache.flink.runtime.scheduler.strategy.EagerSchedulingStrategy] 2020-06-08 15:00:38,771 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job REDACTED (ac0ba486c934fc663d01e347323c785a) switched from state CREATED to RUNNING. 2020-06-08 15:00:38,781 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Joiner (1/1) (10d1b92f690e3f4dae02fb806300a7d7) switched from CREATED to SCHEDULED. 2020-06-08 15:00:38,794 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{9723c61fe85823a01c57b6b593d05662}] 2020-06-08 15:00:38,803 INFO org.apache.flink.runtime.jobmaster.JobMaster - Connecting to ResourceManager akka.tcp://flink@ledger-flink-session.default:6123/user/resourcemanager(00000000000000000000000000000000) 2020-06-08 15:00:38,808 INFO org.apache.flink.runtime.jobmaster.JobMaster - Resolved ResourceManager address, beginning registration 2020-06-08 15:00:38,808 INFO org.apache.flink.runtime.jobmaster.JobMaster - Registration at ResourceManager attempt 1 (timeout=100ms) 2020-06-08 15:00:38,810 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Registering job manager 00000000000000000000000000000...@akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0 for job ac0ba486c934fc663d01e347323c785a. 2020-06-08 15:00:38,867 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Registered job manager 00000000000000000000000000000...@akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0 for job ac0ba486c934fc663d01e347323c785a. 2020-06-08 15:00:38,870 INFO org.apache.flink.runtime.jobmaster.JobMaster - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000. 
2020-06-08 15:00:38,871 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Requesting new slot [SlotRequestId{9723c61fe85823a01c57b6b593d05662}] and profile ResourceProfile{UNKNOWN} from resource manager.
2020-06-08 15:00:38,872 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Request slot with profile ResourceProfile{UNKNOWN} for job ac0ba486c934fc663d01e347323c785a with allocation id ef56d50dbc6cfa93973089506fc8244c.
2020-06-08 15:00:38,875 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Starting new worker with resource profile, ResourceProfile{UNKNOWN}
2020-06-08 15:00:38,875 INFO org.apache.flink.kubernetes.KubernetesResourceManager - Requesting new TaskManager pod with <1728,1.0>. Number pending requests 1.
2020-06-08 15:00:38,877 INFO org.apache.flink.kubernetes.KubernetesResourceManager - TaskManager ledger-flink-session-taskmanager-1-1 will be started with TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728 bytes), frameworkOffHeapSize=128.000mb (134217728 bytes), taskHeapSize=384.000mb (402653174 bytes), taskOffHeapSize=0 bytes, networkMemSize=128.000mb (134217730 bytes), managedMemorySize=512.000mb (536870920 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes), jvmOverheadSize=192.000mb (201326592 bytes)}.
2020-06-08 15:00:49,179 ERROR org.apache.flink.kubernetes.KubernetesResourceManager - Could not start TaskManager in pod ledger-flink-session-taskmanager-1-1.
java.util.concurrent.CompletionException: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:331)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:324)
    at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$0(Fabric8FlinkKubeClient.java:184)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
    ... 3 more
Caused by: java.net.SocketTimeoutException: timeout
    at org.apache.flink.kubernetes.shadded.okio.Okio$4.newTimeoutException(Okio.java:232)
    at org.apache.flink.kubernetes.shadded.okio.AsyncTimeout.exit(AsyncTimeout.java:285)
    at org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
    at org.apache.flink.kubernetes.shadded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:354)
    at org.apache.flink.kubernetes.shadded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:226)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:215)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:110)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at org.apache.flink.kubernetes.shadded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
    at org.apache.flink.kubernetes.shadded.okhttp3.RealCall.execute(RealCall.java:92)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:411)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:241)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:798)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:328)
    ... 6 more
Caused by: java.net.SocketException: Socket closed
    at java.net.SocketInputStream.read(SocketInputStream.java:204)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:990)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:948)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    at org.apache.flink.kubernetes.shadded.okio.Okio$2.read(Okio.java:140)
    at org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
    ... 39 more

... Repeats from "Requesting new TaskManager pod with <1728,1.0>. Number pending requests 1."