environment: flinksql 1.12.2
k8s session mode description: I got follow error log when my kafka connector port was wrong >>>>> 2021-04-25 16:49:50 org.apache.kafka.common.errors.TimeoutException: Timeout of 60000ms expired before the position for partition filebeat_json_install_log-3 could be determined >>>>> I got follow error log when my kafka connector ip was wrong >>>>> 2021-04-25 20:12:53 org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:118) at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:80) at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:233) at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:224) at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:215) at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:669) at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:89) at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:447) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at akka.actor.Actor$class.aroundReceive(Actor.scala:517) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at akka.actor.ActorCell.invoke(ActorCell.scala:561) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at akka.dispatch.Mailbox.run(Mailbox.scala:225) at akka.dispatch.Mailbox.exec(Mailbox.scala:235) at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata >>>>> When the job was cancelled,there was follow error log: >>>>> 2021-04-25 08:53:41,115 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job v2_ods_device_action_log (fcc451b8a521398b10e5b86153141fbf) switched from state CANCELLING to CANCELED. 2021-04-25 08:53:41,115 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Stopping checkpoint coordinator for job fcc451b8a521398b10e5b86153141fbf. 2021-04-25 08:53:41,115 INFO org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - Shutting down 2021-04-25 08:53:41,115 INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint with ID 1 at 'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-1' not discarded. 2021-04-25 08:53:41,115 INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint with ID 2 at 'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-2' not discarded. 2021-04-25 08:53:41,116 INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint [] - Checkpoint with ID 3 at 'oss://tanwan-datahub/test/flinksql/checkpoints/fcc451b8a521398b10e5b86153141fbf/chk-3' not discarded. 2021-04-25 08:53:41,137 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job fcc451b8a521398b10e5b86153141fbf reached globally terminal state CANCELED. 2021-04-25 08:53:41,148 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Stopping the JobMaster for job v2_ods_device_action_log(fcc451b8a521398b10e5b86153141fbf). 2021-04-25 08:53:41,151 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Suspending SlotPool. 2021-04-25 08:53:41,151 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Close ResourceManager connection 5bdeb8d0f65a90ecdfafd7f102050b19: JobManager is shutting down.. 2021-04-25 08:53:41,151 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Stopping SlotPool. 2021-04-25 08:53:41,151 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 00000000000000000000000000000...@akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_3 for job fcc451b8a521398b10e5b86153141fbf from the resource manager. 2021-04-25 08:53:41,178 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Could not archive completed job v2_ods_device_action_log(fcc451b8a521398b10e5b86153141fbf) to the history server. java.util.concurrent.CompletionException: java.lang.ExceptionInInitializerError at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_265] at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) [?:1.8.0_265] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643) [?:1.8.0_265] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265] Caused by: java.lang.ExceptionInInitializerError at org.apache.flink.runtime.dispatcher.JsonResponseHistoryServerArchivist.lambda$archiveExecutionGraph$0(JsonResponseHistoryServerArchivist.java:55) ~[flink-dist_2.11-1.11.2.jar:1.11.2] at org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50) ~[template-common-jar-0.0.1.jar:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) ~[?:1.8.0_265] ... 3 more Caused by: java.lang.IllegalStateException: zip file closed at java.util.zip.ZipFile.ensureOpen(ZipFile.java:686) ~[?:1.8.0_265] at java.util.zip.ZipFile.getEntry(ZipFile.java:315) ~[?:1.8.0_265] at java.util.jar.JarFile.getEntry(JarFile.java:240) ~[?:1.8.0_265] at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128) ~[?:1.8.0_265] at java.util.jar.JarFile.getJarEntry(JarFile.java:223) ~[?:1.8.0_265] at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1054) ~[?:1.8.0_265] at sun.misc.URLClassPath.getResource(URLClassPath.java:249) ~[?:1.8.0_265] at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[?:1.8.0_265] at java.net.URLClassLoader$1.run(URLClassLoader.java:363) ~[?:1.8.0_265] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_265] at java.net.URLClassLoader.findClass(URLClassLoader.java:362) ~[?:1.8.0_265] at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_265] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) ~[?:1.8.0_265] at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_265] at java.lang.Class.forName0(Native Method) ~[?:1.8.0_265] at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_265] at org.apache.logging.log4j.util.LoaderUtil.loadClass(LoaderUtil.java:168) ~[log4j-api-2.12.1.jar:2.12.1] at org.apache.logging.slf4j.Log4jLogger.createConverter(Log4jLogger.java:416) ~[log4j-slf4j-impl-2.12.1.jar:2.12.1] at org.apache.logging.slf4j.Log4jLogger.<init>(Log4jLogger.java:54) ~[log4j-slf4j-impl-2.12.1.jar:2.12.1] at org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:39) ~[log4j-slf4j-impl-2.12.1.jar:2.12.1] at org.apache.logging.slf4j.Log4jLoggerFactory.newLogger(Log4jLoggerFactory.java:30) ~[log4j-slf4j-impl-2.12.1.jar:2.12.1] at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:54) ~[log4j-api-2.12.1.jar:2.12.1] at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:30) ~[log4j-slf4j-impl-2.12.1.jar:2.12.1] at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:277) ~[template-common-jar-0.0.1.jar:?] at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:288) ~[template-common-jar-0.0.1.jar:?] at org.apache.flink.runtime.history.FsJobArchivist.<clinit>(FsJobArchivist.java:50) ~[flink-dist_2.11-1.11.2.jar:1.11.2] at org.apache.flink.runtime.dispatcher.JsonResponseHistoryServerArchivist.lambda$archiveExecutionGraph$0(JsonResponseHistoryServerArchivist.java:55) ~[flink-dist_2.11-1.11.2.jar:1.11.2] at org.apache.flink.util.function.ThrowingRunnable.lambda$unchecked$0(ThrowingRunnable.java:50) ~[template-common-jar-0.0.1.jar:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) ~[?:1.8.0_265] ... 3 more >>>>> And then I will get follow error log when I run a new job, unless I restart the cluster >>>>> 2021-04-25 08:54:06,711 INFO org.apache.flink.client.ClientUtils [] - Starting program (detached: true) 2021-04-25 08:54:06,715 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend [] - Using predefined options: DEFAULT. 2021-04-25 08:54:06,715 INFO org.apache.flink.contrib.streaming.state.RocksDBStateBackend [] - Using default options factory: DefaultConfigurableOptionsFactory{configuredOptions={}}. 2021-04-25 08:54:06,722 ERROR org.apache.flink.runtime.webmonitor.handlers.JarRunHandler [] - Exception occurred in REST handler: Could not execute application. >>>>>