[ https://issues.apache.org/jira/browse/FLINK-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679502#comment-16679502 ]
Till Rohrmann commented on FLINK-10818:
---------------------------------------

Could you check whether your Yarn cluster actually had the required resources? If other jobs are running in your cluster, they may have taken the required resources.

Moreover, could you check whether the problem also occurs with Flink {{1.6.2}} and the new mode (not legacy)?

> RestartStrategies.fixedDelayRestart Occur NoResourceAvailableException: Not enough free slots available to run the job.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10818
>                 URL: https://issues.apache.org/jira/browse/FLINK-10818
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.6.2
>         Environment: JDK 1.8
> Flink 1.6.0
> Hadoop 2.7.3
>            Reporter: ambition
>            Priority: Major
>
> Our job runs in our online (production) Flink on Yarn environment. The code sets the restart strategy like this:
> {code:java}
> exeEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(5, 1000L));
> {code}
> But after the job had been running for some days, this exception occurred:
> {code:java}
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not enough free slots available to run the job. You can decrease the operator parallelism or increase the number of slots per TaskManager in the configuration. Task to schedule: < Attempt #5 (Source: KafkaJsonTableSource -> Map -> where: (AND(OR(=(app_key, _UTF-16LE'C4FAF9CE1569F541'), =(app_key, _UTF-16LE'F5C7F68C7117630B'), =(app_key, _UTF-16LE'57C6FF4B5A064D29')), OR(=(LOWER(TRIM(FLAG(BOTH), _UTF-16LE' ', os_type)), _UTF-16LE'ios'), =(LOWER(TRIM(FLAG(BOTH), _UTF-16LE' ', os_type)), _UTF-16LE'android')), IS NOT NULL(server_id))), select: (MT_Date_Format_Mode(receive_time, _UTF-16LE'yyyyMMddHHmm', 10) AS date_p, LOWER(TRIM(FLAG(BOTH), _UTF-16LE' ', os_type)) AS os_type, MT_Date_Format_Mode(receive_time, _UTF-16LE'HHmm', 10) AS date_mm, server_id) (1/6)) @ (unassigned) - [SCHEDULED] > with groupID < cbc357ccb763df2852fee8c4fc7d55f2 > in sharing group < 690dbad267a8ff37c8cb5e9dbedd0a6d >. Resources available to scheduler: Number of instances=6, total number of slots=6, available slots=0
> 	at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:281)
> 	at org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:155)
> 	at org.apache.flink.runtime.executiongraph.Execution.lambda$allocateAndAssignSlotForExecution$2(Execution.java:491)
> 	at org.apache.flink.runtime.executiongraph.Execution$$Lambda$44/1664178385.apply(Unknown Source)
> 	at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981)
> 	at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2116)
> 	at org.apache.flink.runtime.executiongraph.Execution.allocateAndAssignSlotForExecution(Execution.java:489)
> 	at org.apache.flink.runtime.executiongraph.ExecutionJobVertex.allocateResourcesForAll(ExecutionJobVertex.java:521)
> 	at org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleEager(ExecutionGraph.java:945)
> 	at org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:875)
> 	at org.apache.flink.runtime.executiongraph.ExecutionGraph.restart(ExecutionGraph.java:1262)
> 	at org.apache.flink.runtime.executiongraph.restart.ExecutionGraphRestartCallback.triggerFullRecovery(ExecutionGraphRestartCallback.java:59)
> 	at org.apache.flink.runtime.executiongraph.restart.FixedDelayRestartStrategy$1.run(FixedDelayRestartStrategy.java:68)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
>
> This exception happened when the job started. The issue links to https://issues.apache.org/jira/browse/FLINK-4486
>
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
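
The scheduler figures in the exception ("Number of instances=6, total number of slots=6, available slots=0") can be read as simple slot arithmetic. The following is a minimal sketch of that arithmetic only. `SlotCheck` and its methods are hypothetical helpers, not Flink APIs. The one-slot-per-TaskManager figure and the required parallelism of 6 are assumptions inferred from the message (6 instances with 6 slots in total, and subtask "(1/6)"), not values confirmed in the report.

```java
// Hypothetical helper, NOT part of Flink: it only models the slot arithmetic
// behind the "Not enough free slots available" message.
public class SlotCheck {

    /** Total slots = number of TaskManagers x slots per TaskManager. */
    public static int totalSlots(int taskManagers, int slotsPerTaskManager) {
        return taskManagers * slotsPerTaskManager;
    }

    /** A (re)start can only be scheduled if enough slots are currently free. */
    public static boolean canSchedule(int requiredSlots, int availableSlots) {
        return requiredSlots <= availableSlots;
    }

    public static void main(String[] args) {
        // From the exception: "Number of instances=6, total number of slots=6".
        // Assumption: every TaskManager offers the same number of slots, i.e. 1.
        int total = totalSlots(6, 1);

        // From the exception: "available slots=0".
        int available = 0;

        // Assumption: the job needs 6 slots, inferred from subtask "(1/6)"
        // with all operators in one slot sharing group.
        int required = 6;

        System.out.println("total=" + total
                + ", available=" + available
                + ", canSchedule=" + canSchedule(required, available));
    }
}
```

Under these assumptions the cluster is large enough in total, but no slot is free at the moment the restart is attempted, which is consistent with the question of whether the slots were still held or had been taken by something else.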