Nico Kruber created FLINK-8900:
----------------------------------

             Summary: YARN FinalStatus always shows as KILLED with Flip-6
                 Key: FLINK-8900
                 URL: https://issues.apache.org/jira/browse/FLINK-8900
             Project: Flink
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.5.0, 1.6.0
            Reporter: Nico Kruber
             Fix For: 1.5.0, 1.6.0

Whenever I run a simple word count like this one on YARN with Flip-6 enabled,
{code}
./bin/flink run -m yarn-cluster -yjm 768 -ytm 3072 -ys 2 -p 20 -c org.apache.flink.streaming.examples.wordcount.WordCount ./examples/streaming/WordCount.jar --input /usr/share/doc/rsync-3.0.6/COPYING
{code}
it shows up as {{KILLED}} in the {{State}} and {{FinalStatus}} columns even though the program ran successfully, as the following log shows (irrespective of whether FLINK-8899 occurs):
{code}
2018-03-08 16:48:39,049 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job Streaming WordCount (11a794d2f5dc2955d8015625ec300c20) switched from state RUNNING to FINISHED.
2018-03-08 16:48:39,050 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 11a794d2f5dc2955d8015625ec300c20
2018-03-08 16:48:39,050 INFO org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore - Shutting down
2018-03-08 16:48:39,078 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.
2018-03-08 16:48:39,151 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager e58efd886429e8f080815ea74ddfa734 at the SlotManager.
2018-03-08 16:48:39,221 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job Streaming WordCount(11a794d2f5dc2955d8015625ec300c20).
2018-03-08 16:48:39,270 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection 43f725adaee14987d3ff99380701f52f: JobManager is shutting down..
2018-03-08 16:48:39,270 INFO org.apache.flink.yarn.YarnResourceManager - Disconnect job manager 00000000000000000000000000000...@akka.tcp://fl...@ip-172-31-7-0.eu-west-1.compute.internal:34281/user/jobmanager_0 for job 11a794d2f5dc2955d8015625ec300c20 from the resource manager.
2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Suspending SlotPool.
2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Stopping SlotPool.
2018-03-08 16:48:39,349 INFO org.apache.flink.runtime.jobmaster.JobManagerRunner - JobManagerRunner already shutdown.
2018-03-08 16:48:39,775 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager 4e1fb6c8f95685e24b6a4cb4b71ffb92 at the SlotManager.
2018-03-08 16:48:39,846 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager b5bce0bdfa7fbb0f4a0905cc3ee1c233 at the SlotManager.
2018-03-08 16:48:39,876 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2018-03-08 16:48:39,910 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager a35b0690fdc6ec38bbcbe18a965000fd at the SlotManager.
2018-03-08 16:48:39,942 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Register TaskManager 5175cabe428bea19230ac056ff2a17bb at the SlotManager.
2018-03-08 16:48:39,974 INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:46511
2018-03-08 16:48:39,975 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
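
To check other runs for the same pattern, the ordering visible in the log can be tested mechanically: the job reaches the globally terminal state FINISHED before the ClusterEntrypoint receives SIGTERM. The following is a minimal sketch, not part of Flink. It assumes log lines in the timestamp format shown in this report, and the helper names ({{first_timestamp}}, {{finished_before_sigterm}}) are hypothetical. It only confirms the ordering of the two events and says nothing about why YARN reports KILLED.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in the log excerpt of this report,
# e.g. "2018-03-08 16:48:39,049 INFO ...". Other log layouts are not handled.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s")

# Marker strings copied from the log lines in this report.
FINISHED_MARKER = "reached globally terminal state FINISHED"
SIGTERM_MARKER = "RECEIVED SIGNAL 15: SIGTERM"


def first_timestamp(log_lines, marker):
    """Return the timestamp of the first line containing `marker`, or None."""
    for line in log_lines:
        if marker in line:
            match = TIMESTAMP.match(line)
            if match:
                return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return None


def finished_before_sigterm(log_lines):
    """True if the job is logged as FINISHED strictly before SIGTERM arrives."""
    finished = first_timestamp(log_lines, FINISHED_MARKER)
    sigterm = first_timestamp(log_lines, SIGTERM_MARKER)
    return finished is not None and sigterm is not None and finished < sigterm


# Two lines taken verbatim from the log in this report.
SAMPLE = [
    "2018-03-08 16:48:39,078 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - "
    "Job 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.",
    "2018-03-08 16:48:39,876 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - "
    "RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.",
]

if __name__ == "__main__":
    print(finished_before_sigterm(SAMPLE))
```

For a real run, the list of lines could come from the JobManager log file, for example {{open(path).read().splitlines()}}, where {{path}} is wherever the log was collected.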