[ 
https://issues.apache.org/jira/browse/FLINK-8900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408607#comment-16408607
 ] 

ASF GitHub Bot commented on FLINK-8900:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/5741

    [FLINK-8900] [yarn] Properly unregister application from Yarn RM

    ## What is the purpose of the change
    
    Unregisters the Flink application from Yarn if the application is shut
    down. This is required to properly show the state and final status in the
    Yarn web UI.
    
    ## Verifying this change
    
    - Manually tested
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink yarnApplicationStatus

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5741
    
----
commit 71efa75973d066268bbb3533f29da05270ef24b2
Author: Till Rohrmann <trohrmann@...>
Date:   2018-03-21T20:48:19Z

    [hotfix] Log final status and exit code under lock

commit c10e100cbf09e602415ff72043b857a1e29daf66
Author: Till Rohrmann <trohrmann@...>
Date:   2018-03-21T21:14:58Z

    [hotfix] Add FutureUtils#composeAfterwards

commit 2072210eddbb13add2b3228fd08c8550075cdfc1
Author: Till Rohrmann <trohrmann@...>
Date:   2018-03-21T21:19:28Z

    [FLINK-8900] [yarn] Properly unregister application from Yarn RM

----
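
The commits above combine two ideas: run a follow-up action once shutdown has
finished (`FutureUtils#composeAfterwards`), and use that action to unregister
the application with a final status. The following is a minimal, self-contained
sketch of that pattern, not the code from the pull request. The class
`UnregisterSketch`, the enums `AppStatus` and `YarnFinalStatus`, the helper
`runSafely`, and the status mapping are all hypothetical stand-ins chosen for
illustration; the behavior of `composeAfterwards` shown here (run the action
even if the first future failed, then surface either error) is likewise an
assumption. In a real YARN application master, the unregister step would
typically go through `AMRMClient#unregisterApplicationMaster`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

public class UnregisterSketch {

    /** Hypothetical stand-in for the application's own terminal status. */
    enum AppStatus { SUCCEEDED, FAILED, CANCELED, UNKNOWN }

    /** Hypothetical stand-in for the final status reported to YARN. */
    enum YarnFinalStatus { SUCCEEDED, FAILED, KILLED, UNDEFINED }

    /** Assumed mapping from application status to the status shown by YARN. */
    static YarnFinalStatus toYarnStatus(AppStatus status) {
        switch (status) {
            case SUCCEEDED: return YarnFinalStatus.SUCCEEDED;
            case FAILED:    return YarnFinalStatus.FAILED;
            case CANCELED:  return YarnFinalStatus.KILLED;
            default:        return YarnFinalStatus.UNDEFINED;
        }
    }

    /** Invokes the action; a synchronous throw becomes a failed future. */
    static CompletableFuture<?> runSafely(Supplier<CompletableFuture<?>> action) {
        try {
            return action.get();
        } catch (Throwable t) {
            CompletableFuture<Void> failed = new CompletableFuture<>();
            failed.completeExceptionally(t);
            return failed;
        }
    }

    /**
     * Runs the action after the given future completes, whether it completed
     * normally or exceptionally. The returned future completes once the
     * action's future is done, and fails if either step failed.
     */
    static CompletableFuture<Void> composeAfterwards(
            CompletableFuture<?> future, Supplier<CompletableFuture<?>> action) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        future.whenComplete((ignored, firstError) ->
            runSafely(action).whenComplete((ignoredToo, secondError) -> {
                if (firstError != null) {
                    // Keep the first error as primary; attach the second one.
                    if (secondError != null && secondError != firstError) {
                        firstError.addSuppressed(secondError);
                    }
                    result.completeExceptionally(firstError);
                } else if (secondError != null) {
                    result.completeExceptionally(secondError);
                } else {
                    result.complete(null);
                }
            }));
        return result;
    }

    public static void main(String[] args) {
        // Pretend the cluster components have already shut down cleanly.
        CompletableFuture<Void> shutdown = CompletableFuture.completedFuture(null);
        AppStatus status = AppStatus.SUCCEEDED;

        CompletableFuture<Void> done = composeAfterwards(shutdown, () -> {
            // A real implementation would call the YARN client here.
            System.out.println(
                "Unregistering with final status " + toYarnStatus(status));
            return CompletableFuture.completedFuture(null);
        });
        done.join();
    }
}
```

Running the unregister step as a composed action, rather than only on the
success path, means the final status is still reported when an earlier
shutdown step fails.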


> YARN FinalStatus always shows as KILLED with Flip-6
> ---------------------------------------------------
>
>                 Key: FLINK-8900
>                 URL: https://issues.apache.org/jira/browse/FLINK-8900
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Nico Kruber
>            Priority: Blocker
>              Labels: flip-6
>             Fix For: 1.5.0
>
>
> Whenever I run a simple word count like this one on YARN with Flip-6
> enabled,
> {code}
> ./bin/flink run -m yarn-cluster -yjm 768 -ytm 3072 -ys 2 -p 20 -c org.apache.flink.streaming.examples.wordcount.WordCount ./examples/streaming/WordCount.jar --input /usr/share/doc/rsync-3.0.6/COPYING
> {code}
> it shows up as {{KILLED}} in the {{State}} and {{FinalStatus}} columns,
> even though the program ran successfully, as in the following log
> (irrespective of whether FLINK-8899 occurs):
> {code}
> 2018-03-08 16:48:39,049 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job Streaming WordCount (11a794d2f5dc2955d8015625ec300c20) switched from state RUNNING to FINISHED.
> 2018-03-08 16:48:39,050 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 11a794d2f5dc2955d8015625ec300c20
> 2018-03-08 16:48:39,050 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
> 2018-03-08 16:48:39,078 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 11a794d2f5dc2955d8015625ec300c20 reached globally terminal state FINISHED.
> 2018-03-08 16:48:39,151 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register TaskManager e58efd886429e8f080815ea74ddfa734 at the SlotManager.
> 2018-03-08 16:48:39,221 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Stopping the JobMaster for job Streaming WordCount(11a794d2f5dc2955d8015625ec300c20).
> 2018-03-08 16:48:39,270 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Close ResourceManager connection 43f725adaee14987d3ff99380701f52f: JobManager is shutting down..
> 2018-03-08 16:48:39,270 INFO  org.apache.flink.yarn.YarnResourceManager                     - Disconnect job manager 00000000000000000000000000000...@akka.tcp://fl...@ip-172-31-7-0.eu-west-1.compute.internal:34281/user/jobmanager_0 for job 11a794d2f5dc2955d8015625ec300c20 from the resource manager.
> 2018-03-08 16:48:39,349 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Suspending SlotPool.
> 2018-03-08 16:48:39,349 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Stopping SlotPool.
> 2018-03-08 16:48:39,349 INFO  org.apache.flink.runtime.jobmaster.JobManagerRunner           - JobManagerRunner already shutdown.
> 2018-03-08 16:48:39,775 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register TaskManager 4e1fb6c8f95685e24b6a4cb4b71ffb92 at the SlotManager.
> 2018-03-08 16:48:39,846 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register TaskManager b5bce0bdfa7fbb0f4a0905cc3ee1c233 at the SlotManager.
> 2018-03-08 16:48:39,876 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint         - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
> 2018-03-08 16:48:39,910 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register TaskManager a35b0690fdc6ec38bbcbe18a965000fd at the SlotManager.
> 2018-03-08 16:48:39,942 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Register TaskManager 5175cabe428bea19230ac056ff2a17bb at the SlotManager.
> 2018-03-08 16:48:39,974 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:46511
> 2018-03-08 16:48:39,975 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
