[ https://issues.apache.org/jira/browse/FLINK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427927#comment-15427927 ]
ASF GitHub Bot commented on FLINK-4348:
---------------------------------------

GitHub user beyond1920 opened a pull request:

    https://github.com/apache/flink/pull/2389

    Jira FLINK-4348

    This pull request implements the communication from ResourceManager (RM)
    to TaskManager (TM), which is tracked as FLINK-4348.
    There are three main interactions initiated by the RM towards the TM:
    1. Heartbeat: the RM uses heartbeats to sync with the TM's slot status.
    2. Slot request: when the RM decides to assign a slot to the JM, it should
       first send a slot request to the TM. The TM can either accept or reject
       this request.
    3. FailureNotify: if the RM fails to reach the TM by heartbeat several
       times, it marks the TM as failed. In addition, in some corner cases the
       TM is marked as invalid by the cluster manager master (e.g. the YARN
       master) without the TM itself noticing. The RM should send a failure
       notification to the TM so that the TM can terminate itself.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alibaba/flink jira-4345

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2389

----
commit 937c57c93894d271a614131cc77f0a8a7e33ab37
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-15T08:05:45Z

    from stephon's uncommitted pull request
    modified: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/ResourceManager.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/TaskExecutorRegistrationResponse.java
    modified: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/ResourceManager.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/TaskExecutorRegistrationResponse.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/taskexecutor/SlotReport.java

commit 073b65c35fed4e64faedaa653b7b01b532531990
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-15T08:34:34Z

    request slot from slotManager and offer slot to slotManager
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/types/AllocationJobID.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/types/ResourceProfile.java
    modified: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/ResourceManager.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/SlotManager.java
    modified: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/SlotRequest.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/StandaloneSlotManager.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/YarnSlotManager.java
    new file: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/taskexecutor/RequestSlotResponse.java
    modified: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/taskexecutor/TaskExecutorGateway.java

commit 5e50d0844f9b04f2bf89a5fc47cd0248d624ff6f
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-15T10:32:03Z

    hearbeat response from tm to rm

commit ecf14e42a4626725f6a7aba28af4b2d0d83c2f18
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-16T08:22:20Z

    update the heartbeat api and response message

commit b815134933978620420874642df0e629ce35f8e6
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-17T02:13:40Z

    Merge branch 'flip-6' of https://github.com/apache/flink into jira-4345

commit b55f7843af50e2a4ad85dbfdbb1fedc5985f1ff3
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-17T02:29:39Z

    add heartbeat manager

commit c838912c36381ee13c4c6871e6ea76092ee6e4fa
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-17T08:45:43Z

    update resourceManager and add LeaderContender subclass

commit 196ac2bb5943b58bb12aceb214fa4168aafed29f
Author: beyond1920 <beyond1...@126.com>
Date:   2016-08-19T09:14:45Z

    add test to ResourceManagerToTaskExecutorHeartbeatScheduler

----

> implement communication from ResourceManager to TaskManager
> -----------------------------------------------------------
>
>                 Key: FLINK-4348
>                 URL: https://issues.apache.org/jira/browse/FLINK-4348
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Kurt Young
>            Assignee: zhangjing
>
> There are three main interactions initiated by the RM towards the TM:
> * Heartbeat: the RM uses heartbeats to sync with the TM's slot status.
> * SlotRequest: when the RM decides to assign a slot to the JM, it should
> first send a slot request to the TM. The TM can either accept or reject
> this request.
> * FailureNotify: in some corner cases, the TM is marked as invalid by the
> cluster manager master (e.g. the YARN master) without the TM itself
> noticing. The RM should send a failure notification to the TM so that the
> TM can terminate itself.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
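
For readers following the thread, the three RM-to-TM interactions listed in the issue description (heartbeat, slot request, failure notification) can be sketched roughly as below. This is only an illustrative toy under stated assumptions: `RmToTmSketch`, `TaskManagerGateway`, `ToyTaskManager`, `HeartbeatMonitor`, and every method name and parameter are hypothetical and are not taken from the pull request or from Flink's actual classes. The missed-heartbeat threshold is likewise an assumed parameter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these types are Flink's actual classes. */
final class RmToTmSketch {

    /** The three RM-initiated calls named in the issue, as an assumed gateway. */
    interface TaskManagerGateway {
        /** Heartbeat: the TM answers with its current slot status (slot index -> job id). */
        Map<Integer, String> heartbeat();

        /** Slot request: the TM may accept (true) or reject (false). */
        boolean requestSlot(int slotIndex, String jobId);

        /** Failure notification: the TM is expected to terminate itself. */
        void notifyFailure(String reason);
    }

    /** Toy in-memory TM with a fixed number of slots. */
    static final class ToyTaskManager implements TaskManagerGateway {
        private final int numberOfSlots;
        private final Map<Integer, String> allocated = new HashMap<>();
        private boolean terminated;

        ToyTaskManager(int numberOfSlots) {
            this.numberOfSlots = numberOfSlots;
        }

        @Override
        public Map<Integer, String> heartbeat() {
            // Return a copy so the caller cannot mutate the TM's state.
            return new HashMap<>(allocated);
        }

        @Override
        public boolean requestSlot(int slotIndex, String jobId) {
            boolean outOfRange = slotIndex < 0 || slotIndex >= numberOfSlots;
            if (terminated || outOfRange || allocated.containsKey(slotIndex)) {
                return false; // reject: TM gone, unknown slot, or slot already taken
            }
            allocated.put(slotIndex, jobId);
            return true;
        }

        @Override
        public void notifyFailure(String reason) {
            // In this toy, "terminating" just frees all slots and sets a flag.
            allocated.clear();
            terminated = true;
        }

        boolean isTerminated() {
            return terminated;
        }
    }

    /** Toy RM-side counter of consecutive missed heartbeats. */
    static final class HeartbeatMonitor {
        private final int maxMissed;
        private int missed;

        HeartbeatMonitor(int maxMissed) {
            this.maxMissed = maxMissed;
        }

        /** Records a miss; returns true once the TM should be marked failed. */
        boolean recordMiss() {
            missed++;
            return missed >= maxMissed;
        }

        /** A successful heartbeat resets the counter. */
        void recordSuccess() {
            missed = 0;
        }
    }

    public static void main(String[] args) {
        ToyTaskManager tm = new ToyTaskManager(2);
        System.out.println("first request accepted:  " + tm.requestSlot(0, "job-a"));
        System.out.println("second request accepted: " + tm.requestSlot(0, "job-b"));
        System.out.println("slot status via heartbeat: " + tm.heartbeat());
        tm.notifyFailure("marked invalid by cluster manager");
        System.out.println("terminated: " + tm.isTerminated());
    }
}
```

The sketch uses plain synchronous calls to keep the accept/reject and mark-failed logic visible; a real RPC layer would presumably be asynchronous, which this toy does not attempt to model.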