[ https://issues.apache.org/jira/browse/FLINK-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ryantaocer updated FLINK-11177:
-------------------------------
    Description:
* A NodeHealthManager module monitors the health of nodes, including machine state and task manager workload.
* It goes beyond the general blacklist mechanism, since blacklisting is an extreme case.
* It provides runtime metrics to the scheduler in the JM, which can then do more, such as load balancing, skipping bad TMs, and even scoring slots.

  was:
* A NodeHealthManager module to monitor the status of nodes, including machine states and task manager workload.
* The scheduler can do more with the info, such as load balancing, avoiding bad nodes, and even scoring the slots.

> Node health manager
> -------------------
>
>                 Key: FLINK-11177
>                 URL: https://issues.apache.org/jira/browse/FLINK-11177
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: ryantaocer
>            Assignee: ryantaocer
>            Priority: Major
>
> * A NodeHealthManager module monitors the health of nodes, including
> machine state and task manager workload.
> * It goes beyond the general blacklist mechanism, since blacklisting is an
> extreme case.
> * It provides runtime metrics to the scheduler in the JM, which can then do
> more, such as load balancing, skipping bad TMs, and even scoring slots.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)