Aswin M Prabhu created YARN-11718:
-------------------------------------
Summary: Provide config option to not shutdown NM if it is
decommissioned
Key: YARN-11718
URL: https://issues.apache.org/jira/browse/YARN-11718
Project: Hadoop YARN
Issue Type: New Feature
Components: resourcemanager
Reporter: Aswin M Prabhu
Currently, an NM cannot be started if it is marked as decommissioned on the RM
(in the exclude list), because the RM sends a SHUTDOWN signal when the NM tries
to send a heartbeat after starting up:
{code:java}
// Check if this node is a 'valid' node
if (!this.nodesListManager.isValidNode(host) &&
    !isNodeInDecommissioning(nodeId)) {
  String message =
      "Disallowed NodeManager from " + host
          + ", Sending SHUTDOWN signal to the NodeManager.";
  LOG.info(message);
  response.setDiagnosticsMessage(message);
  response.setNodeAction(NodeAction.SHUTDOWN);
  return response;
}
{code}
This couples the start/stop operations of the NM service very tightly with its
state in the RM, making it difficult to manage large fleets of NMs
independently of the RM.
For example, with this option, after an NM OS upgrade we could start the NM,
recommission it, and then check its state without worrying about the order of
the start and recommission operations. This matters especially when we don't
control the start operation, as in large companies where it is part of the OS
upgrade pipeline.
The patch will look something like this:
{code:java}
// Check if this node is a 'valid' node
if (!this.nodesListManager.isValidNode(host) &&
    !isNodeInDecommissioning(nodeId) &&
    !this.noNMShutdownForInvalidNodes) {
  String message =
      "Disallowed NodeManager from " + host
          + ", Sending SHUTDOWN signal to the NodeManager.";
  LOG.info(message);
  response.setDiagnosticsMessage(message);
  response.setNodeAction(NodeAction.SHUTDOWN);
  return response;
}
{code}
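The flag in the patch would have to come from RM configuration. Below is a rough, standalone sketch of that wiring, using plain Java types rather than the real Hadoop classes. The class name, the `shouldSendShutdown` method, and the property key are all hypothetical, since this issue does not name them. The default is assumed to be false so that the existing SHUTDOWN behavior is kept unless an operator opts in.

```java
import java.util.Properties;

// Sketch only: models the RM-side check with plain Java, not the Hadoop API.
final class DecommissionedNodePolicy {

  // Hypothetical property name; the issue does not specify one.
  static final String NO_NM_SHUTDOWN_KEY =
      "yarn.resourcemanager.no-nm-shutdown-for-invalid-nodes";

  private final boolean noNMShutdownForInvalidNodes;

  DecommissionedNodePolicy(Properties conf) {
    // Assumed default of false keeps today's behavior (send SHUTDOWN).
    this.noNMShutdownForInvalidNodes =
        Boolean.parseBoolean(conf.getProperty(NO_NM_SHUTDOWN_KEY, "false"));
  }

  // Mirrors the condition in the proposed patch: SHUTDOWN is sent only when
  // the node is not valid, is not decommissioning, and the option is off.
  boolean shouldSendShutdown(boolean isValidNode, boolean isDecommissioning) {
    return !isValidNode && !isDecommissioning && !noNMShutdownForInvalidNodes;
  }
}
```

With the option unset, an excluded node that is not decommissioning still gets SHUTDOWN. With it set to true, the check is skipped and the RM continues with its normal handling of the request.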