Ferenc Erdelyi created YARN-11709:
-------------------------------------
Summary: NodeManager should be shut down or blacklisted when it
cannot run program "/var/lib/yarn-ce/bin/container-executor"
Key: YARN-11709
URL: https://issues.apache.org/jira/browse/YARN-11709
Project: Hadoop YARN
Issue Type: Improvement
Components: container-executor
Reporter: Ferenc Erdelyi
When NodeManager encounters the "No such file or directory" error shown below
for the "container-executor" binary, it should stop participating in the
cluster. In this state it cannot run any containers, so every job scheduled on
it simply fails.
{code:java}
2023-01-18 10:08:10,600 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_e159_1673543180101_9407_02_000014 startLocalizer is : -1
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: java.io.IOException: Cannot run program "/var/lib/yarn-ce/bin/container-executor": error=2, No such file or directory
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:183)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:403)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1250)
Caused by: java.io.IOException: Cannot run program "/var/lib/yarn-ce/bin/container-executor": error=2, No such file or directory
{code}
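As a rough illustration of the requested behaviour, the condition could be detected with a plain file-system check before any container is launched. The sketch below is not existing NodeManager code: the class ContainerExecutorCheck, the Health enum, and the check method are hypothetical names, and only standard java.nio calls are used. How the result would be wired into NodeManager's health reporting or shutdown logic is left open.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of YARN: reports whether the
// container-executor binary is present and executable.
class ContainerExecutorCheck {

    // Hypothetical health states used only in this sketch.
    enum Health { HEALTHY, UNHEALTHY }

    // Returns HEALTHY only if the path is a regular file that the
    // current process is allowed to execute.
    static Health check(Path executor) {
        if (Files.isRegularFile(executor) && Files.isExecutable(executor)) {
            return Health.HEALTHY;
        }
        return Health.UNHEALTHY;
    }

    public static void main(String[] args) {
        // Default path taken from the error message in this issue.
        Path executor = Paths.get(
            args.length > 0 ? args[0] : "/var/lib/yarn-ce/bin/container-executor");
        System.out.println(executor + " -> " + check(executor));
    }
}
```

A node that reports UNHEALTHY here would be a candidate for shutting down or being excluded from scheduling, instead of accepting containers that are certain to fail.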