Ferenc Erdelyi created YARN-11590:
-------------------------------------
Summary: RM process stuck after confStore.format() when ZK SSL/TLS
is enabled, as netty thread waits indefinitely
Key: YARN-11590
URL: https://issues.apache.org/jira/browse/YARN-11590
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Ferenc Erdelyi
YARN-11468 enabled ZooKeeper SSL/TLS support for YARN.
Curator uses ClientCnxnSocketNetty for secured connections, and the thread needs
to be closed with confStore.close() after calling confStore.format(). Otherwise
the netty thread waits indefinitely, which renders the RM unresponsive after
deleting the confstore when started with the "-format-conf-store" arg.
The unclosed thread, which keeps the RM running:
{code:java}
2023-10-10 12:13:01,000 INFO
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: The
Thread[main-SendThread(ferdelyi-1.ferdelyi.root.hwx.site:2182),5,main]TIMED_WAITING
is stands at [sun.misc.Unsafe.park(Native Method),
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215),
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078),
java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522),
java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684),
org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:275),
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)]
{code}
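A minimal, self-contained sketch of the format-then-close pattern described above. It does not use the Hadoop, Curator, or ZooKeeper APIs: ConfStore, FakeZkConfStore, and formatAndClose are hypothetical names introduced only for illustration, and the fake store's background thread merely imitates a client send thread that polls a queue. The point is that close() runs in a finally block, so the non-daemon thread is stopped even if format() throws.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

class FormatConfStoreSketch {

  /** Hypothetical stand-in for a configuration store; NOT the Hadoop API. */
  interface ConfStore extends AutoCloseable {
    void format() throws Exception;

    @Override
    void close() throws Exception;
  }

  /** Fake store whose non-daemon thread imitates a client send thread polling a queue. */
  static class FakeZkConfStore implements ConfStore {
    private final LinkedBlockingDeque<String> outgoing = new LinkedBlockingDeque<>();
    private volatile boolean running = true;
    private final Thread sendThread = new Thread(() -> {
      while (running) {
        try {
          // Timed wait on the queue, similar in shape to the stack trace above.
          outgoing.poll(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }, "fake-SendThread");

    FakeZkConfStore() {
      // Non-daemon by default: the JVM cannot exit while this thread is alive.
      sendThread.start();
    }

    @Override
    public void format() {
      // Pretend to delete the stored configuration.
      outgoing.clear();
    }

    @Override
    public void close() throws InterruptedException {
      running = false;
      sendThread.interrupt();
      sendThread.join(2000);
    }

    boolean isSendThreadAlive() {
      return sendThread.isAlive();
    }
  }

  /** Formats the store and always closes it afterwards, even if format() fails. */
  static void formatAndClose(ConfStore store) throws Exception {
    try {
      store.format();
    } finally {
      store.close();
    }
  }

  public static void main(String[] args) throws Exception {
    FakeZkConfStore store = new FakeZkConfStore();
    formatAndClose(store);
    System.out.println("send thread alive after close: " + store.isSendThreadAlive());
  }
}
```

If the close() call in the finally block is removed, the fake send thread keeps polling and the JVM does not exit after main returns, which is the same symptom as the stuck RM reported here.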