Ferenc Erdelyi created YARN-11590:
-------------------------------------

             Summary: RM process stuck after confStore.format() when ZK SSL/TLS 
is enabled, as netty thread waits indefinitely
                 Key: YARN-11590
                 URL: https://issues.apache.org/jira/browse/YARN-11590
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
            Reporter: Ferenc Erdelyi


YARN-11468 enabled ZooKeeper SSL/TLS support for YARN.
With SSL/TLS enabled, Curator uses ClientCnxnSocketNetty for the secured 
connection. After calling confStore.format(), the store must be closed with 
confStore.close(); otherwise the netty thread waits indefinitely, which leaves 
the RM process stuck and unresponsive after it deletes the confstore when 
started with the "-format-conf-store" arg.
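
The pattern described above can be sketched as follows. This is only a minimal, 
self-contained illustration under stated assumptions, not the actual 
ResourceManager code or the eventual patch: ConfStore, FakeZkConfStore and 
formatAndClose are hypothetical names, and the fake store merely imitates a 
client whose background send thread polls a queue until it is closed.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

public class FormatConfStoreSketch {

  /** Hypothetical stand-in for the RM configuration store; not the Hadoop API. */
  interface ConfStore extends AutoCloseable {
    void format() throws Exception;

    @Override
    void close() throws Exception;
  }

  /** Fake store that owns a non-daemon polling thread, loosely imitating a ZK client. */
  static class FakeZkConfStore implements ConfStore {
    private final LinkedBlockingDeque<String> outgoing = new LinkedBlockingDeque<>();
    private volatile boolean running = true;
    private volatile boolean formatted = false;

    // Polls with a timeout in a loop, so it never ends unless someone stops it.
    private final Thread sendThread = new Thread(() -> {
      while (running) {
        try {
          outgoing.poll(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
          return; // close() interrupts the thread to end it promptly
        }
      }
    }, "fake-SendThread");

    FakeZkConfStore() {
      sendThread.start();
    }

    @Override
    public void format() {
      formatted = true; // the real store would delete the stored configuration here
    }

    @Override
    public void close() throws InterruptedException {
      running = false;
      sendThread.interrupt();
      sendThread.join(2000); // wait for the background thread to finish
    }

    boolean isFormatted() {
      return formatted;
    }

    boolean isSendThreadAlive() {
      return sendThread.isAlive();
    }
  }

  /** Formats the store and always closes it, even if format() throws. */
  static void formatAndClose(ConfStore store) throws Exception {
    try {
      store.format();
    } finally {
      store.close();
    }
  }

  public static void main(String[] args) throws Exception {
    FakeZkConfStore store = new FakeZkConfStore();
    formatAndClose(store);
    System.out.println("send thread alive after close: " + store.isSendThreadAlive());
  }
}
```

Without the close() call in the finally block, the fake send thread would keep 
polling and the JVM in this sketch would not exit after main() returns, which 
is the same kind of symptom as reported here.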

The unclosed thread, which keeps the RM running:
{code:java}
2023-10-10 12:13:01,000 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: The Thread[main-SendThread(ferdelyi-1.ferdelyi.root.hwx.site:2182),5,main]TIMED_WAITING is stands at [
  sun.misc.Unsafe.park(Native Method),
  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215),
  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078),
  java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522),
  java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684),
  org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:275),
  org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289)]
{code}



