Sudipta Laha created CASSSIDECAR-390:
----------------------------------------

             Summary: Deadlock during JMX reconnection in sidecar
                 Key: CASSSIDECAR-390
                 URL: https://issues.apache.org/jira/browse/CASSSIDECAR-390
             Project: Sidecar for Apache Cassandra
          Issue Type: Bug
          Components: Rest API
            Reporter: Sudipta Laha


A specific condition in the sidecar causes a deadlock during JMX reconnection.
The deadlock occurs in the ClientCommunicatorAdmin.restart method under the
following scenario:

 * The JMX connection undergoes reconnection.
 * A notification handler for connection status changes executes JMX calls
during reconnection.
 * The JMX call fails with an IOException.

The failed call re-enters ClientCommunicatorAdmin.restart on the same thread
that is performing the reconnection (restart appears twice in the first stack
trace below), where it waits for the reconnection to finish. As a result, any
thread attempting to access the JMX connection is blocked.
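
The scenario above can be reproduced in miniature. The sketch below is a hypothetical, simplified model and not the JDK's ClientCommunicatorAdmin code: the class name, the state enum, and the listener hook are assumptions made for illustration. It only shows why a handler that calls restart again on the restarting thread cannot make progress. A timeout is added to the wait so that the demo terminates; the wait seen in the stack traces has no such timeout.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical, simplified model; not the JDK's ClientCommunicatorAdmin code.
class ReentrantRestartModel {
    enum State { CONNECTED, RE_CONNECTING }

    private final Object lock = new Object();
    private State state = State.CONNECTED;
    // Stands in for a connection-status notification handler.
    private final Consumer<ReentrantRestartModel> listener;

    ReentrantRestartModel(Consumer<ReentrantRestartModel> listener) {
        this.listener = listener;
    }

    // Returns true once the state is CONNECTED again, false if the caller gave
    // up waiting. The timeout exists only so that this demo terminates.
    boolean restart(long maxWaitMillis) {
        synchronized (lock) {
            if (state == State.RE_CONNECTING) {
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
                while (state == State.RE_CONNECTING) {
                    long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMillis <= 0) {
                        return false;
                    }
                    try {
                        lock.wait(remainingMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return false;
                    }
                }
                return true;
            }
            state = State.RE_CONNECTING;
        }
        // The handler runs on the restarting thread, before the state is reset.
        listener.accept(this);
        synchronized (lock) {
            state = State.CONNECTED;
            lock.notifyAll();
        }
        return true;
    }

    // Runs the scenario: the handler's nested restart gives up waiting,
    // and only then can the outer restart complete.
    static boolean nestedRestartTimesOut() {
        boolean[] nestedResult = { true };
        ReentrantRestartModel model =
            new ReentrantRestartModel(m -> nestedResult[0] = m.restart(100));
        boolean outerResult = model.restart(100);
        return outerResult && !nestedResult[0];
    }

    public static void main(String[] args) {
        System.out.println("nested restart timed out: " + nestedRestartTimesOut());
    }
}
```

In this model the nested call waits for a state change that only its own caller can make, so without the timeout it would wait forever.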

 

Here is a stack trace for the deadlocked thread:

{code:java}
"JMX client heartbeat 5" #301 daemon prio=5 os_prio=0 cpu=516.04ms elapsed=414591.85s tid=0x00007f4f0473f030 nid=0x5bb1 in Object.wait()  [0x00007f4d543fd000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait([email protected]/Native Method)
    - waiting on <no object reference available>
    at java.lang.Object.wait([email protected]/Object.java:328)
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart([email protected]/ClientCommunicatorAdmin.java:107)
    - waiting to re-lock in wait() <0x00000006041e54e8> (a [I)
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException([email protected]/ClientCommunicatorAdmin.java:59)
    at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException([email protected]/RMIConnector.java:1497)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute([email protected]/RMIConnector.java:908)
    at javax.management.MBeanServerInvocationHandler.invoke([email protected]/MBeanServerInvocationHandler.java:273)
    at com.sun.proxy.$Proxy71.getTokens(Unknown Source)
    at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.maybeGetTokens(CassandraAdapterDelegate.java:329)
    at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.newNodeSettingsFromJmx(CassandraAdapterDelegate.java:305)
    at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate.jmxHealthCheck(CassandraAdapterDelegate.java:211)
    at org.apache.cassandra.sidecar.cluster.CassandraAdapterDelegate$JmxNotificationListener.handleNotification(CassandraAdapterDelegate.java:579)
    at org.apache.cassandra.sidecar.common.server.JmxClient.lambda$forwardNotification$0(JmxClient.java:242)
    at org.apache.cassandra.sidecar.common.server.JmxClient$$Lambda$1978/0x0000000800edb440.accept(Unknown Source)
    at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach([email protected]/ConcurrentHashMap.java:4698)
    at java.util.Collections$SetFromMap.forEach([email protected]/Collections.java:5581)
    at org.apache.cassandra.sidecar.common.server.JmxClient.forwardNotification(JmxClient.java:242)
    at org.apache.cassandra.sidecar.common.server.JmxClient.handleNotification(JmxClient.java:235)
    at javax.management.NotificationBroadcasterSupport.handleNotification([email protected]/NotificationBroadcasterSupport.java:275)
    at javax.management.NotificationBroadcasterSupport$SendNotifJob.run([email protected]/NotificationBroadcasterSupport.java:352)
    at javax.management.NotificationBroadcasterSupport$1.execute([email protected]/NotificationBroadcasterSupport.java:337)
    at javax.management.NotificationBroadcasterSupport.sendNotification([email protected]/NotificationBroadcasterSupport.java:248)
    at javax.management.remote.rmi.RMIConnector.sendNotification([email protected]/RMIConnector.java:442)
    at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.doStart([email protected]/RMIConnector.java:1670)
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart([email protected]/ClientCommunicatorAdmin.java:132)
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException([email protected]/ClientCommunicatorAdmin.java:59)
    at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException([email protected]/RMIConnector.java:1497)
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin$Checker.run([email protected]/ClientCommunicatorAdmin.java:204)
    at java.lang.Thread.run([email protected]/Thread.java:829)
{code}
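
The trace above shows the notification handler running on the same thread that is inside restart. One possible mitigation, which this report does not confirm as the chosen fix, is to hand notifications to a separate executor so that the handler's JMX calls never run on the reconnecting thread. The sketch below assumes a hypothetical wrapper class, AsyncNotificationListener, that is not part of the sidecar code base; it uses only the standard javax.management.NotificationListener interface and java.util.concurrent executors.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import javax.management.Notification;
import javax.management.NotificationListener;

// Hypothetical wrapper; not part of the sidecar code base.
class AsyncNotificationListener implements NotificationListener {
    private final NotificationListener delegate;
    private final Executor executor;

    AsyncNotificationListener(NotificationListener delegate, Executor executor) {
        this.delegate = delegate;
        this.executor = executor;
    }

    @Override
    public void handleNotification(Notification notification, Object handback) {
        // Hand off and return immediately, so the thread that delivered the
        // notification is not blocked by work done in the delegate.
        executor.execute(() -> delegate.handleNotification(notification, handback));
    }

    // Demo: returns true if the delegate ran on a thread other than the caller's.
    static boolean delegateRunsOffCallerThread() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Thread caller = Thread.currentThread();
            AtomicReference<Thread> handlerThread = new AtomicReference<>();
            CountDownLatch done = new CountDownLatch(1);
            NotificationListener delegate = (notification, handback) -> {
                handlerThread.set(Thread.currentThread());
                done.countDown();
            };
            new AsyncNotificationListener(delegate, executor)
                .handleNotification(new Notification("example.type", "example-source", 1L), null);
            return done.await(5, TimeUnit.SECONDS) && handlerThread.get() != caller;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("delegate ran off caller thread: " + delegateRunsOffCallerThread());
    }
}
```

With this arrangement a JMX call made by the handler could still fail or wait while a reconnection is in progress, but it would do so on the executor's thread, leaving the reconnecting thread free to finish.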
Stack trace for a blocked thread:

{code:java}
[vertx-blocked-thread-checker] io.vertx.core.impl.BlockedThreadChecker - Thread Thread[sidecar-internal-worker-pool-12,5,main] has been blocked for 44266107 ms, time limit is 300000 ms
io.vertx.core.VertxException: Thread blocked
    at java.lang.Object.wait(Native Method) ~[?:?]
    at java.lang.Object.wait(Object.java:328) ~[?:?]
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:107) ~[?:?]
    at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:59) ~[?:?]
    at javax.management.remote.rmi.RMIConnector$RMIClientCommunicatorAdmin.gotIOException(RMIConnector.java:1497) ~[?:?]
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:1027) ~[?:?]
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:298) ~[?:?]
    at com.sun.proxy.$Proxy89.importNewSSTables(Unknown Source) ~[?:?]
    at org.apache.cassandra.sidecar.adapters.base.CassandraTableOperations.importNewSSTables(CassandraTableOperations.java:57) ~[adapters-base-1.0.0.111-aci-cassandra.jar:?]
    at org.apache.cassandra.sidecar.utils.SSTableImporter.drainImportQueue(SSTableImporter.java:240) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
    at org.apache.cassandra.sidecar.utils.SSTableImporter.maybeDrainImportQueue(SSTableImporter.java:188) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
    at org.apache.cassandra.sidecar.utils.SSTableImporter.lambda$processPendingImports$1(SSTableImporter.java:170) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
    at org.apache.cassandra.sidecar.utils.SSTableImporter$$Lambda$1924/0x0000000800eac440.run(Unknown Source) ~[?:?]
    at org.apache.cassandra.sidecar.concurrent.TaskExecutorPool.lambda$runBlocking$4(TaskExecutorPool.java:198) ~[cassandra-sidecar-1.0.0.111-aci-cassandra.jar:?]
    at org.apache.cassandra.sidecar.concurrent.TaskExecutorPool$$Lambda$627/0x0000000800a57040.call(Unknown Source) ~[?:?]
    at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$0(ContextImpl.java:178) ~[vertx-core-4.5.7.jar:4.5.7]
    at io.vertx.core.impl.ContextImpl$$Lambda$465/0x0000000800978840.handle(Unknown Source) ~[?:?]
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:279) ~[vertx-core-4.5.7.jar:4.5.7]
    at io.vertx.core.impl.ContextImpl.lambda$internalExecuteBlocking$2(ContextImpl.java:210) ~[vertx-core-4.5.7.jar:4.5.7]
    at io.vertx.core.impl.ContextImpl$$Lambda$466/0x0000000800979040.run(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.111.Final.jar:4.1.111.Final]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]
{code}


