[ https://issues.apache.org/jira/browse/FLINK-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17412349#comment-17412349 ]
xmarker commented on FLINK-24213:
---------------------------------

The deadlock seems to occur in the following scenario:

1. When establishedConnection fails to handle data, the call stack in the Netty bootstrap thread is:
establishedConnection.onFailure() -> establishedConnection.close() -> hold establishedConnection.lock -> when the channel is closed: hold serverConnection.connectionLock

2. On client.shutdown(), the call stack in the main thread is:
serverConnection.close() -> hold serverConnection.connectionLock -> establishedConnection.close() -> hold establishedConnection.lock

So the two threads acquire the same two locks in opposite order, each waiting for the lock the other already holds, and a deadlock occurs.

EstablishedConnection.lock only protects run and requestCount, so a lock may be too heavyweight here. Could EstablishedConnection.requestCount use an AtomicLong and EstablishedConnection.run use an AtomicBoolean to avoid the deadlock?

> Java deadlock in QueryableState ClientTest
> ------------------------------------------
>
>                 Key: FLINK-24213
>                 URL: https://issues.apache.org/jira/browse/FLINK-24213
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Queryable State
>    Affects Versions: 1.15.0
>            Reporter: Dawid Wysakowicz
>            Priority: Major
>              Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23750&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d&l=15476
> {code}
> Found one Java-level deadlock:
> Sep 08 11:12:50 =============================
> Sep 08 11:12:50 "Flink Test Client Event Loop Thread 0":
> Sep 08 11:12:50 waiting to lock monitor 0x00007f4e380309c8 (object 0x0000000086b2cd50, a java.lang.Object),
> Sep 08 11:12:50 which is held by "main"
> Sep 08 11:12:50 "main":
> Sep 08 11:12:50 waiting to lock monitor 0x00007f4ea4004068 (object 0x0000000086b2cf50, a java.lang.Object),
> Sep 08 11:12:50 which is held by "Flink Test Client Event Loop Thread 0"
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
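
For illustration only, the lock-free idea raised in the comment (an AtomicLong for requestCount and an AtomicBoolean for the run flag) might look roughly like the sketch below. This is not the actual Flink code: the class name EstablishedConnectionSketch, the field name running (standing in for run), and the onClose callback (standing in for whatever the channel-close path does, including taking serverConnection.connectionLock) are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for EstablishedConnection; NOT the actual Flink class.
final class EstablishedConnectionSketch {

    // Assumed replacement for the lock-guarded "run" flag.
    private final AtomicBoolean running = new AtomicBoolean(true);

    // Assumed replacement for the lock-guarded request counter.
    private final AtomicLong requestCount = new AtomicLong();

    /** Hands out the next request id without taking any monitor. */
    long nextRequestId() {
        return requestCount.getAndIncrement();
    }

    boolean isRunning() {
        return running.get();
    }

    /**
     * Closes the connection at most once. Only the caller that wins the
     * compare-and-set runs the callback; later callers return false.
     * No monitor of this object is held while the callback runs, so the
     * callback may take another lock without creating a lock-order cycle.
     */
    boolean close(Runnable onClose) {
        if (!running.compareAndSet(true, false)) {
            return false; // already closed by another thread
        }
        onClose.run(); // hypothetical hook for the channel-close path
        return true;
    }
}
```

With this shape, the event loop thread and the main thread can both call close() concurrently: exactly one of them performs the close work, and neither holds a connection-level lock while the other lock is being acquired. Whether this is sufficient for the real class depends on invariants between the two fields that this sketch does not model.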