[
https://issues.apache.org/jira/browse/IGNITE-28097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Petrov updated IGNITE-28097:
------------------------------------
Labels: ise (was: )
> Fixed unclosed socket if the client node stopped during a reconnect.
> --------------------------------------------------------------------
>
> Key: IGNITE-28097
> URL: https://issues.apache.org/jira/browse/IGNITE-28097
> Project: Ignite
> Issue Type: Bug
> Reporter: Mikhail Petrov
> Assignee: Mikhail Petrov
> Priority: Minor
> Labels: ise
> Fix For: 2.19
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> We need to fix the flaky CommunicationConnectionPoolMetricsTest; see
> https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8162810346300672703&tab=testDetails&branch_IgniteTests24Java8=__all_branches__
> and tests with the same name but different parameters.
> Steps that lead to the test hanging:
> 1. The cluster consists of server nodes (crd, srv) and one client node (cli).
> srv is the router for cli.
> 2. srv is stopped, and cli is attempting to reconnect to another cluster node
> (see org.apache.ignite.spi.discovery.tcp.ClientImpl.Reconnector).
> 3. During the reconnection process, cli is stopped. However, due to incorrect
> exception handling, cli still opens a socket to crd and leaves it unclosed.
> {code:java}
> org.apache.ignite.spi.IgniteSpiException: Wrong Ignite instance is set: null
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter$GridDummySpiContext.addTimeoutObject(IgniteSpiAdapter.java:958)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter.addTimeoutObject(IgniteSpiAdapter.java:642)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.startTimer(TcpDiscoverySpi.java:2451)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeMessage(TcpDiscoverySpi.java:1756)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TestTcpDiscoverySpi.writeMessage(TestTcpDiscoverySpi.java:62)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:817)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequests(ClientImpl.java:646)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:608)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1601)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
> {code}
> As a result, cli also cannot send a TcpDiscoveryNodeLeftMessage.
> 4. crd considers cli reconnected and does not generate a NODE_LEFT event.
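
The failure mode in step 3 can be illustrated with a minimal sketch. This is not the actual Ignite patch: `ReconnectSketch`, `TrackedSocket`, `JoinRequestWriter`, and `joinSafely` are hypothetical names introduced only for illustration, and the unchecked exception stands in for the IgniteSpiException in the trace above. The idea is that if writing the join request fails after the socket has been opened, the socket is closed before the exception propagates, so the remote side does not keep treating the connection as live.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative sketch only; every name here is hypothetical, not Ignite API. */
final class ReconnectSketch {
    /** Stand-in for the discovery socket; it only records whether it was closed. */
    static final class TrackedSocket implements Closeable {
        private final AtomicBoolean closed = new AtomicBoolean();

        @Override public void close() {
            closed.set(true);
        }

        boolean isClosed() {
            return closed.get();
        }
    }

    /** Stand-in for the join-request write, which may throw once the node is stopping. */
    interface JoinRequestWriter {
        void write(TrackedSocket sock);
    }

    /**
     * Writes the join request over an already opened socket.
     * On any unchecked failure the socket is closed before the exception is rethrown,
     * so a failed attempt never leaves an open connection behind.
     */
    static TrackedSocket joinSafely(TrackedSocket sock, JoinRequestWriter writer) {
        try {
            writer.write(sock);

            return sock; // Success: the caller now owns the open socket.
        }
        catch (RuntimeException e) {
            sock.close(); // Failure: release the socket, then propagate the error.

            throw e;
        }
    }
}
```

On success the caller receives the still-open socket and becomes responsible for it; on failure the socket is already closed by the time the exception reaches the caller.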
--
This message was sent by Atlassian Jira
(v8.20.10#820010)