[ 
https://issues.apache.org/jira/browse/IGNITE-28097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Petrov updated IGNITE-28097:
------------------------------------
    Labels: ise  (was: )

> Fixed unclosed socket if the client node stopped during a reconnect.
> --------------------------------------------------------------------
>
>                 Key: IGNITE-28097
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28097
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Assignee: Mikhail Petrov
>            Priority: Minor
>              Labels: ise
>             Fix For: 2.19
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We need to fix the flaky CommunicationConnectionPoolMetricsTest, see 
> https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8162810346300672703&tab=testDetails&branch_IgniteTests24Java8=__all_branches__
>  and the tests with the same name but different parameters.
> Steps that lead to the test hanging:
> 1. The cluster consists of server nodes (crd, srv) and one client node (cli). 
> srv is the router for cli.
> 2. srv is stopped, and cli is attempting to reconnect to another cluster node 
> (see org.apache.ignite.spi.discovery.tcp.ClientImpl.Reconnector).
> 3. During the reconnection process, cli is stopped. However, due to incorrect 
> exception handling, cli still opens a socket to crd and never closes it. 
> {code:java}
> org.apache.ignite.spi.IgniteSpiException: Wrong Ignite instance is set: null
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.IgniteSpiAdapter$GridDummySpiContext.addTimeoutObject(IgniteSpiAdapter.java:958)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.IgniteSpiAdapter.addTimeoutObject(IgniteSpiAdapter.java:642)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.startTimer(TcpDiscoverySpi.java:2451)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeMessage(TcpDiscoverySpi.java:1756)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.TestTcpDiscoverySpi.writeMessage(TestTcpDiscoverySpi.java:62)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:817)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequests(ClientImpl.java:646)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:608)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1601)
> [16:24:39]W:           [org.apache.ignite:ignite-core]        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
> {code}
> As a result, cli also cannot send a TcpDiscoveryNodeLeftMessage.
> 4. crd considers cli reconnected and does not generate a NODE_LEFT event.
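
The failure mode in step 3 can be illustrated with a minimal, Ignite-independent sketch. It is not the actual patch. It only shows one common way to make sure a freshly opened socket is closed when the next step fails with any exception, checked or unchecked. The names ReconnectSocketSketch, JoinRequestSender, tryJoin and lastSock are hypothetical, and IllegalStateException merely stands in for the IgniteSpiException from the stack trace.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ReconnectSocketSketch {
    /** Hypothetical stand-in for the step that fails while the node is stopping. */
    interface JoinRequestSender {
        void send(Socket sock) throws IOException;
    }

    /** Last socket opened by tryJoin, kept only so the demo can inspect it. */
    static Socket lastSock;

    /**
     * Opens a socket and sends a join request. If sending fails with any
     * exception, checked or unchecked, the socket is closed before the
     * exception propagates to the caller.
     */
    static Socket tryJoin(InetAddress addr, int port, JoinRequestSender sender) throws IOException {
        Socket sock = new Socket(addr, port);
        lastSock = sock;

        boolean ok = false;

        try {
            sender.send(sock);

            ok = true;

            return sock;
        }
        finally {
            // Runs for IOException and for runtime exceptions alike.
            if (!ok)
                sock.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Local listener that plays the role of the remote node (assumption for the demo).
        try (ServerSocket srv = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            try {
                tryJoin(srv.getInetAddress(), srv.getLocalPort(), s -> {
                    // Stand-in for the unchecked exception thrown during node stop.
                    throw new IllegalStateException("Wrong Ignite instance is set: null");
                });
            }
            catch (IllegalStateException e) {
                System.out.println("join failed, socket closed: " + lastSock.isClosed());
            }
        }
    }
}
```

Using a success flag with finally, rather than a catch for specific exception types, means a new unchecked exception type cannot silently bypass the cleanup.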



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
