[ 
https://issues.apache.org/jira/browse/IGNITE-28097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Petrov updated IGNITE-28097:
------------------------------------
    Description: 
We need to fix the flaky CommunicationConnectionPoolMetricsTest; see 
https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8162810346300672703&tab=testDetails&branch_IgniteTests24Java8=__all_branches__
 and the tests with the same name but different parameters.

Steps that lead to the test hanging:

1. The cluster consists of server nodes (crd, srv) and one client node (cli). 
srv is the router for cli.
2. srv is stopped, and cli is attempting to reconnect to another cluster node 
(see org.apache.ignite.spi.discovery.tcp.ClientImpl.Reconnector).
3. While the reconnection is in progress, cli is stopped. However, due to 
incorrect handling of the exception below, cli still just opens a socket to crd. 

{code:java}
org.apache.ignite.spi.IgniteSpiException: Wrong Ignite instance is set: null
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter$GridDummySpiContext.addTimeoutObject(IgniteSpiAdapter.java:958)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter.addTimeoutObject(IgniteSpiAdapter.java:642)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.startTimer(TcpDiscoverySpi.java:2451)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeMessage(TcpDiscoverySpi.java:1756)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TestTcpDiscoverySpi.writeMessage(TestTcpDiscoverySpi.java:62)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:817)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequests(ClientImpl.java:646)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:608)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1601)
[16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
{code}


As a result, cli also cannot send a TcpDiscoveryNodeLeftMessage.


4. crd considers cli to be reconnected and does not generate a NODE_LEFT event.
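
The sequence in the steps above can be illustrated with a small toy model. This is only a sketch under stated assumptions: Coordinator, Client, spiContextAvailable and simulate are hypothetical names, not Ignite classes or APIs, and treating an accepted socket as a completed reconnect is a deliberate simplification of what crd does. It is not the actual ClientImpl logic and does not propose a fix; it only shows why no NODE_LEFT is produced when cli is stopped in the middle of a reconnect.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these classes are Ignite types.
class ReconnectRaceSketch {
    /** Hypothetical stand-in for crd. */
    static class Coordinator {
        final Set<String> aliveClients = new HashSet<>();
        final List<String> events = new ArrayList<>();

        // Simplification: an accepted socket is enough for crd to treat the client as reconnected.
        void onSocketOpened(String clientId) {
            aliveClients.add(clientId);
        }

        // Stand-in for receiving a TcpDiscoveryNodeLeftMessage from the client.
        void onNodeLeftMessage(String clientId) {
            if (aliveClients.remove(clientId))
                events.add("NODE_LEFT:" + clientId);
        }
    }

    /** Hypothetical stand-in for cli. */
    static class Client {
        final String id;
        boolean spiContextAvailable = true; // becomes false once the node is stopped

        Client(String id) {
            this.id = id;
        }

        // Models a write that needs the SPI context, like writeMessage -> startTimer in the trace.
        void write(Runnable delivery) {
            if (!spiContextAvailable)
                throw new IllegalStateException("Wrong Ignite instance is set: null");
            delivery.run();
        }

        void reconnect(Coordinator crd) {
            crd.onSocketOpened(id); // the socket is opened before anything is written
            try {
                write(() -> { /* join request delivered */ });
            }
            catch (IllegalStateException e) {
                // Models the incorrect handling: the error is swallowed and the socket stays open.
            }
        }

        void sendNodeLeft(Coordinator crd) {
            try {
                write(() -> crd.onNodeLeftMessage(id));
            }
            catch (IllegalStateException e) {
                // The message cannot be written, so crd never learns that cli left.
            }
        }
    }

    /** Returns the events crd generated for one run of the scenario. */
    static List<String> simulate(boolean stopDuringReconnect) {
        Coordinator crd = new Coordinator();
        Client cli = new Client("cli");

        if (stopDuringReconnect)
            cli.spiContextAvailable = false; // cli is stopped while the reconnect is still running

        cli.reconnect(crd);
        cli.sendNodeLeft(crd);
        return crd.events;
    }

    public static void main(String[] args) {
        System.out.println("regular stop:          " + simulate(false)); // [NODE_LEFT:cli]
        System.out.println("stop during reconnect: " + simulate(true));  // []
    }
}
```

In the second run the toy crd still lists cli as alive and has no event recorded, which corresponds to the state a test would wait on indefinitely.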

  was:
We need to fix the flaky CommunicationConnectionPoolMetricsTest; see 
https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8162810346300672703&tab=testDetails&branch_IgniteTests24Java8=__all_branches__
 and tests with the same name but different parameters.

Steps that lead to test hanging:

1. The cluster consists of server nodes (crd, srv) and one client node (cli). 
srv is the router for cli.
2. srv is stopped, and cli is attempting to reconnect to another cluster node 
(see org.apache.ignite.spi.discovery.tcp.ClientImpl.Reconnector).
3. During the reconnection process, cli is stopped. However, due to incorrect 
exception handling, cli simply opens a socket to crd. However, it cannot send a 
TcpDiscoveryNodeLeftMessage.
4. crd considers cli is reconnected and does not generate a NODE_LEFT event.


> Fix flaky CommunicationConnectionPoolMetricsTest
> ------------------------------------------------
>
>                 Key: IGNITE-28097
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28097
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Assignee: Mikhail Petrov
>            Priority: Minor
>             Fix For: 2.19
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> We need to fix the flaky CommunicationConnectionPoolMetricsTest; see 
> https://ci2.ignite.apache.org/project.html?projectId=IgniteTests24Java8&testNameId=-8162810346300672703&tab=testDetails&branch_IgniteTests24Java8=__all_branches__
>  and the tests with the same name but different parameters.
> Steps that lead to the test hanging:
> 1. The cluster consists of server nodes (crd, srv) and one client node (cli). 
> srv is the router for cli.
> 2. srv is stopped, and cli is attempting to reconnect to another cluster node 
> (see org.apache.ignite.spi.discovery.tcp.ClientImpl.Reconnector).
> 3. While the reconnection is in progress, cli is stopped. However, due to 
> incorrect handling of the exception below, cli still just opens a socket to crd. 
> {code:java}
> org.apache.ignite.spi.IgniteSpiException: Wrong Ignite instance is set: null
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter$GridDummySpiContext.addTimeoutObject(IgniteSpiAdapter.java:958)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiAdapter.addTimeoutObject(IgniteSpiAdapter.java:642)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.startTimer(TcpDiscoverySpi.java:2451)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.writeMessage(TcpDiscoverySpi.java:1756)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.TestTcpDiscoverySpi.writeMessage(TestTcpDiscoverySpi.java:62)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:817)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequests(ClientImpl.java:646)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:608)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1601)
> [16:24:39]W: [org.apache.ignite:ignite-core] at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
> {code}
> As a result, cli also cannot send a TcpDiscoveryNodeLeftMessage.
> 4. crd considers cli to be reconnected and does not generate a NODE_LEFT event.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
