[ 
https://issues.apache.org/jira/browse/IGNITE-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surkov Aleksandr updated IGNITE-13491:
--------------------------------------
    Description: 
Currently, the logic in 
{{org.apache.ignite.internal.managers.discovery.GridDiscoveryManager#topologySnapshotMessage}}
 has a major drawback: the condition does not check that a failed node whose 
order is less than that of the oldest server node is actually a server node. 
As a result, an invalid coordinator change message can be logged even though 
the previous node was a client.

Reproducer:
 1. Start server #1
 2. Start client
 3. Start server #2
 4. Stop server #2 and client

In the logs of server #2 we will see something like this:

{{[2020-09-29 10:41:25,909][INFO 
][disco-event-worker-#150%tcp.TcpDiscoverySpiMBeanTest2%|#150%tcp.TcpDiscoverySpiMBeanTest2%][GridDiscoveryManager]
 Coordinator changed [prev=TcpDiscoveryNode 
[id=371896fb-f612-4640-bfcd-cef6d2800001, 
consistentId=371896fb-f612-4640-bfcd-cef6d2800001, addrs=ArrayList [127.0.0.1], 
sockAddrs=HashSet [/127.0.0.1:0], discPort=0, order=2, intOrder=2, 
lastExchangeTime=1601365285287, loc=false, ver=2.10.0#20200929-sha1:00000000, 
*isClient=true*], cur=TcpDiscoveryNode 
[id=9d90f4b0-1374-4147-b7a7-d821f0000002, consistentId=127.0.0.1:47501, 
addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47501], 
discPort=47501, order=3, intOrder=3, lastExchangeTime=1601365285900, loc=true, 
ver=2.10.0#20200929-sha1:00000000, isClient=false]]}}
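
The missing guard can be illustrated with a minimal, self-contained sketch. This is not the actual Ignite code: the {{Node}} class and the methods {{coordinatorChangedUnguarded}} and {{coordinatorChanged}} are hypothetical stand-ins that model only the node order and the client flag, using the orders from the log above (client order=2, remaining server order=3).

```java
final class CoordinatorChangeSketch {
    /** Hypothetical stand-in for a discovery node: only order and client flag. */
    public static final class Node {
        final long order;
        final boolean client;

        public Node(long order, boolean client) {
            this.order = order;
            this.client = client;
        }
    }

    /** Condition as described in the issue: compares orders only. */
    public static boolean coordinatorChangedUnguarded(Node failed, Node oldestSrv) {
        return failed.order < oldestSrv.order;
    }

    /** Assumed fix: additionally require that the failed node is a server. */
    public static boolean coordinatorChanged(Node failed, Node oldestSrv) {
        return !failed.client && failed.order < oldestSrv.order;
    }

    public static void main(String[] args) {
        Node server1 = new Node(1, false); // former coordinator
        Node client = new Node(2, true);   // client that joined second
        Node server2 = new Node(3, false); // oldest remaining server

        // A failed client must not be reported as the previous coordinator.
        System.out.println("client, unguarded: " + coordinatorChangedUnguarded(client, server2));
        System.out.println("client, guarded:   " + coordinatorChanged(client, server2));

        // A failed older server is a real coordinator change.
        System.out.println("server, guarded:   " + coordinatorChanged(server1, server2));
    }
}
```

In this sketch the order-only check returns true for the failed client, which corresponds to the invalid message above, while the guarded check returns true only when the failed node is a server.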

  was:
Currently, logic in 
{{org.apache.ignite.internal.managers.discovery.GridDiscoveryManager#topologySnapshotMessage}}
 has major drawback, in condition we don't check that failed node with order 
less than oldest server node, is actually server node. So we can see invalid 
message about coordinator change, even though previous node was a client.

Reproducer:
 1. Start server #1
 2. Start client
 3. Start server #1
 4. Stop server #2 and client

We will see in logs of server #2 something like this:

{{[2020-09-29 10:41:25,909][INFO 
][disco-event-worker-#150%tcp.TcpDiscoverySpiMBeanTest2%|#150%tcp.TcpDiscoverySpiMBeanTest2%][GridDiscoveryManager]
 Coordinator changed [prev=TcpDiscoveryNode 
[id=371896fb-f612-4640-bfcd-cef6d2800001, 
consistentId=371896fb-f612-4640-bfcd-cef6d2800001, addrs=ArrayList [127.0.0.1], 
sockAddrs=HashSet [/127.0.0.1:0], discPort=0, order=2, intOrder=2, 
lastExchangeTime=1601365285287, loc=false, ver=2.10.0#20200929-sha1:00000000, 
*isClient=true*], cur=TcpDiscoveryNode 
[id=9d90f4b0-1374-4147-b7a7-d821f0000002, consistentId=127.0.0.1:47501, 
addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47501], 
discPort=47501, order=3, intOrder=3, lastExchangeTime=1601365285900, loc=true, 
ver=2.10.0#20200929-sha1:00000000, isClient=false]]}}


> Fix incorrect topology snapshot logger output about coordinator change.
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-13491
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13491
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.9
>            Reporter: Ivan Daschinskiy
>            Assignee: Surkov Aleksandr
>            Priority: Minor
>              Labels: newbie
>             Fix For: 2.10
>
>
> Currently, the logic in 
> {{org.apache.ignite.internal.managers.discovery.GridDiscoveryManager#topologySnapshotMessage}}
>  has a major drawback: the condition does not check that a failed node whose 
> order is less than that of the oldest server node is actually a server node. 
> As a result, an invalid coordinator change message can be logged even though 
> the previous node was a client.
> Reproducer:
>  1. Start server #1
>  2. Start client
>  3. Start server #2
>  4. Stop server #2 and client
> In the logs of server #2 we will see something like this:
> {{[2020-09-29 10:41:25,909][INFO 
> ][disco-event-worker-#150%tcp.TcpDiscoverySpiMBeanTest2%|#150%tcp.TcpDiscoverySpiMBeanTest2%][GridDiscoveryManager]
>  Coordinator changed [prev=TcpDiscoveryNode 
> [id=371896fb-f612-4640-bfcd-cef6d2800001, 
> consistentId=371896fb-f612-4640-bfcd-cef6d2800001, addrs=ArrayList 
> [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:0], discPort=0, order=2, 
> intOrder=2, lastExchangeTime=1601365285287, loc=false, 
> ver=2.10.0#20200929-sha1:00000000, *isClient=true*], cur=TcpDiscoveryNode 
> [id=9d90f4b0-1374-4147-b7a7-d821f0000002, consistentId=127.0.0.1:47501, 
> addrs=ArrayList [127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47501], 
> discPort=47501, order=3, intOrder=3, lastExchangeTime=1601365285900, 
> loc=true, ver=2.10.0#20200929-sha1:00000000, isClient=false]]}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
