[ 
https://issues.apache.org/jira/browse/IGNITE-22377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-22377:
---------------------------------------
    Summary: Choose node to fail on a refused handshake  (was: Choose node to 
fail on a failed handshake)

> Choose node to fail on a refused handshake
> ------------------------------------------
>
>                 Key: IGNITE-22377
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22377
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.1
>
>
> Currently, if a node is refused during a handshake because it is stale from 
> the point of view of the node it connects to, the refused node notifies 
> its FailureHandler to force a node restart.
> If a network partition happens, this might cause problems when the partition 
> heals: nodes from different segments will start sniping each other. In the 
> worst case, a single segmented node might make the whole cluster (except 
> itself) restart.
> It is suggested that the refusing node send the refused node the following 
> information about the physical topology (PT) as the refusing node sees it:
>  # Number of nodes in the PT
>  # Min ID of nodes in the PT
> The refused node will only restart if the number of nodes in the PT, as it 
> sees it, is less than the number of nodes in the PT of the refusing node; if 
> the sizes are equal, comparing the Min IDs of nodes in the PT allows a 
> deterministic decision.
> This idea needs to be thought through and improved (or rejected).
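
The proposed rule can be sketched roughly as follows. This is only an illustration of the comparison described above, not code from Ignite: the class, the TopologySummary type and the method name are hypothetical, node IDs are assumed to be java.util.UUID values, and the tie-break direction (the side whose PT has the larger Min ID restarts) is an assumption, since the description does not say which side should yield.

```java
import java.util.UUID;

/** Hypothetical sketch; none of these names come from the Ignite codebase. */
final class RefusedHandshakeDecision {
    /** Size and minimum node ID of a physical topology (PT) as seen by one node. */
    static final class TopologySummary {
        final int size;
        final UUID minNodeId;

        TopologySummary(int size, UUID minNodeId) {
            this.size = size;
            this.minNodeId = minNodeId;
        }
    }

    /**
     * Decides whether the refused node should restart.
     *
     * @param local PT summary as seen by the refused node.
     * @param remote PT summary sent by the refusing node.
     */
    static boolean shouldRestart(TopologySummary local, TopologySummary remote) {
        if (local.size != remote.size) {
            // The node in the smaller segment is the one that restarts.
            return local.size < remote.size;
        }

        // Equal sizes. Assumption: the side with the larger Min ID yields.
        // If both Min IDs are equal, this returns false and nobody restarts.
        return local.minNodeId.compareTo(remote.minNodeId) > 0;
    }
}
```

With a rule of this shape, for two segments with different summaries, the same comparison evaluated on both sides yields opposite answers, so at most one side restarts after the partition heals.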



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
