[
https://issues.apache.org/jira/browse/IGNITE-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576302#comment-16576302
]
Pavel Vinokurov commented on IGNITE-9178:
-----------------------------------------
[~agoncharuk]
_leftNode2Part_ contains the partitions of nodes that have left. The partition
lost event is raised if _leftNode2Part_ contains nodes that are missing from
_node2Part_.
The _node2Part_ map is cleaned up in two places: the
_GridDhtPartitionTopologyImpl#update_ and
_GridDhtPartitionTopologyImpl#removeNode_ methods.
The current patch fixes the following situation. Two nodes leave the cluster
simultaneously. During the exchange for the first left node, the coordinator
sends the full map to the other nodes.
_GridDhtPartitionTopologyImpl#update_ handles the full map and removes the
partitions of the second left node without adding them to _leftNode2Part_.
On the next exchange, for the second node, _node2Part_ no longer has
partitions for that node, so they are not added to _leftNode2Part_ in
_removeNode()_.
This patch does not affect the _diffFromAffinity_ map in any way.
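To make the two-node scenario easier to follow, here is a minimal toy model in Java. It is not the real _GridDhtPartitionTopologyImpl_: the class name `TopologySketch`, the `patched` flag, the node ids and the simplified lost-partition check (a partition of a left node with no remaining owner) are all assumptions made for illustration. The `patched` branch only reflects my reading of what the fix amounts to, namely recording partitions in leftNode2Part when update() drops a node.

```java
import java.util.*;

/** Toy model of node2Part / leftNode2Part; hypothetical, not Ignite code. */
class TopologySketch {
    final Map<String, Set<Integer>> node2Part = new HashMap<>();
    final Map<String, Set<Integer>> leftNode2Part = new HashMap<>();
    final boolean patched; // assumption: true models the behaviour after the fix

    TopologySketch(boolean patched) { this.patched = patched; }

    /** Models update(): nodes absent from the full map are dropped from node2Part. */
    void update(Map<String, Set<Integer>> fullMap) {
        Iterator<Map.Entry<String, Set<Integer>>> it = node2Part.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<Integer>> e = it.next();
            if (!fullMap.containsKey(e.getKey())) {
                if (patched)
                    leftNode2Part.put(e.getKey(), e.getValue()); // remember before dropping
                it.remove();
            }
        }
        fullMap.forEach((id, parts) -> node2Part.put(id, new HashSet<>(parts)));
    }

    /** Models removeNode(): records partitions only if the node is still in node2Part. */
    void removeNode(String nodeId) {
        Set<Integer> parts = node2Part.remove(nodeId);
        if (parts != null)
            leftNode2Part.put(nodeId, parts);
    }

    /** Simplified check: partitions of left nodes that no remaining node owns. */
    Set<Integer> detectLostPartitions() {
        Set<Integer> owned = new HashSet<>();
        node2Part.values().forEach(owned::addAll);
        Set<Integer> lost = new TreeSet<>();
        for (Set<Integer> parts : leftNode2Part.values())
            for (Integer p : parts)
                if (!owned.contains(p))
                    lost.add(p);
        return lost;
    }

    /** n1 and n2 leave together; n3 stays. Each node owns one partition. */
    static Set<Integer> simulate(boolean patched) {
        TopologySketch top = new TopologySketch(patched);
        top.node2Part.put("n1", new HashSet<>(Arrays.asList(0)));
        top.node2Part.put("n2", new HashSet<>(Arrays.asList(1)));
        top.node2Part.put("n3", new HashSet<>(Arrays.asList(2)));

        // Exchange for n1: the full map from the coordinator already lacks n2.
        top.removeNode("n1");
        Map<String, Set<Integer>> fullMap = new HashMap<>();
        fullMap.put("n3", new HashSet<>(Arrays.asList(2)));
        top.update(fullMap);

        // Exchange for n2: node2Part has no entry for n2 any more.
        top.removeNode("n2");
        return top.detectLostPartitions();
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // prints [0]
        System.out.println(simulate(true));  // prints [0, 1]
    }
}
```

In this model the unpatched run misses partition 1, because update() removed n2 from node2Part before removeNode() could record it.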
There is another possible patch: clean up the _node2Part_ map only in the
_detectLostPartitions()_ method, for all left nodes. But I am not sure that it
doesn't break the logic related to the _diffFromAffinity_ map in the
_GridDhtPartitionTopologyImpl#update_ method. Please let me know if that patch
would be more appropriate.
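For comparison, the alternative can be sketched in the same toy style. Again this is only an illustration under assumptions: `DeferredCleanupSketch`, the `aliveNodes` parameter and the node ids are hypothetical, and the model does not include _diffFromAffinity_ at all, so it says nothing about the concern raised above.

```java
import java.util.*;

/** Toy model of the alternative: cleanup happens only in detectLostPartitions(). */
class DeferredCleanupSketch {
    final Map<String, Set<Integer>> node2Part = new HashMap<>();
    final Map<String, Set<Integer>> leftNode2Part = new HashMap<>();

    /** update() only refreshes entries from the full map; it never drops a node. */
    void update(Map<String, Set<Integer>> fullMap) {
        fullMap.forEach((id, parts) -> node2Part.put(id, new HashSet<>(parts)));
    }

    /** Moves every non-alive node to leftNode2Part, then reports unowned partitions. */
    Set<Integer> detectLostPartitions(Set<String> aliveNodes) {
        Iterator<Map.Entry<String, Set<Integer>>> it = node2Part.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<Integer>> e = it.next();
            if (!aliveNodes.contains(e.getKey())) {
                leftNode2Part.put(e.getKey(), e.getValue());
                it.remove();
            }
        }
        Set<Integer> owned = new HashSet<>();
        node2Part.values().forEach(owned::addAll);
        Set<Integer> lost = new TreeSet<>();
        for (Set<Integer> parts : leftNode2Part.values())
            for (Integer p : parts)
                if (!owned.contains(p))
                    lost.add(p);
        return lost;
    }

    /** Same scenario as before: n1 and n2 leave together, n3 stays. */
    static Set<Integer> simulate() {
        DeferredCleanupSketch top = new DeferredCleanupSketch();
        top.node2Part.put("n1", new HashSet<>(Arrays.asList(0)));
        top.node2Part.put("n2", new HashSet<>(Arrays.asList(1)));
        top.node2Part.put("n3", new HashSet<>(Arrays.asList(2)));

        // The full map received during the exchange for n1 lists only n3.
        Map<String, Set<Integer>> fullMap = new HashMap<>();
        fullMap.put("n3", new HashSet<>(Arrays.asList(2)));
        top.update(fullMap);

        return top.detectLostPartitions(new HashSet<>(Arrays.asList("n3")));
    }
}
```

Here both left nodes are still present in node2Part when detection runs, so the toy check reports partitions 0 and 1 without any bookkeeping in update().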
> Partition lost event are not triggered if multiple nodes left cluster
> ---------------------------------------------------------------------
>
> Key: IGNITE-9178
> URL: https://issues.apache.org/jira/browse/IGNITE-9178
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.4
> Reporter: Pavel Vinokurov
> Assignee: Pavel Vinokurov
> Priority: Blocker
> Fix For: 2.7
>
>
> If multiple nodes leave the cluster simultaneously, their partitions are
> removed from GridDhtPartitionTopologyImpl#node2part without being added to
> leftNode2Part in the GridDhtPartitionTopologyImpl#update method.
> Thus GridDhtPartitionTopologyImpl#detectLostPartitions can't detect the lost
> partitions.