[ 
https://issues.apache.org/jira/browse/IGNITE-19288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-19288:
---------------------------------
    Description: 
h3. Motivation

If the new logical topology contains both new nodes and nodes that have left 
the cluster, then DistributionZoneManager#scheduleTimers() schedules both 
saveDataNodesOnScaleUp and saveDataNodesOnScaleDown. These tasks are invoked 
asynchronously but use the same entry in topologyAugmentationMap: scale up 
puts an entry under some revision, and then scale down puts an entry with the 
same revision as the key, so the second put replaces the first.
The issue is reproduced by 
DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown
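
To illustrate the collision described above, here is a minimal, self-contained sketch. It is not the Ignite code: the {{SameRevisionOverwrite}} class, the simplified {{Augmentation}} with plain string node names, and the use of a {{ConcurrentSkipListMap}} keyed by revision are assumptions made for the example.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

final class SameRevisionOverwrite {
    /** Simplified stand-in for Augmentation; node names are plain strings here (assumption). */
    static final class Augmentation {
        final Set<String> nodes;
        final boolean addition;

        Augmentation(Set<String> nodes, boolean addition) {
            this.nodes = nodes;
            this.addition = addition;
        }
    }

    /** Writes a scale-up entry and then a scale-down entry under the same revision key. */
    static Map<Long, Augmentation> simulate(long revision) {
        Map<Long, Augmentation> topologyAugmentationMap = new ConcurrentSkipListMap<>();
        // Scale up records the added node under the revision.
        topologyAugmentationMap.put(revision, new Augmentation(Set.of("nodeC"), true));
        // Scale down uses the same key, so this put replaces the scale-up entry.
        topologyAugmentationMap.put(revision, new Augmentation(Set.of("nodeA"), false));
        return topologyAugmentationMap;
    }

    public static void main(String[] args) {
        Map<Long, Augmentation> map = simulate(5L);
        System.out.println("entries: " + map.size() + ", addition: " + map.get(5L).addition);
    }
}
```

Only one entry survives for the revision, and which one depends on the order in which the two asynchronous tasks run.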
h3. Definition of Done
 * Concurrency bug is fixed.
 * Test is enabled.

UPD: 

In general, the problem can be reproduced only in a very rare case, namely 
when we have received {{LogicalTopologyEventListener#onTopologyLeap}} and the 
new topology has both added and removed nodes compared with the topology from 
the metastorage.

The solution is to change the representation of 
{{DistributionZoneManager.ZoneState#topologyAugmentationMap}}. 

Currently we have 
{code:java}
    private static class Augmentation {
        /** Names of the node. */
        Set<NodeWithAttributes> nodes;

        /** Flag that indicates whether {@code nodeNames} should be added or removed. */
        boolean addition;

        Augmentation(Set<NodeWithAttributes> nodes, boolean addition) {
            this.nodes = nodes;
            this.addition = addition;
        }
    }
{code}

I suggest storing the addition flag in {{NodeWithAttributes}}, so that a 
single revision entry in 
{{DistributionZoneManager.ZoneState#topologyAugmentationMap}} can hold both 
added and removed nodes.
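
A rough sketch of how the suggested representation might look. All names here ({{PerNodeAugmentation}}, {{NodeChange}}, {{record}}, {{added}}, {{removed}}) are hypothetical and only illustrate the idea of a per-node flag; the actual change to {{NodeWithAttributes}} may differ.

```java
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.stream.Collectors;

final class PerNodeAugmentation {
    /** Hypothetical per-node entry: the node name plus its own added/removed flag. */
    static final class NodeChange {
        final String nodeName;
        final boolean addition;

        NodeChange(String nodeName, boolean addition) {
            this.nodeName = nodeName;
            this.addition = addition;
        }

        @Override public boolean equals(Object o) {
            return o instanceof NodeChange
                    && ((NodeChange) o).nodeName.equals(nodeName)
                    && ((NodeChange) o).addition == addition;
        }

        @Override public int hashCode() {
            return Objects.hash(nodeName, addition);
        }
    }

    /** Revision -> all node changes seen at that revision, additions and removals together. */
    private final ConcurrentSkipListMap<Long, Set<NodeChange>> topologyAugmentationMap =
            new ConcurrentSkipListMap<>();

    /** Merges a change into the revision's set instead of replacing the whole entry. */
    void record(long revision, String nodeName, boolean addition) {
        topologyAugmentationMap
                .computeIfAbsent(revision, r -> ConcurrentHashMap.newKeySet())
                .add(new NodeChange(nodeName, addition));
    }

    /** Names of nodes added at the given revision. */
    Set<String> added(long revision) {
        return select(revision, true);
    }

    /** Names of nodes removed at the given revision. */
    Set<String> removed(long revision) {
        return select(revision, false);
    }

    private Set<String> select(long revision, boolean addition) {
        return topologyAugmentationMap.getOrDefault(revision, Set.of()).stream()
                .filter(change -> change.addition == addition)
                .map(change -> change.nodeName)
                .collect(Collectors.toSet());
    }
}
```

With this shape, scale up and scale down tasks that share a revision each add to the same set, so neither loses the other's nodes.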


  was:
h3. Motivation

If the new logical topology contains both new nodes and nodes that have left 
the cluster, then DistributionZoneManager#scheduleTimers() schedules both 
saveDataNodesOnScaleUp and saveDataNodesOnScaleDown. These tasks are invoked 
asynchronously but use the same entry in topologyAugmentationMap: scale up 
puts an entry under some revision, and then scale down puts an entry with the 
same revision as the key, so the second put replaces the first.
The issue is reproduced by 
DistributionZoneAwaitDataNodesTest#testSeveralScaleUpAndSeveralScaleDownThenScaleUpAndScaleDown
h3. Definition of Done
 * Concurrency bug is fixed.
 * Test is enabled.


> A race on scheduling data nodes updates if there are new nodes and stopped 
> nodes in logical topology
> ------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-19288
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19288
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3


