[jira] [Updated] (IGNITE-18694) Recovery for DistributionZoneRebalanceEngine#metaStorageManager on DistributionZoneManager#start()

Mirza Aliev (Jira) Thu, 15 Jun 2023 05:37:05 -0700


     [ https://issues.apache.org/jira/browse/IGNITE-18694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Mirza Aliev updated IGNITE-18694:
---------------------------------
    Labels: ignite-3 tech-debt  (was: ignite-3)

> Recovery for DistributionZoneRebalanceEngine#metaStorageManager on 
> DistributionZoneManager#start()
> --------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-18694
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18694
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Uttsel
>            Priority: Major
>              Labels: ignite-3, tech-debt
>
> h3. *Motivation*
> DistributionZoneRebalanceEngine#dataNodesListener processes events that 
> carry updates of zones' data nodes and invokes 
> RebalanceUtil#updatePendingAssignmentsKeys with the new data nodes value. 
> updatePendingAssignmentsKeys performs an asynchronous metaStorageMgr#invoke. 
> It is therefore possible that dataNodesListener processes a data nodes event 
> and the node then crashes before the assignments are updated in the meta 
> storage.
> h3. *Implementation Notes*
> To fix this, we can redo all the logic from 
> `createDistributionZonesDataNodesListener()`. On 
> DistributionZoneManager#start we need to read the data nodes of all zones 
> from the `vault` and invoke `updatePendingAssignmentsKeys` for all tables with 
> `metaStorageManager.appliedRevision()`. If the last data nodes event has not 
> updated the pending assignments, they will be updated now. If the last data 
> nodes event has already updated the pending assignments, the update will be 
> invoked a second time but will not change anything, because there is a check 
> for the case when the new assignments are equal to the old ones.
> h3. *Definition of Done*
> A recovery for this case is implemented.
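
The recovery described in the Implementation Notes relies on the update being idempotent: replaying the last data nodes on start is safe because an update that would write the same assignments is skipped. The following is only a minimal in-memory sketch of that idea, not the Ignite code. The class RecoverySketch, its methods, and the two maps are hypothetical stand-ins. The real updatePendingAssignmentsKeys is asynchronous, takes a revision, and calculates assignments per partition, whereas here the assignments are simply assumed to be the set of data nodes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory model of the recovery step; this is not the Ignite API. */
class RecoverySketch {
    /** Stand-in for the pending assignments keys in the meta storage: table name -> nodes. */
    private final Map<String, Set<String>> pendingAssignments = new HashMap<>();

    /**
     * Simplified stand-in for RebalanceUtil#updatePendingAssignmentsKeys.
     * Assumption: the assignments are just the set of data nodes.
     * Writes only when the new value differs from the stored one.
     *
     * @return true if the stored value was changed.
     */
    boolean updatePendingAssignments(String table, Set<String> dataNodes) {
        Set<String> newAssignments = new TreeSet<>(dataNodes);

        // The check that makes a repeated invocation harmless.
        if (newAssignments.equals(pendingAssignments.get(table))) {
            return false;
        }

        pendingAssignments.put(table, newAssignments);
        return true;
    }

    /**
     * Recovery on start: replays the last known data nodes of every zone
     * for every table of that zone.
     *
     * @return the number of tables whose pending assignments were written.
     */
    int recoverOnStart(Map<Integer, Set<String>> dataNodesByZone,
                       Map<Integer, List<String>> tablesByZone) {
        int written = 0;

        for (Map.Entry<Integer, Set<String>> zone : dataNodesByZone.entrySet()) {
            List<String> tables = tablesByZone.getOrDefault(zone.getKey(), List.of());

            for (String table : tables) {
                if (updatePendingAssignments(table, zone.getValue())) {
                    written++;
                }
            }
        }

        return written;
    }

    public static void main(String[] args) {
        RecoverySketch sketch = new RecoverySketch();

        // Hypothetical state read on start: one zone with two data nodes and two tables.
        Map<Integer, Set<String>> dataNodes = Map.of(1, Set.of("nodeA", "nodeB"));
        Map<Integer, List<String>> tables = Map.of(1, List.of("t1", "t2"));

        // The first run models a crash before the assignments were written.
        System.out.println("first run wrote: " + sketch.recoverOnStart(dataNodes, tables));
        // The second run models a restart after the assignments were already written.
        System.out.println("second run wrote: " + sketch.recoverOnStart(dataNodes, tables));
    }
}
```

In this sketch the first call writes the assignments for both tables and the second call writes nothing, which mirrors the two cases distinguished in the Implementation Notes.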



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
