[ https://issues.apache.org/jira/browse/IGNITE-19692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Chugunov updated IGNITE-19692:
-------------------------------------
    Epic Link: IGNITE-18733

> Design Resilient Distributed Operations mechanism
> -------------------------------------------------
>
>                 Key: IGNITE-19692
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19692
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> We need a mechanism that allows us to do the following:
> # Execute an operation on all (or some) of the partitions of a table
> # The whole operation is split into sub-operations (each of which operates on
> a single partition)
> # Each sub-operation must be resilient: that is, if the node that hosts it
> restarts or the partition moves to another node, the sub-operation should
> still proceed
> # When a sub-operation ends, it notifies the operation tracker/coordinator
> # When all sub-operations end, the tracker might take some action (like
> starting a subsequent operation)
> # The tracker is also resilient
> We need such a mechanism in a few places in the system:
> # Transaction cleanup?
> # Index build
> # Table data validation as a part of a schema change that requires
> validation (like a narrowing type change)
> More applications of the mechanism will probably emerge.
>
> On the possible implementation: the tracker could be collocated with the
> table's primary replica (that would guarantee that at most one tracker exists
> at any time). We could store the data needed to track the operation in the
> Meta-Storage under a prefix corresponding to the table, like
> 'ops.<tableId>.<opType>.<opKey>'. We could store the completion status for
> each of the partitions there along with some operation-wide status.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
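
For illustration only, the key layout proposed in the description ('ops.<tableId>.<opType>.<opKey>' plus a per-partition completion status and an operation-wide status) could be sketched roughly as below. This is a minimal sketch under stated assumptions, not Ignite code: the class name, the '.status' and '.part.<n>' key suffixes, the status strings, and the in-memory TreeMap standing in for the Meta-Storage are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of an operation tracker; none of these names are Ignite APIs. */
class OperationTrackerSketch {
    // Assumption: an in-memory sorted map stands in for the Meta-Storage.
    private final Map<String, String> metaStorage = new TreeMap<>();

    /** Builds the per-operation prefix in the form suggested by the issue. */
    static String opPrefix(int tableId, String opType, String opKey) {
        return "ops." + tableId + "." + opType + "." + opKey;
    }

    /** Registers an operation: one operation-wide status plus one entry per partition. */
    void start(String prefix, int partitions) {
        metaStorage.put(prefix + ".status", "IN_PROGRESS");
        for (int p = 0; p < partitions; p++) {
            metaStorage.put(prefix + ".part." + p, "PENDING");
        }
    }

    /**
     * Records that the sub-operation for one partition has ended.
     * Writing "DONE" twice is harmless, so a sub-operation that was restarted
     * on another node can safely report again.
     *
     * @return true if every partition of the operation is now done.
     */
    boolean onPartitionDone(String prefix, int partition) {
        metaStorage.put(prefix + ".part." + partition, "DONE");

        String partPrefix = prefix + ".part.";
        boolean allDone = metaStorage.entrySet().stream()
                .filter(e -> e.getKey().startsWith(partPrefix))
                .allMatch(e -> "DONE".equals(e.getValue()));

        if (allDone) {
            // The tracker could start a subsequent operation at this point.
            metaStorage.put(prefix + ".status", "COMPLETED");
        }
        return allDone;
    }

    /** Returns the operation-wide status, or null if the operation is unknown. */
    String status(String prefix) {
        return metaStorage.get(prefix + ".status");
    }

    public static void main(String[] args) {
        OperationTrackerSketch tracker = new OperationTrackerSketch();
        String prefix = opPrefix(42, "indexBuild", "idx1");

        tracker.start(prefix, 2);
        tracker.onPartitionDone(prefix, 0);
        tracker.onPartitionDone(prefix, 1);

        System.out.println(prefix + " -> " + tracker.status(prefix));
    }
}
```

Because all state lives under the operation's prefix, a tracker that restarts (or moves together with the primary replica) could in principle rebuild its view by re-reading that prefix. Whether that is sufficient for the real design is an open question for this issue.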