[
https://issues.apache.org/jira/browse/IGNITE-26532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Pyatkov resolved IGNITE-26532.
----------------------------------------
Resolution: Fixed
> Design CMG/MG absence handling logic
> ------------------------------------
>
> Key: IGNITE-26532
> URL: https://issues.apache.org/jira/browse/IGNITE-26532
> Project: Ignite
> Issue Type: Task
> Reporter: Alexander Lapin
> Assignee: Vladislav Pyatkov
> Priority: Major
> Labels: ignite-3
>
> h3. Motivation
> In case of
> # loss of majority in *MG* only
> # loss of majority in *CMG* only
> # loss of majority in both *CMG* and *MG*
> user operations must behave adequately: they wait for majority restoration
> within the specified timeouts and, if it is not restored in time, fail with a
> clear error. They must not flood the logs with exceptions on every internal
> retry.
> This applies to operations such as:
> * Schema changes (e.g., creating a table).
> * Transactions of all types (with partially applied transactions being
> rolled back).
> * Adding nodes.
> * Various {{resetPartitions}} operations.
> * …
> At the same time, user operations such as
> * stopping a node, and
> * read-only transactions (as in the past)
> must complete successfully without exceptions being logged.
> Internal _system_ operations must wait indefinitely for the restoration of
> majority in the corresponding system groups (whether via infinite retry or
> reactively), and under no circumstances should they trigger FG (which is what
> happens now).
> A node should log the unavailability of a system group sparingly, rather
> than as excessively as it currently does.
> Cancellation operations (rollback, abort, etc.) should, whenever possible,
> work even in the absence of CMG/MG. This needs to be verified separately,
> since it is unclear whether we can guarantee it for every such operation.
> When CMG/MG is restored, the cluster should return to normal operability.
> h3. Definition of Done
> A design document that addresses the aforementioned questions is ready.
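
As an illustration only, the user-operation behaviour requested in the Motivation (wait for majority within a timeout, fail with a clear error, avoid logging on every retry) might be sketched as below. Every name here (MajorityWait, awaitMajority, MajorityUnavailableException, the availability probe) is hypothetical and not taken from the Ignite code base; the actual design is the subject of this ticket.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names come from the Ignite code base. */
final class MajorityWait {
    /** Hypothetical error surfaced to the user when majority is not restored in time. */
    public static class MajorityUnavailableException extends RuntimeException {
        public MajorityUnavailableException(String message) {
            super(message);
        }
    }

    /**
     * Polls the (assumed) availability probe until it reports majority or the timeout elapses.
     * Logs once when waiting starts instead of once per retry.
     *
     * @return the number of probe attempts that were made.
     */
    public static int awaitMajority(
            String groupName,
            BooleanSupplier majorityAvailable,
            Duration timeout,
            Duration retryInterval,
            Consumer<String> log
    ) {
        long deadline = System.nanoTime() + timeout.toNanos();
        int attempts = 0;
        boolean logged = false;

        while (true) {
            attempts++;

            if (majorityAvailable.getAsBoolean()) {
                return attempts;
            }

            // Single log line for the whole wait, not one per internal retry.
            if (!logged) {
                log.accept("Majority of " + groupName + " is unavailable, waiting up to "
                        + timeout.toMillis() + " ms");
                logged = true;
            }

            long remainingNanos = deadline - System.nanoTime();

            if (remainingNanos <= 0) {
                throw new MajorityUnavailableException("Majority of " + groupName
                        + " was not restored within " + timeout.toMillis() + " ms");
            }

            try {
                // Sleep for the retry interval, but never past the deadline.
                long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
                Thread.sleep(Math.min(Math.max(1, retryInterval.toMillis()), remainingMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new MajorityUnavailableException("Interrupted while waiting for " + groupName);
            }
        }
    }
}
```

Under the same assumptions, an internal system operation would use a similar loop without a deadline, so that it waits indefinitely instead of failing.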
--
This message was sent by Atlassian Jira
(v8.20.10#820010)