Hi Anton,

In general, does this mean that Ignite will keep executing
operations/transactions that are mapped to partitions in the cells that
won't be rebalanced?

-
Denis


On Wed, May 6, 2020 at 3:24 AM Anton Vinogradov <a...@apache.org> wrote:

> Igniters,
>
> PME-free switch [1] (since 2.8) skips PME when a node leaves, if possible
> (baseline + fully rebalanced cluster).
> This means we already wait for nothing (except recovery) to perform the
> switch.
> This optimization allows already started operations to continue during or
> after the switch if they are not affected by the failed primary.
> But upcoming operations still can't be started until the switch is finished
> cluster-wide.
>
> Let me propose an additional optimization - Cellular switch.
> Cellular Affinity [2] means that nodes are combined into virtual cells
> where, for each partition, the backups are located in the same cell as the
> primary.
> The simplest way to get Cellular Affinity is to use backup filters [3].
>
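
To make the cell idea above concrete, here is a minimal, self-contained
sketch of a same-cell backup filter. It is a toy model, not Ignite code: the
names NODE_CELL, same_cell_filter, assign and affected_partitions are
hypothetical, the four-node topology is made up, and the rotation-based
ordering merely stands in for a real affinity function.

```python
# Toy model of Cellular Affinity (illustration only, not the Ignite API).
# Hypothetical topology: node id -> cell name. In a real cluster the cell
# would typically come from a node attribute.
NODE_CELL = {"n1": "cellA", "n2": "cellA", "n3": "cellB", "n4": "cellB"}


def same_cell_filter(candidate, already_assigned):
    """Backup filter: accept a candidate only if it is in the primary's cell."""
    primary = already_assigned[0]
    return NODE_CELL[candidate] == NODE_CELL[primary]


def assign(partition, nodes, backups=1):
    """Pick a primary, then backups that pass the filter.

    The per-partition rotation is an assumed stand-in for a real
    affinity function's node ordering.
    """
    shift = partition % len(nodes)
    ordered = nodes[shift:] + nodes[:shift]
    assigned = [ordered[0]]  # the first node in the ordering is the primary
    for candidate in ordered[1:]:
        if len(assigned) == backups + 1:
            break
        if same_cell_filter(candidate, assigned):
            assigned.append(candidate)
    return assigned


def affected_partitions(assignment, failed_node):
    """Partitions that have the failed node as primary or backup."""
    return {p for p, owners in assignment.items() if failed_node in owners}


nodes = sorted(NODE_CELL)
assignment = {p: assign(p, nodes) for p in range(8)}

for p, owners in assignment.items():
    print(p, owners)
print("affected by n1:", sorted(affected_partitions(assignment, "n1")))
```

In this model every partition's owners stay inside one cell, so the failure
of a single node only touches partitions of that node's own cell.
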
> Cellular Affinity allows the switch to be finished outside the affected
> cell instantly, with the following assumptions:
> - Replicated caches should be recovered first, since every node is affected
> (as a backup) by any failed primary.
>   But replicated caches are expected to be effectively read-only (with
> extremely rare updates), so there is nothing to wait for here.
> - Upcoming replicated transactions (with non-failed primaries) can be
> started but can't be committed until the switch is finished cluster-wide.
> - Upcoming transactions related to the broken cell will wait for cell
> recovery (cluster-wide switch finish).
>
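
The three assumptions above can be read as a small decision rule for
upcoming transactions. The sketch below is only one reading of that list, not
the actual implementation: Decision, decide and their parameters are
hypothetical names, and treating a replicated transaction with a failed
primary as "related to the broken cell" is an assumption.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed immediately"
    START_DELAY_COMMIT = "start now, commit after cluster-wide switch"
    WAIT_FOR_SWITCH = "wait for cluster-wide switch"


def decide(tx_cells, failed_cell, replicated=False, primary_failed=False):
    """Decide how an upcoming transaction is handled during a Cellular switch.

    tx_cells       -- set of cells owning the partitions the transaction touches
    failed_cell    -- the cell that contains the failed node
    replicated     -- True if the transaction touches a replicated cache
    primary_failed -- True if a primary of the transaction is the failed node
    """
    if replicated:
        # Every node is a backup for a replicated cache, so the commit has to
        # wait for the cluster-wide switch even if the primaries are alive.
        if primary_failed:
            return Decision.WAIT_FOR_SWITCH  # assumption: same as a broken cell
        return Decision.START_DELAY_COMMIT
    if failed_cell in tx_cells:
        # The transaction is related to the broken cell.
        return Decision.WAIT_FOR_SWITCH
    # Partitioned cache and unaffected cells only: the switch is already done.
    return Decision.PROCEED


print(decide({"cellB"}, "cellA"))
print(decide({"cellA", "cellB"}, "cellA"))
print(decide({"cellB"}, "cellA", replicated=True))
```
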
> ... and this means:
> In addition to the PME-free switch, where we are able to continue already
> started operations during or after the switch, we are now also able to
> perform most of the upcoming operations during the switch.
>
> In other words, the Cellular switch has little effect on an operation's
> latency when the operation is not related to the failed cell.
>
> According to the benchmark [4], which checks "how fast upcoming transactions
> (started after switch start) can be committed when we have thousands of
> prepared transactions (prepared before switch start)", we have 5326 ms [5]
> operation latency on master and 65 ms [6] with the proposed fix, which is
> ~80 times faster.
>
> The fix [7] (as a part of IEP-45 [8]) is ready to be reviewed.
> Waiting for your review!
>
>
> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Non-blocking-PME-Phase-One-Node-fail-tp43531p44586.html
> [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-45%3A+Crash+Recovery+Speed-Up#IEP-45:CrashRecoverySpeed-Up-Cellularswitch
> [3] https://gist.github.com/anton-vinogradov/c50f9d0ce3e3e2997646f84ba7eba5f5#file-bench-java-L417
> [4] https://gist.github.com/anton-vinogradov/c50f9d0ce3e3e2997646f84ba7eba5f5
> [5] https://gist.github.com/anton-vinogradov/a35a3a8151b7494aa84b83f58cb75889#file-master-txt-L15
> [6] https://gist.github.com/anton-vinogradov/a35a3a8151b7494aa84b83f58cb75889#file-fix-txt-L15
> [7] https://issues.apache.org/jira/browse/IGNITE-12617
> [8] https://cwiki.apache.org/confluence/display/IGNITE/IEP-45%3A+Crash+Recovery+Speed-Up
>
