Re: Crash recovery speed-up #3, Cellular Switch

Anton Vinogradov Thu, 14 May 2020 00:54:31 -0700

Folks,

It seems we have tacit agreement here.
I'm going to merge the fix on May 15.


On Tue, May 12, 2020 at 10:08 AM Anton Vinogradov <a...@apache.org> wrote:

> Denis,
>
> Rebalance is not expected here since this optimization works only on a
> fully rebalanced cluster with baseline.
>
> On Sat, May 9, 2020 at 12:48 AM Denis Magda <dma...@apache.org> wrote:
>
>> Hi Anton,
>>
>> Generally, it means that Ignite will keep executing operations/transactions
>> that are mapped to the partitions of those cells that won't be
>> rebalanced, is that correct?
>>
>> -
>> Denis
>>
>>
>> On Wed, May 6, 2020 at 3:24 AM Anton Vinogradov <a...@apache.org> wrote:
>>
>> > Igniters,
>> >
>> > The PME-free switch [1] (available since 2.8) skips PME when a node
>> > leaves, where possible (baseline + fully rebalanced cluster).
>> > This means we already wait for nothing (except recovery) to perform the
>> > switch.
>> > This optimization allows already started operations to continue during or
>> > after the switch if they are not affected by the failed primary.
>> > But upcoming operations still can't be started until the switch is
>> > finished cluster-wide.
>> >
>> > Let me propose an additional optimization: the Cellular switch.
>> > Cellular Affinity [2] means that nodes are combined into virtual cells
>> > where, for each partition, the backups are located in the same cell as
>> > the primary.
>> > The simplest way to gain Cellular Affinity is to use backup filters [3].
>> >
>> > Cellular Affinity allows the switch to finish instantly outside the
>> > affected cell, under the following assumptions:
>> > - Replicated caches should be recovered first, since every node is
>> > affected (as a backup) by any failed primary.
>> >   But replicated caches are expected to be effectively read-only (with
>> > extremely rare updates), so there is nothing to wait for here.
>> > - Upcoming replicated transactions (with non-failed primaries) can be
>> > started but can't be committed until the switch is finished cluster-wide.
>> > - Upcoming transactions related to the broken cell will wait for cell
>> > recovery (cluster-wide switch finish).
>> >
>> > ... and this means:
>> > In addition to the PME-free switch, where we are able to continue already
>> > started operations during or after the switch, we are now also able to
>> > perform most of the upcoming operations during the switch.
>> >
>> > In other words, the Cellular switch has little effect on an operation's
>> > latency when the operation is not related to the failed cell.
>> >
>> > According to the benchmark [4], which checks "how fast upcoming
>> > transactions (started after switch start) can be committed when we have
>> > thousands of prepared transactions (prepared before switch start)", we
>> > have an operation latency of 5326 ms [5] on master and 65 ms [6] with
>> > the proposed fix, which is ~80 times faster.
>> >
>> > The fix [7] (part of IEP-45 [8]) is ready for review.
>> > Waiting for your review!
>> >
>> >
>> > [1] http://apache-ignite-developers.2346864.n4.nabble.com/Non-blocking-PME-Phase-One-Node-fail-tp43531p44586.html
>> > [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-45%3A+Crash+Recovery+Speed-Up#IEP-45:CrashRecoverySpeed-Up-Cellularswitch
>> > [3] https://gist.github.com/anton-vinogradov/c50f9d0ce3e3e2997646f84ba7eba5f5#file-bench-java-L417
>> > [4] https://gist.github.com/anton-vinogradov/c50f9d0ce3e3e2997646f84ba7eba5f5
>> > [5] https://gist.github.com/anton-vinogradov/a35a3a8151b7494aa84b83f58cb75889#file-master-txt-L15
>> > [6] https://gist.github.com/anton-vinogradov/a35a3a8151b7494aa84b83f58cb75889#file-fix-txt-L15
>> > [7] https://issues.apache.org/jira/browse/IGNITE-12617
>> > [8] https://cwiki.apache.org/confluence/display/IGNITE/IEP-45%3A+Crash+Recovery+Speed-Up
>> >
>>
>
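
For readers skimming the thread, here is a minimal sketch of the cell idea from the quoted proposal: backups are only placed in the primary's cell, so a transaction has to wait for recovery only when one of its primaries shares a cell with the failed node. Everything named here (CellAffinitySketch, Node, the cell field, assignBackups, mustWaitForSwitch) is hypothetical and made up for illustration. It is not Ignite's API and not the code from the fix [7], and it deliberately ignores replicated caches and the commit-time rule for replicated transactions.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Illustration only: hypothetical types, not Ignite classes.
final class CellAffinitySketch {
    /** Stand-in for a cluster node tagged with the virtual cell it belongs to. */
    static final class Node {
        final String id;
        final String cell;

        Node(String id, String cell) {
            this.id = id;
            this.cell = cell;
        }
    }

    /** Backup filter: a candidate may hold a backup only if it shares the primary's cell. */
    static boolean sameCell(Node primary, Node candidate) {
        return primary.cell.equals(candidate.cell);
    }

    /** Picks up to {@code backups} backup nodes for a primary, all from the primary's cell. */
    static List<Node> assignBackups(Node primary, List<Node> topology, int backups) {
        return topology.stream()
            .filter(n -> n != primary)          // a node is never its own backup
            .filter(n -> sameCell(primary, n))  // keep backups inside the cell
            .limit(backups)
            .collect(Collectors.toList());
    }

    /** An upcoming transaction waits only if one of its primaries is in the failed node's cell. */
    static boolean mustWaitForSwitch(Node failed, Collection<Node> txPrimaries) {
        return txPrimaries.stream().anyMatch(p -> sameCell(failed, p));
    }

    public static void main(String[] args) {
        Node a1 = new Node("a1", "A");
        Node a2 = new Node("a2", "A");
        Node b1 = new Node("b1", "B");
        Node b2 = new Node("b2", "B");
        List<Node> topology = Arrays.asList(a1, a2, b1, b2);

        // The only backup candidate for a1 is the other node in cell A.
        System.out.println(assignBackups(a1, topology, 1).get(0).id);      // a2

        // a1 fails: a transaction touching only cell B does not wait.
        System.out.println(mustWaitForSwitch(a1, Arrays.asList(b1, b2))); // false

        // a1 fails: a transaction touching cell A waits for cell recovery.
        System.out.println(mustWaitForSwitch(a1, Arrays.asList(a2, b1))); // true
    }
}
```

In a real cluster the cell would more likely come from a node attribute and the filter would be plugged into the affinity function, as in the backup filter linked at [3]; the sketch only shows why operations outside the broken cell have nothing to wait for.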
