[
https://issues.apache.org/jira/browse/IGNITE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506087#comment-16506087
]
ASF GitHub Bot commented on IGNITE-8529:
----------------------------------------
GitHub user alex-plekhanov opened a pull request:
https://github.com/apache/ignite/pull/4159
IGNITE-8529 Implement testing framework for checking WAL delta records
consistency
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/alex-plekhanov/ignite ignite-8529
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/4159.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4159
----
commit a5c142daf7c46a354d5417dac7cf7c3c79a9488b
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-07T10:39:52Z
IGNITE-8529 Draft 3 WIP
commit 0ddd4d82c3625e45f21650267685bd2020997cb1
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-07T12:25:42Z
IGNITE-8529 Draft 3 WIP
commit ada909a74d5b000ac741c07421da7f5bcc955023
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-07T16:46:33Z
IGNITE-8529 Draft 3 WIP
commit 3f570c578b4946c6d599e9efbabf6260a45bce50
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-07T16:51:08Z
IGNITE-8529 Draft 3 WIP
commit 883acf9447c2619799f6078523504082ada4dc21
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-07T21:36:02Z
IGNITE-8529 Draft 2 WIP
commit 7cb3d90ff758e42ef7d876d17cb4d597fb0ee240
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-08T07:46:42Z
IGNITE-8529 Draft 3 WIP
commit 41d2dc6a44c3a3775254f9d68595e04ba4198e98
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-08T10:43:18Z
IGNITE-8529 Implement testing framework for checking WAL delta records
consistency
commit 4678f6a6b4c7a5922063f2118bb4810f5e2b6d12
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-08T12:52:01Z
IGNITE-8529 Made page memory reusable after cache destroy.
commit c64719bf6be1562b0ad8f660eecf780cafca4334
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-08T14:23:14Z
IGNITE-8529 Made page memory reusable after cache destroy (fix).
commit 755cae5c68ef472a56871a891095721aebe60ff0
Author: Aleksey Plekhanov <plehanov.alex@...>
Date: 2018-06-08T14:32:47Z
IGNITE-8529 Cleanup
----
> Implement testing framework for checking WAL delta records consistency
> ----------------------------------------------------------------------
>
> Key: IGNITE-8529
> URL: https://issues.apache.org/jira/browse/IGNITE-8529
> Project: Ignite
> Issue Type: New Feature
> Components: persistence
> Reporter: Ivan Rakov
> Assignee: Aleksey Plekhanov
> Priority: Major
> Fix For: 2.6
>
>
> We use sharp checkpointing of page memory in persistent mode. That implies
> that we write two types of records to the write-ahead log: logical (e.g. data
> records) and physical (page snapshots + binary delta records). Physical
> records are applied only when a node crashes or stops during an ongoing
> checkpoint. We have the following invariant: checkpoint #(n-1) + all physical
> records written after it = checkpoint #n.
> If correctness of physical records is broken, an Ignite node may recover with
> an incorrect page memory state, which in turn can cause unexpected delayed
> errors. However, consistency of physical records is poorly tested: only a
> small part of our autotests perform node restarts, and even fewer of them
> stop a node while a checkpoint is in progress.
> We should implement an abstract test that:
> 1. Enforces a checkpoint and freezes the memory state at the moment of the
> checkpoint.
> 2. Performs the necessary test load.
> 3. Enforces a checkpoint again, replays the WAL, and checks that the page
> store at the moment of the previous checkpoint, with all physical records
> applied, exactly equals the current checkpoint state.
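
The three steps above can be illustrated with a toy model. This is only a
sketch under stated assumptions: pages are plain byte arrays, and
WalReplaySketch, PhysicalRecord, PageSnapshot and DeltaRecord are hypothetical
names invented for the example, not Ignite classes or the code from the pull
request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the checkpoint invariant; none of these types are Ignite classes. */
final class WalReplaySketch {
    /** A physical WAL record that can be re-applied to a single page. */
    interface PhysicalRecord {
        int pageIdx();
        void applyTo(byte[] page);
    }

    /** Full image of a page (hypothetical stand-in for a page snapshot record). */
    static final class PageSnapshot implements PhysicalRecord {
        private final int pageIdx;
        private final byte[] image;

        PageSnapshot(int pageIdx, byte[] image) {
            this.pageIdx = pageIdx;
            this.image = image.clone(); // copy, so later page changes do not leak in
        }

        @Override public int pageIdx() { return pageIdx; }

        @Override public void applyTo(byte[] page) {
            System.arraycopy(image, 0, page, 0, image.length);
        }
    }

    /** Byte-range overwrite (hypothetical stand-in for a binary delta record). */
    static final class DeltaRecord implements PhysicalRecord {
        private final int pageIdx;
        private final int off;
        private final byte[] bytes;

        DeltaRecord(int pageIdx, int off, byte[] bytes) {
            this.pageIdx = pageIdx;
            this.off = off;
            this.bytes = bytes.clone();
        }

        @Override public int pageIdx() { return pageIdx; }

        @Override public void applyTo(byte[] page) {
            System.arraycopy(bytes, 0, page, off, bytes.length);
        }
    }

    /** Step 1: deep copy of the page memory at the moment of a checkpoint. */
    static byte[][] freeze(byte[][] pages) {
        byte[][] copy = new byte[pages.length][];
        for (int i = 0; i < pages.length; i++)
            copy[i] = pages[i].clone();
        return copy;
    }

    /** Applies all physical records, in order, to a copy of the frozen state. */
    static byte[][] replay(byte[][] frozen, List<PhysicalRecord> wal) {
        byte[][] res = freeze(frozen);
        for (PhysicalRecord rec : wal)
            rec.applyTo(res[rec.pageIdx()]);
        return res;
    }

    /** Step 3: previous checkpoint + physical records must equal the current state. */
    static boolean consistent(byte[][] prev, List<PhysicalRecord> wal, byte[][] cur) {
        return Arrays.deepEquals(replay(prev, wal), cur);
    }

    public static void main(String[] args) {
        byte[][] mem = new byte[2][8];
        List<PhysicalRecord> wal = new ArrayList<>();

        byte[][] prev = freeze(mem); // step 1: checkpoint #(n-1)

        // Step 2: test load; every page change is also logged as a physical record.
        mem[0][3] = 42;
        wal.add(new DeltaRecord(0, 3, new byte[] {42}));
        Arrays.fill(mem[1], (byte)7);
        wal.add(new PageSnapshot(1, mem[1]));

        System.out.println(consistent(prev, wal, mem)); // prints true

        mem[1][0] = 1; // a change that was never logged breaks the invariant
        System.out.println(consistent(prev, wal, mem)); // prints false
    }
}
```

In this model a missing or wrong delta record shows up as a plain array
mismatch; a real test would compare actual page store contents instead.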
> Besides checking correctness, the test framework should do the following:
> 1. Gather statistics (such as a histogram) on the types of written physical
> records. That will help us know which types of physical records are covered
> by the test.
> 2. Visualize the expected and actual page state (with all physical records
> applied) if an incorrect page state is detected.
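
Both auxiliary features can be sketched in a few lines. Everything here is an
assumption made for illustration: the RecordType constants are hypothetical
placeholders rather than Ignite's actual record types, and the diff format is
just one possible way to show expected versus actual bytes.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Toy statistics and diff helpers; the names are hypothetical, not Ignite API. */
final class WalStatsSketch {
    /** Hypothetical kinds of physical records. */
    enum RecordType { PAGE_SNAPSHOT, DELTA_INSERT, DELTA_REMOVE }

    /** Counts how many records of each type were written during the test load. */
    static Map<RecordType, Integer> histogram(List<RecordType> written) {
        Map<RecordType, Integer> hist = new EnumMap<>(RecordType.class);
        for (RecordType type : written)
            hist.merge(type, 1, Integer::sum);
        return hist;
    }

    /** Renders one line per differing byte; returns an empty string if pages match. */
    static String diff(byte[] expected, byte[] actual) {
        if (expected.length != actual.length)
            throw new IllegalArgumentException("Page sizes differ");

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < expected.length; i++) {
            if (expected[i] != actual[i]) {
                sb.append(String.format("offset 0x%04x: expected %02x, actual %02x\n",
                    i, expected[i] & 0xff, actual[i] & 0xff));
            }
        }
        return sb.toString();
    }
}
```

Record types that never appear in the histogram point at delta records the
test load does not cover.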
> Regarding implementation, I suppose we can use the checkpoint listener
> mechanism to freeze the page memory state at the moment of a checkpoint.
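
The listener idea might look roughly like the sketch below. It assumes a
hypothetical CheckpointListener callback that receives the pages when a
checkpoint begins; this is not Ignite's actual listener interface, and the
Checkpointer class only stands in for whatever component triggers checkpoints.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy listener-based freeze; all types here are hypothetical, not Ignite API. */
final class CheckpointFreezeSketch {
    /** Hypothetical callback invoked at the start of every checkpoint. */
    interface CheckpointListener {
        void onCheckpointBegin(byte[][] pages);
    }

    /** Stand-in for the component that runs checkpoints and notifies listeners. */
    static final class Checkpointer {
        private final List<CheckpointListener> listeners = new ArrayList<>();

        void addListener(CheckpointListener lsnr) {
            listeners.add(lsnr);
        }

        void checkpoint(byte[][] pages) {
            for (CheckpointListener lsnr : listeners)
                lsnr.onCheckpointBegin(pages);
        }
    }

    /** Keeps a deep copy of the page memory as it was at the latest checkpoint. */
    static final class FreezingListener implements CheckpointListener {
        private byte[][] frozen;

        @Override public void onCheckpointBegin(byte[][] pages) {
            byte[][] copy = new byte[pages.length][];
            for (int i = 0; i < pages.length; i++)
                copy[i] = pages[i].clone();
            frozen = copy;
        }

        byte[][] frozen() {
            return frozen;
        }
    }
}
```

A test would register the freezing listener before the first forced
checkpoint and read the frozen copy back after the second one.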