[ https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Pereslegin updated IGNITE-12069:
--------------------------------------
Description:
{{CacheSharedPreloader}} must do the following:
# build the map of partitions and corresponding supplier nodes from which
partitions will be loaded [1];
# switch the cache data storage to {{no-op}} and back to the original under
the checkpoint (HWM must be fixed here for the needs of historical rebalance),
keeping the partition update counter for each partition [1];
# asynchronously evict indexes for the list of collected partitions (API
must be provided by IGNITE-11075) [2];
# send a request message to each node one by one with the list of partitions
to load [2];
# wait for the files to be received (listening for the transmission handler) [2];
# asynchronously rebuild indexes over the received partitions (API must be
provided by IGNITE-11075) [2];
# run historical rebalance from LWM to HWM collected above (LWM can be read
from the received file meta page) [1];
The points marked with the label {{[1]}} must be done prior to {{[2]}}.
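Step 1 of the list above (the map of partitions and supplier nodes) can be sketched roughly as follows. This is only an illustration: {{PartitionAssignments}} and {{groupBySupplier}} are hypothetical names, not existing Ignite classes, and the sketch assumes a supplier node has already been chosen for every partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;

/** Hypothetical helper (not an Ignite class): groups partitions by their supplier node. */
final class PartitionAssignments {
    private PartitionAssignments() {
        // Static helper only.
    }

    /**
     * @param supplierByPartition Partition id -> id of the node chosen to supply its file.
     * @return Supplier node id -> sorted set of partition ids to request from that node.
     */
    static Map<UUID, Set<Integer>> groupBySupplier(Map<Integer, UUID> supplierByPartition) {
        Map<UUID, Set<Integer>> res = new LinkedHashMap<>();

        for (Map.Entry<Integer, UUID> e : supplierByPartition.entrySet())
            res.computeIfAbsent(e.getValue(), id -> new TreeSet<>()).add(e.getKey());

        return res;
    }
}
```

Each entry of the resulting map corresponds to one request message: the key is the supplier node and the value is the list of partitions to load from it.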
NOTE. The following things need to be checked:
# Rebalancing of MVCC cache groups;
# How LWM and HWM will be set for the historical rebalance;
{noformat}
Stage 1.
Implement a datastore switch under a checkpoint write lock (CacheDataStore to
Noop and vice versa).

Tests:
- Switching under load.
- Check temp files cleanup on restart.
Check that with the Noop storage on a MOVING partition:
- indexes are not updated
- the update counter is valid
- tx/atomic updates on this partition work fine in a cluster with the Noop
store.

Stage 2.
Build a map of the partitions to request, grouped by node, and add the message
that will be sent to the supplier.
Send a demand request, handle the response, and switch the datastore when the
file is received.

Tests:
- Check partition consistency after receiving a file.
- File transmission under load.
- Failover: some of the partitions have been switched and the node has been
restarted. Rebalancing is expected to continue only for fully loaded large
partitions through the historical rebalance; for the rest of the partitions
it should restart from the beginning.

Stage 3.
Add WAL history reservation on the supplier. Add historical rebalance
triggering (LWM (partition) - HWM (Noop)) after switching from the Noop
datastore to the regular one.

Tests:
- File rebalancing with and without load on atomic/tx caches
(check existing PDS-enabled rebalancing tests).
- Ensure that MVCC groups use regular rebalancing.
- Rebalancing on an unstable topology and with failures of the
supplier/demander nodes at different stages.
- (compatibility) The old nodes should use regular rebalancing.

Stage 4 (depends on IGNITE-11075).
Eviction and rebuild of indexes.

Tests:
- File rebalancing of caches with H2 indexes.
- Check consistency of H2 indexes.
{noformat}
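The Stage 1 datastore switch can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: {{SwitchableStore}}, {{RegularStore}} and {{NoopStore}} are hypothetical stand-ins for the real data stores, a plain {{ReentrantReadWriteLock}} stands in for the checkpoint lock, and taking the HWM from the Noop store counter at switch-back time is one possible reading of the plan, not a confirmed design.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: a partition store that can be swapped under a "checkpoint" lock. */
final class SwitchableStore {
    interface Store {
        void put(int key, String val);
        long updateCounter();
    }

    /** Stores data and advances the update counter. */
    static final class RegularStore implements Store {
        private final Map<Integer, String> data = new ConcurrentHashMap<>();
        private final AtomicLong cntr = new AtomicLong();

        @Override public void put(int key, String val) {
            data.put(key, val);
            cntr.incrementAndGet();
        }

        @Override public long updateCounter() {
            return cntr.get();
        }

        int size() {
            return data.size();
        }
    }

    /** Drops the data but still advances the update counter. */
    static final class NoopStore implements Store {
        private final AtomicLong cntr;

        NoopStore(long initCntr) {
            cntr = new AtomicLong(initCntr);
        }

        @Override public void put(int key, String val) {
            cntr.incrementAndGet(); // The value is intentionally not stored.
        }

        @Override public long updateCounter() {
            return cntr.get();
        }
    }

    /** Stand-in for the checkpoint lock: updates take the read lock, a switch takes the write lock. */
    private final ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();

    /** Active store, guarded by {@code cpLock}. */
    private Store store;

    SwitchableStore(Store initial) {
        store = initial;
    }

    void put(int key, String val) {
        cpLock.readLock().lock();
        try {
            store.put(key, val);
        }
        finally {
            cpLock.readLock().unlock();
        }
    }

    /** Switches to a no-op store that continues the counter; returns the counter at the switch. */
    long switchToNoop() {
        cpLock.writeLock().lock();
        try {
            long cntr = store.updateCounter();
            store = new NoopStore(cntr);
            return cntr;
        }
        finally {
            cpLock.writeLock().unlock();
        }
    }

    /** Switches to the given regular store; returns the replaced store's counter (assumed HWM). */
    long switchToRegular(Store regular) {
        cpLock.writeLock().lock();
        try {
            long hwm = store.updateCounter();
            store = regular;
            return hwm;
        }
        finally {
            cpLock.writeLock().unlock();
        }
    }
}
```

Because every update holds the read lock, no update can run while the store reference is being replaced, and the counter keeps advancing while the data itself is discarded by the no-op store.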
was:
{{CacheSharedPreloader}} must do the following:
# build the map of partitions and corresponding supplier nodes from which
partitions will be loaded [1];
# switch the cache data storage to {{no-op}} and back to the original under
the checkpoint (HWM must be fixed here for the needs of historical rebalance),
keeping the partition update counter for each partition [1];
# asynchronously evict indexes for the list of collected partitions (API
must be provided by IGNITE-11075) [2];
# send a request message to each node one by one with the list of partitions
to load [2];
# wait for the files to be received (listening for the transmission handler) [2];
# asynchronously rebuild indexes over the received partitions (API must be
provided by IGNITE-11075) [2];
# run historical rebalance from LWM to HWM collected above (LWM can be read
from the received file meta page) [1];
The points marked with the label {{[1]}} must be done prior to {{[2]}}.
NOTE. The following things need to be checked:
# Rebalancing of MVCC cache groups;
# How LWM and HWM will be set for the historical rebalance;
> Create cache shared preloader
> -----------------------------
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Maxim Muzafarov
> Assignee: Pavel Pereslegin
> Priority: Major
> Labels: iep-28
>