[
https://issues.apache.org/jira/browse/IGNITE-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456279#comment-16456279
]
Alexey Kuznetsov edited comment on IGNITE-4210 at 4/27/18 12:12 PM:
--------------------------------------------------------------------
The root cause of the bug has been identified.
grid.cache(DEFAULT_CACHE_NAME).loadCache(null) performs cache loading on only a
few nodes (usually one) because the other nodes are still in the process of
joining the cluster.
In an unstable topology (multiple nodes joining the cluster), some entries are
not loaded into the cache because their partitions cannot be reserved:
partitions are concurrently evicted and moved to other nodes during PME
(partition map exchange).
I took a cache lock on the topology before calling
grid.cache(DEFAULT_CACHE_NAME).loadCache(null) and released it after loading.
With that change the test passes.
So we should either lock the topology before cache loading or retry loading
after the topology has settled.
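The retry option could be sketched roughly as follows. This is only an
illustration in plain Java, not the actual fix: StableTopologyLoader and
loadUntilStable are hypothetical names, and the helper knows nothing about
Ignite. It assumes the caller passes a supplier for the current topology
version (for example ignite.cluster().topologyVersion()) and a Runnable that
wraps grid.cache(DEFAULT_CACHE_NAME).loadCache(null). Note that an unchanged
topology version is only an approximation of "PME has finished".

```java
import java.util.function.LongSupplier;

/** Hypothetical helper: repeats a load until the topology stays unchanged during it. */
public class StableTopologyLoader {
    /**
     * Runs {@code load} and repeats it while the topology version observed
     * before the attempt differs from the one observed after it.
     *
     * @param topologyVersion supplies the current topology version (assumption: caller-provided).
     * @param load the loading action, e.g. a wrapper around loadCache(null).
     * @param maxAttempts upper bound on the number of attempts, must be at least 1.
     * @return the number of attempts that were made.
     */
    public static int loadUntilStable(LongSupplier topologyVersion, Runnable load, int maxAttempts) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long before = topologyVersion.getAsLong();

            load.run();

            long after = topologyVersion.getAsLong();

            // No topology change was observed while loading, so accept this attempt.
            if (before == after)
                return attempt;
        }

        throw new IllegalStateException("Topology did not settle after " + maxAttempts + " attempts");
    }
}
```

Compared with locking the topology, this does not block joining nodes, but it
may run the load several times while nodes keep joining.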
> CacheLoadingConcurrentGridStartSelfTest.testLoadCacheFromStore() test loses
> data.
> --------------------------------------------------------------------------------
>
> Key: IGNITE-4210
> URL: https://issues.apache.org/jira/browse/IGNITE-4210
> Project: Ignite
> Issue Type: Bug
> Reporter: Anton Vinogradov
> Assignee: Alexey Kuznetsov
> Priority: Major
> Labels: MakeTeamcityGreenAgain
>
> org.apache.ignite.internal.processors.cache.distributed.CacheLoadingConcurrentGridStartSelfTest#testLoadCacheFromStore
> sometimes fails.