[
https://issues.apache.org/jira/browse/IGNITE-9803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Semen Boikov updated IGNITE-9803:
---------------------------------
Description:
Debugged a failure of
DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentRebalance
with GridDhtInvalidPartitionException. Here is the scenario in which the error
occurs:
* the test starts node1 and node2 and loads data
* node3 is started, one partition is assigned to [node2, node3], and node3
starts rebalancing
* node4 is started, and the partition is re-assigned to [node2, node4]
* at this time rebalancing on node3 is still in progress and node3 is about to
handle a supply message; at this moment the exchange thread moves the partition
to the RENTING state (it cannot be moved straight to EVICTED because async
partition cleanup is needed first)
* on node3 the thread doing rebalancing sees the RENTING partition and gets
GridDhtInvalidPartitionException
The probability of this failure is very high if sleep(5000) is inserted into
the code doing async partition cleanup (PartitionEvictionTask.run).
I think the fix for this issue is simply to handle
GridDhtInvalidPartitionException in GridDhtPartitionDemander.
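To illustrate the proposed handling, here is a minimal, self-contained sketch of
the race outcome. It is not Ignite code: RebalanceRaceSketch, Partition,
Demander, InvalidPartitionException and handleSupplyMessage are hypothetical
stand-ins chosen for this example, and the real GridDhtPartitionDemander logic
may differ. The sketch only shows the idea of treating an invalid (RENTING)
partition as a reason to skip the supply batch rather than fail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class RebalanceRaceSketch {
    /** Simplified partition lifecycle; names follow the states mentioned above. */
    enum State { MOVING, OWNING, RENTING, EVICTED }

    /** Hypothetical stand-in for GridDhtInvalidPartitionException. */
    static class InvalidPartitionException extends RuntimeException {
        InvalidPartitionException(int partId, State state) {
            super("Partition " + partId + " is not usable, state=" + state);
        }
    }

    /** Toy partition: accepts entries only while it is not being rented out. */
    static class Partition {
        final int id;
        final AtomicReference<State> state = new AtomicReference<>(State.MOVING);
        final List<String> entries = new ArrayList<>();

        Partition(int id) {
            this.id = id;
        }

        void put(String entry) {
            State s = state.get();

            // RENTING means cleanup is pending, EVICTED means it has finished.
            if (s == State.RENTING || s == State.EVICTED)
                throw new InvalidPartitionException(id, s);

            entries.add(entry);
        }
    }

    /** Toy demander (assumption: skipping the batch is an acceptable reaction). */
    static class Demander {
        int skippedBatches;

        /** Returns true if the batch was applied, false if the partition was invalid. */
        boolean handleSupplyMessage(Partition part, List<String> batch) {
            try {
                for (String e : batch)
                    part.put(e);

                return true;
            }
            catch (InvalidPartitionException ex) {
                // The partition was re-assigned concurrently: skip instead of failing.
                skippedBatches++;

                return false;
            }
        }
    }

    public static void main(String[] args) {
        Partition part = new Partition(0);
        Demander demander = new Demander();

        // Rebalancing is in progress, so the first batch is applied.
        System.out.println(demander.handleSupplyMessage(part, Arrays.asList("k1", "k2")));

        // The exchange thread re-assigns the partition; eviction has not completed yet.
        part.state.set(State.RENTING);

        // The next supply message hits the RENTING partition and is skipped.
        System.out.println(demander.handleSupplyMessage(part, Arrays.asList("k3")));
        System.out.println(demander.skippedBatches);
    }
}
```

In this sketch the second call returns false and the skipped counter becomes 1,
while the entries applied before the re-assignment remain untouched. Whether a
real fix should skip, retry, or log the batch is left open here.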
was:
Debugged a failure of
DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentRebalance
with GridDhtInvalidPartitionException. Here is the scenario in which the error
occurs:
* the test starts node1 and node2 and loads data
* node3 is started, one partition is assigned to node2, node3, and node3 starts
rebalancing
* node4 is started, and the partition is re-assigned to node2 and node4
* at this time rebalancing on node3 is still in progress and node3 is about to
handle a supply message; at this moment the exchange thread moves the partition
to the RENTING state (it cannot be moved straight to EVICTED because async
partition cleanup is needed first)
* on node3 the thread doing rebalancing sees the RENTING partition and gets
GridDhtInvalidPartitionException
The probability of this failure is very high if sleep(5000) is inserted into
the code doing async partition cleanup (PartitionEvictionTask.run).
I think the fix for this issue is simply to handle
GridDhtInvalidPartitionException in GridDhtPartitionDemander.
> GridDhtInvalidPartitionException in GridDhtPartitionDemander
> ------------------------------------------------------------
>
> Key: IGNITE-9803
> URL: https://issues.apache.org/jira/browse/IGNITE-9803
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Reporter: Semen Boikov
> Assignee: Semen Boikov
> Priority: Major
> Fix For: 2.8
>
>
> Debugged a failure of
> DynamicIndexPartitionedTransactionalConcurrentSelfTest.testConcurrentRebalance
> with GridDhtInvalidPartitionException. Here is the scenario in which the error
> occurs:
> * the test starts node1 and node2 and loads data
> * node3 is started, one partition is assigned to [node2, node3], and node3
> starts rebalancing
> * node4 is started, and the partition is re-assigned to [node2, node4]
> * at this time rebalancing on node3 is still in progress and node3 is about to
> handle a supply message; at this moment the exchange thread moves the
> partition to the RENTING state (it cannot be moved straight to EVICTED because
> async partition cleanup is needed first)
> * on node3 the thread doing rebalancing sees the RENTING partition and gets
> GridDhtInvalidPartitionException
> The probability of this failure is very high if sleep(5000) is inserted into
> the code doing async partition cleanup (PartitionEvictionTask.run).
>
> I think the fix for this issue is simply to handle
> GridDhtInvalidPartitionException in GridDhtPartitionDemander.