[jira] [Commented] (IGNITE-27268) Test ItDisasterRecoveryManagerTest.testRestartPartitionsWithCleanUpConcurrentRebalance become flaky

Mirza Aliev (Jira) Fri, 12 Dec 2025 00:44:06 -0800


    [ 
https://issues.apache.org/jira/browse/IGNITE-27268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044618#comment-18044618
 ]


Mirza Aliev commented on IGNITE-27268:
--------------------------------------

What was done: 
* "Not enough alive nodes to perform reset with clean up" is fixed by changing
the test itself: the alter to 5 replicas was moved before the start of the 5th
node. This ensures that there is no preliminary rebalance to [0,1,2,5], and
rules out the situation where that node is still INITIALIZING its partition
when the reset with cleanup has already started.

* "The local node is outside of the replication group" is fixed by no longer
skipping the replica start when the node is not in the stable assignments at
the time `PartitionReplicaLifecycleManager#restartPartitionWithCleanUp` is
called. This state is possible in the middle of a rebalance, when the stable
switch has not yet been invoked but the partition restart with cleanup has.

> Test 
> ItDisasterRecoveryManagerTest.testRestartPartitionsWithCleanUpConcurrentRebalance
>  become flaky
> ---------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-27268
>                 URL: https://issues.apache.org/jira/browse/IGNITE-27268
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mirza Aliev
>            Assignee: Mirza Aliev
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain, ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {noformat}
> java.lang.AssertionError: java.util.concurrent.ExecutionException: 
> org.apache.ignite.internal.table.distributed.disaster.exceptions.NotEnoughAliveNodesException:
>  IGN-RECOVERY-5 Not enough alive nodes to perform reset with clean up. 
> TraceId:b1ae1f1d
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:78)
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:35)
>   at org.hamcrest.TypeSafeMatcher.matches(TypeSafeMatcher.java:83)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:10)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
>   at 
> org.apache.ignite.internal.disaster.ItDisasterRecoveryManagerTest.testRestartPartitionsWithCleanUpConcurrentRebalance(ItDisasterRecoveryManagerTest.java:743)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.ignite.internal.table.distributed.disaster.exceptions.NotEnoughAliveNodesException:
>  IGN-RECOVERY-5 Not enough alive nodes to perform reset with clean up. 
> TraceId:b1ae1f1d
>   at 
> java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
>   at 
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2024)
>   at 
> org.apache.ignite.internal.testframework.matchers.CompletableFutureMatcher.matchesSafely(CompletableFutureMatcher.java:74)
>   ... 8 more
> {noformat}
> https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3x_Test_IntegrationTests_Transactions/9755057



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
