[ https://issues.apache.org/jira/browse/IGNITE-24857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-24857:
----------------------------------
    Epic Link: IGNITE-24900

> Possible race on assignments recovery when using Volatile Storage Profile
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-24857
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24857
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksandr Polovtsev
>            Priority: Major
>              Labels: ignite-3
>
> The following error is printed in the logs when running the
> {{ItTableRaftSnapshotsTest#testDataRecoveryAfterSnapshot}} test with the
> {{VolatilePageMemoryStorageEngine}}:
> {code:java}
> java.util.concurrent.CompletionException: java.lang.AssertionError: The local node is outside of the replication group [, stable=Assignments [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment [consistentId=itrst_tdras_3346, isPeer=true]], force=false, timestamp=114188891807809537, fromReset=false], pending=Assignments [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment [consistentId=itrst_tdras_3346, isPeer=true]], force=false, timestamp=114188891807809537, fromReset=false], localName=itrst_tdras_3346].
>     at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315) [?:?]
>     at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320) [?:?]
>     at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire$$$capture(CompletableFuture.java:722) [?:?]
>     at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java) [?:?]
>     at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482) [?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
>     at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> Caused by: java.lang.AssertionError: The local node is outside of the replication group [, stable=Assignments [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment [consistentId=itrst_tdras_3346, isPeer=true]], force=false, timestamp=114188891807809537, fromReset=false], pending=Assignments [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment [consistentId=itrst_tdras_3346, isPeer=true]], force=false, timestamp=114188891807809537, fromReset=false], localName=itrst_tdras_3346].
>     at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$141(TableManager.java:2462) ~[main/:?]
>     at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:905) ~[main/:?]
>     at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$142(TableManager.java:2433) ~[main/:?]
>     at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire$$$capture(CompletableFuture.java:718) ~[?:?]
>     ... 5 more
> {code}
> After a brief investigation, it looks like there may be a race between
> handling assignments from Meta Storage events and local assignments
> recovery, which leads to the node being present in the stable assignments
> without having started the corresponding replica.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)