[ 
https://issues.apache.org/jira/browse/IGNITE-24857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Chudov updated IGNITE-24857:
----------------------------------
    Epic Link: IGNITE-24900

> Possible race on assignments recovery when using Volatile Storage Profile
> -------------------------------------------------------------------------
>
>                 Key: IGNITE-24857
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24857
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Aleksandr Polovtsev
>            Priority: Major
>              Labels: ignite-3
>
> The following error is printed in logs when running the 
> {{ItTableRaftSnapshotsTest#testDataRecoveryAfterSnapshot}} test with the 
> {{VolatilePageMemoryStorageEngine}}:
> {code:java}
>  java.util.concurrent.CompletionException: java.lang.AssertionError: The 
> local node is outside of the replication group [, stable=Assignments 
> [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], 
> Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment 
> [consistentId=itrst_tdras_3346, isPeer=true]], force=false, 
> timestamp=114188891807809537, fromReset=false], pending=Assignments 
> [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], 
> Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment 
> [consistentId=itrst_tdras_3346, isPeer=true]], force=false, 
> timestamp=114188891807809537, fromReset=false], localName=itrst_tdras_3346].
>       at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315) [?:?]
>       at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320) [?:?]
>       at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire$$$capture(CompletableFuture.java:722) [?:?]
>       at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java) [?:?]
>       at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482) [?:?]
>       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
>       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
>       at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> Caused by: java.lang.AssertionError: The local node is outside of the 
> replication group [, stable=Assignments [nodes=HashSet [Assignment 
> [consistentId=itrst_tdras_3344, isPeer=true], Assignment 
> [consistentId=itrst_tdras_3345, isPeer=true], Assignment 
> [consistentId=itrst_tdras_3346, isPeer=true]], force=false, 
> timestamp=114188891807809537, fromReset=false], pending=Assignments 
> [nodes=HashSet [Assignment [consistentId=itrst_tdras_3344, isPeer=true], 
> Assignment [consistentId=itrst_tdras_3345, isPeer=true], Assignment 
> [consistentId=itrst_tdras_3346, isPeer=true]], force=false, 
> timestamp=114188891807809537, fromReset=false], localName=itrst_tdras_3346].
>       at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$141(TableManager.java:2462) ~[main/:?]
>       at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:905) ~[main/:?]
>       at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$142(TableManager.java:2433) ~[main/:?]
>       at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire$$$capture(CompletableFuture.java:718) ~[?:?]
>       ... 5 more
> {code}
> After a brief investigation, it looks like there may be a race between 
> handling assignments from Meta Storage events and local assignments 
> recovery, which leads to the node being present in stable assignments 
> without having started the corresponding replica.
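
The suspected interleaving can be modelled in isolation. The sketch below is a simplified, single-threaded illustration of the ordering problem described in the report, not the actual Ignite code: the class `AssignmentsRaceSketch` and all of its methods are hypothetical names, and the two-step recovery is an assumption made only to show how an event handler could observe stable assignments before the local replica has been started.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a node; none of these names come from the Ignite code base. */
public class AssignmentsRaceSketch {
    private final String localName;
    private volatile Set<String> stable = Set.of();
    private final Set<String> startedReplicas = ConcurrentHashMap.newKeySet();

    public AssignmentsRaceSketch(String localName) {
        this.localName = localName;
    }

    /** Assumed recovery step 1: stable assignments become visible to other handlers. */
    public void publishStable(Set<String> nodes) {
        stable = Set.copyOf(nodes);
    }

    /** Assumed recovery step 2: start the local replica if this node is assigned. */
    public void startLocalReplicaIfAssigned() {
        if (stable.contains(localName)) {
            startedReplicas.add(localName);
        }
    }

    /**
     * Event handler stand-in. Returns false when it observes the inconsistent state
     * from the report: the node is in stable assignments but has no started replica.
     */
    public boolean handlePendingAssignments(Set<String> pending) {
        boolean inconsistent = stable.contains(localName) && !startedReplicas.contains(localName);
        return !inconsistent;
    }

    /** Bad ordering: the event arrives between the two recovery steps. */
    public static boolean eventBetweenRecoverySteps() {
        Set<String> nodes = Set.of("itrst_tdras_3344", "itrst_tdras_3345", "itrst_tdras_3346");
        AssignmentsRaceSketch node = new AssignmentsRaceSketch("itrst_tdras_3346");
        node.publishStable(nodes);
        boolean handled = node.handlePendingAssignments(nodes);
        node.startLocalReplicaIfAssigned();
        return handled;
    }

    /** Good ordering: the event arrives only after recovery has finished. */
    public static boolean eventAfterRecovery() {
        Set<String> nodes = Set.of("itrst_tdras_3344", "itrst_tdras_3345", "itrst_tdras_3346");
        AssignmentsRaceSketch node = new AssignmentsRaceSketch("itrst_tdras_3346");
        node.publishStable(nodes);
        node.startLocalReplicaIfAssigned();
        return node.handlePendingAssignments(nodes);
    }

    public static void main(String[] args) {
        System.out.println("event between recovery steps handled: " + eventBetweenRecoverySteps()); // false
        System.out.println("event after recovery handled: " + eventAfterRecovery()); // true
    }
}
```

If the real cause matches this model, one possible direction would be to make the event handler wait until local recovery has completed, but that is a guess based on the description above rather than a confirmed diagnosis.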



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
