[
https://issues.apache.org/jira/browse/IGNITE-18737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17691673#comment-17691673
]
Mirza Aliev commented on IGNITE-18737:
--------------------------------------
[~Sergey Uttsel] LGTM, thanks!
> Tests from ItIgniteNodeRestartTest became flaky
> --------------------------------------------------
>
> Key: IGNITE-18737
> URL: https://issues.apache.org/jira/browse/IGNITE-18737
> Project: Ignite
> Issue Type: Bug
> Reporter: Mirza Aliev
> Assignee: Sergey Uttsel
> Priority: Major
> Labels: ignite-3
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> h3. *Motivation*
> After https://issues.apache.org/jira/browse/IGNITE-18088 test from
> ItIgniteNodeRestartTest started to fail. Need to investigate the reason of
> fails
> {noformat}
> Caused by: java.util.concurrent.CancellationException
> at
> java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396)
> at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:491)
> at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.sendWithRetry(RaftGroupServiceImpl.java:473)
> at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.refreshAndGetLeaderWithTerm(RaftGroupServiceImpl.java:232)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.ensureReplicaIsPrimary(PartitionReplicaListener.java:1880)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.invoke(PartitionReplicaListener.java:275)
> at
> org.apache.ignite.internal.replicator.Replica.processRequest(Replica.java:61)
> at
> org.apache.ignite.internal.replicator.ReplicaManager.lambda$new$5(ReplicaManager.java:162)
> at
> org.apache.ignite.network.DefaultMessagingService.sendToSelf(DefaultMessagingService.java:279)
> at
> org.apache.ignite.network.DefaultMessagingService.invoke0(DefaultMessagingService.java:226)
> at
> org.apache.ignite.network.DefaultMessagingService.invoke(DefaultMessagingService.java:154)
> at
> org.apache.ignite.internal.replicator.ReplicaService.sendToReplica(ReplicaService.java:91)
> at
> org.apache.ignite.internal.replicator.ReplicaService.invoke(ReplicaService.java:178)
> at
> org.apache.ignite.internal.tx.impl.TxManagerImpl.cleanup(TxManagerImpl.java:150)
> at
> org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.lambda$processTxFinishAction$42(PartitionReplicaListener.java:984)
> at
> java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
> at
> java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
> at
> java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
> at
> org.apache.ignite.internal.raft.RaftGroupServiceImpl.lambda$sendWithRetry$38(RaftGroupServiceImpl.java:526)
> at
> java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
> at
> java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
> at
> java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
> at
> java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
> at
> java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
> at
> java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
> at
> java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
> at
> java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183){noformat}
> The root cause is that distribution zones data nodes watch listener in
> TableManager invokes updatePendingAssignmentsKeys with the same data nodes as
> in table configuration assignments so RaftGroupServiceImpl stops and new the
> same RaftGroupServiceImpl starts. It cancels invokes to old
> RaftGroupServiceImpl so we have CancellationException at sendWithRetry.
> h3. *Implementation Notes*
> It is not need to change the assignments if the calculated assignments equals
> to the assignments in the table configuration.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)