[ https://issues.apache.org/jira/browse/IGNITE-21766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirill Sizov updated IGNITE-21766: ----------------------------------- Attachment: Compute Integration Tests 3626 Mar 13.zip > AssertionError is thrown from jraft > ----------------------------------- > > Key: IGNITE-21766 > URL: https://issues.apache.org/jira/browse/IGNITE-21766 > Project: Ignite > Issue Type: Bug > Reporter: Kirill Sizov > Assignee: Kirill Sizov > Priority: Major > Labels: ignite-3 > Attachments: Compute Integration Tests 3626 Mar 13.zip > > > One of the test runs failed with java.lang.AssertionError: null. > Seems like there is a race in jraft. > One threads tries to take a node while another one concurrently removes it. > > {noformat} > [2024-03-13T04:56:03,164][INFO ][%itcwst_cews_2%Raft-Group-Client-11][Loza] > Stop raft group=8_part_0 > [2024-03-13T04:56:03,165][INFO > ][%itcwst_cews_2%Raft-Group-Client-11][NodeImpl] Node > <8_part_0/itcwst_cews_2> shutdown, currTerm=1 state=STATE_LEADER. > [2024-03-13T04:56:03,165][INFO > ][%itcwst_cews_2%JRaft-FSMCaller-Disruptor_stripe_7-0][StateMachineAdapter] > onLeaderStop: status=Status[ESHUTDOWN<1007>: Raft node is going to quit.]. > [2024-03-13T04:56:03,165][INFO > ][%itcwst_cews_2%Raft-Group-Client-11][Replicator] Replicator Replicator > [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, lastLogIncluded=0, > lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_1, type=Follower] is > going to quit > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_0%JRaft-Request-Processor-15][NodeImpl] Node > <8_part_0/itcwst_cews_0> received TimeoutNowRequest from itcwst_cews_2, > term=1 and starts voting. > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_0%JRaft-Request-Processor-15][NodeImpl] Node > <8_part_0/itcwst_cews_0> start vote and grant vote self, term=1. > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_0%JRaft-FSMCaller-Disruptor_stripe_1-0][StateMachineAdapter] > onStopFollowing: LeaderChangeContext [leaderId=itcwst_cews_2, term=1, > status=Status[ERAFTTIMEDOUT<10001>: A follower's leader_id is reset to NULL > as it begins to request_vote.]]. > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_2%Raft-Group-Client-11][LocalRaftMetaStorage] Save raft meta, > path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_2/8_part_0_itcwst_cews_2_0/meta, > term=1, votedFor=itcwst_cews_2, cost time=1 ms > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_2%Raft-Group-Client-11][FSMCallerImpl] Shutting down > FSMCaller... > [2024-03-13T04:56:03,166][INFO > ][%itcwst_cews_2%JRaft-Response-Processor-6][Replicator] Replicator > Replicator [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, > lastLogIncluded=0, lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_0, > type=Follower] is going to quit > [2024-03-13T04:56:03,170][INFO > ][%itcwst_cews_0%JRaft-Request-Processor-15][LocalRaftMetaStorage] Save raft > meta, > path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_0/8_part_0_itcwst_cews_0_0/meta, > term=2, votedFor=itcwst_cews_0, cost time=3 ms > [2024-03-13T04:56:03,170][INFO > ][%itcwst_cews_1%JRaft-Request-Processor-4][NodeImpl] Node > <8_part_0/itcwst_cews_1> received RequestVoteRequest from itcwst_cews_0, > term=2, currTerm=1. > [2024-03-13T04:56:03,170][INFO > ][%itcwst_cews_1%JRaft-FSMCaller-Disruptor_stripe_0-0][StateMachineAdapter] > onStopFollowing: LeaderChangeContext [leaderId=itcwst_cews_2, term=1, > status=Status[EHIGHERTERMRESPONSE<10008>: Raft node receives higher term > RequestVoteRequest.]]. > [2024-03-13T04:56:03,171][INFO > ][%itcwst_cews_1%JRaft-Request-Processor-4][LocalRaftMetaStorage] Save raft > meta, > path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_1/8_part_0_itcwst_cews_1_0/meta, > term=2, votedFor=, cost time=1 ms > [2024-03-13T04:56:03,172][INFO > ][%itcwst_cews_1%JRaft-Request-Processor-4][LocalRaftMetaStorage] Save raft > meta, > path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_1/8_part_0_itcwst_cews_1_0/meta, > term=2, votedFor=itcwst_cews_0, cost time=0 ms > [2024-03-13T04:56:03,173][INFO > ][%itcwst_cews_0%JRaft-Response-Processor-9][NodeImpl] Node > <8_part_0/itcwst_cews_0> become leader of group, term=2, > conf=itcwst_cews_2,itcwst_cews_1,itcwst_cews_0, oldConf=. > [2024-03-13T04:56:03,173][INFO > ][%itcwst_cews_0%JRaft-Response-Processor-9][Replicator] > Replicator=Replicator [state=Created, statInfo=<running=null, > firstLogIndex=0, lastLogIncluded=0, lastLogIndex=0, lastTermIncluded=0>, > peerId=itcwst_cews_2, type=Follower]@itcwst_cews_2 is started > [2024-03-13T04:56:03,173][INFO > ][%itcwst_cews_0%JRaft-Response-Processor-9][Replicator] > Replicator=Replicator [state=Created, statInfo=<running=null, > firstLogIndex=0, lastLogIncluded=0, lastLogIndex=0, lastTermIncluded=0>, > peerId=itcwst_cews_1, type=Follower]@itcwst_cews_1 is started > [2024-03-13T04:56:03,174][INFO > ][%itcwst_cews_2%Raft-Group-Client-11][RaftGroupService] Stop the > RaftGroupService successfully. > [2024-03-13T04:56:03,175][INFO > ][%itcwst_cews_1%JRaft-FSMCaller-Disruptor_stripe_0-0][StateMachineAdapter] > onStartFollowing: LeaderChangeContext [leaderId=itcwst_cews_0, term=2, > status=Status[ENEWLEADER<10011>: Follower receives message from new leader > with the same term.]]. > [2024-03-13T04:56:01,427][WARN > ][%itcwst_cews_2%JRaft-Response-Processor-3][NodeImpl] Node > <8_part_0/itcwst_cews_2> received invalid PreVoteResponse from itcwst_cews_0, > state not in STATE_FOLLOWER but STATE_CANDIDATE. > [2024-03-13T04:56:01,431][WARN > ][%itcwst_cews_2%JRaft-Response-Processor-5][NodeImpl] Node > <8_part_0/itcwst_cews_2> received invalid RequestVoteResponse from > itcwst_cews_1, state not in STATE_CANDIDATE but STATE_LEADER. > [2024-03-13T04:56:01,455][WARN > ][%itcwst_cews_1%metastorage-watch-executor-3][WatchProcessor] Watch event > processing has been too long [duration=1619, keys=[catalog.update.3, > catalog.version]] > [2024-03-13T04:56:01,456][WARN > ][%itcwst_cews_0%metastorage-watch-executor-0][WatchProcessor] Watch event > processing has been too long [duration=1615, keys=[catalog.update.3, > catalog.version]] > [2024-03-13T04:56:01,457][WARN > ][%itcwst_cews_2%metastorage-watch-executor-3][WatchProcessor] Watch event > processing has been too long [duration=1617, keys=[catalog.update.3, > catalog.version]] > [2024-03-13T04:56:02,979][WARN > ][%itcwst_cews_2%partition-operations-1][TrackableNetworkMessageHandler] > Message handling has been too long [duration=12ms, message=[class > org.apache.ignite.internal.tx.message.TxCleanupMessageImpl]] > [2024-03-13T04:56:03,165][WARN > ][%itcwst_cews_2%JRaft-Common-Executor-2][Replicator] Replicator Replicator > [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, lastLogIncluded=0, > lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_0, type=Follower] > stops sending entries. > [2024-03-13T04:56:03,170][WARN > ][%itcwst_cews_2%JRaft-Request-Processor-1][NodeImpl] Node > <8_part_0/itcwst_cews_2> is not in active state, currTerm=1. > [2024-03-13T04:56:03,171][WARN > ][%itcwst_cews_0%JRaft-Response-Processor-8][NodeImpl] Node > <8_part_0/itcwst_cews_0> RequestVote to itcwst_cews_2 error: > Status[EINVAL<1015>: Node <8_part_0/itcwst_cews_2> is not in active state, > state STATE_SHUTTING.]. > [2024-03-13T04:56:03,176][ERROR][%itcwst_cews_2%MessagingService-inbound-0-0][DefaultMessagingService] > onMessage() failed while processing InvokeRequestImpl [correlationId=268, > message=AppendEntriesRequestImpl [committedIndex=6, > data=org.apache.ignite.raft.jraft.util.ByteString@1, entriesList=null, > groupId=8_part_0, peerId=itcwst_cews_2, prevLogIndex=7, prevLogTerm=1, > serverId=itcwst_cews_0, term=2, timestampLong=112086598495305728]] from > ClusterNodeImpl [id=ac6fb201-b427-4497-a494-9d5a3d8060eb, name=itcwst_cews_0, > address=10.233.123.18:3344, nodeMetadata=null] > java.lang.AssertionError: null > at > org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.getOrCreatePeerRequestContext(AppendEntriesRequestProcessor.java:348) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor$PeerExecutorSelector.select(AppendEntriesRequestProcessor.java:72) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.onReceived(IgniteRpcServer.java:169) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.network.TrackableNetworkMessageHandler.onReceived(TrackableNetworkMessageHandler.java:49) > ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.network.DefaultMessagingService.handleStartingWithFirstHandler(DefaultMessagingService.java:494) > ~[ignite-network-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.network.DefaultMessagingService.lambda$handleMessageFromNetwork$5(DefaultMessagingService.java:401) > ~[ignite-network-3.0.0-SNAPSHOT.jar:?] > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [?:?] > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [?:?] > at java.base/java.lang.Thread.run(Thread.java:834) [?:?] > [2024-03-13T04:56:03,190][WARN > ][%itcwst_cews_2%MessagingService-inbound-0-0][DefaultMessagingService] > Uncaught exception [thread=IgniteThread > [name=%itcwst_cews_2%MessagingService-inbound-0-0]] > java.lang.AssertionError: null > at > org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.getOrCreatePeerRequestContext(AppendEntriesRequestProcessor.java:348) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor$PeerExecutorSelector.select(AppendEntriesRequestProcessor.java:72) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.onReceived(IgniteRpcServer.java:169) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.internal.network.TrackableNetworkMessageHandler.onReceived(TrackableNetworkMessageHandler.java:49) > ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?] > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)