[ 
https://issues.apache.org/jira/browse/IGNITE-21766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

 Kirill Sizov updated IGNITE-21766:
-----------------------------------
    Attachment: Compute Integration Tests 3626 Mar 13.zip

> AssertionError is thrown from jraft
> -----------------------------------
>
>                 Key: IGNITE-21766
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21766
>             Project: Ignite
>          Issue Type: Bug
>            Reporter:  Kirill Sizov
>            Assignee:  Kirill Sizov
>            Priority: Major
>              Labels: ignite-3
>         Attachments: Compute Integration Tests 3626 Mar 13.zip
>
>
> One of the test runs failed with java.lang.AssertionError: null.
> Seems like there is a race in jraft.
> One threads tries to take a node while another one concurrently removes it.
>  
> {noformat}
> [2024-03-13T04:56:03,164][INFO ][%itcwst_cews_2%Raft-Group-Client-11][Loza] 
> Stop raft group=8_part_0
> [2024-03-13T04:56:03,165][INFO 
> ][%itcwst_cews_2%Raft-Group-Client-11][NodeImpl] Node 
> <8_part_0/itcwst_cews_2> shutdown, currTerm=1 state=STATE_LEADER.
> [2024-03-13T04:56:03,165][INFO 
> ][%itcwst_cews_2%JRaft-FSMCaller-Disruptor_stripe_7-0][StateMachineAdapter] 
> onLeaderStop: status=Status[ESHUTDOWN<1007>: Raft node is going to quit.].
> [2024-03-13T04:56:03,165][INFO 
> ][%itcwst_cews_2%Raft-Group-Client-11][Replicator] Replicator Replicator 
> [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, lastLogIncluded=0, 
> lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_1, type=Follower] is 
> going to quit
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_0%JRaft-Request-Processor-15][NodeImpl] Node 
> <8_part_0/itcwst_cews_0> received TimeoutNowRequest from itcwst_cews_2, 
> term=1 and starts voting.
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_0%JRaft-Request-Processor-15][NodeImpl] Node 
> <8_part_0/itcwst_cews_0> start vote and grant vote self, term=1.
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_0%JRaft-FSMCaller-Disruptor_stripe_1-0][StateMachineAdapter] 
> onStopFollowing: LeaderChangeContext [leaderId=itcwst_cews_2, term=1, 
> status=Status[ERAFTTIMEDOUT<10001>: A follower's leader_id is reset to NULL 
> as it begins to request_vote.]].
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_2%Raft-Group-Client-11][LocalRaftMetaStorage] Save raft meta, 
> path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_2/8_part_0_itcwst_cews_2_0/meta,
>  term=1, votedFor=itcwst_cews_2, cost time=1 ms
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_2%Raft-Group-Client-11][FSMCallerImpl] Shutting down 
> FSMCaller...
> [2024-03-13T04:56:03,166][INFO 
> ][%itcwst_cews_2%JRaft-Response-Processor-6][Replicator] Replicator 
> Replicator [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, 
> lastLogIncluded=0, lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_0, 
> type=Follower] is going to quit
> [2024-03-13T04:56:03,170][INFO 
> ][%itcwst_cews_0%JRaft-Request-Processor-15][LocalRaftMetaStorage] Save raft 
> meta, 
> path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_0/8_part_0_itcwst_cews_0_0/meta,
>  term=2, votedFor=itcwst_cews_0, cost time=3 ms
> [2024-03-13T04:56:03,170][INFO 
> ][%itcwst_cews_1%JRaft-Request-Processor-4][NodeImpl] Node 
> <8_part_0/itcwst_cews_1> received RequestVoteRequest from itcwst_cews_0, 
> term=2, currTerm=1.
> [2024-03-13T04:56:03,170][INFO 
> ][%itcwst_cews_1%JRaft-FSMCaller-Disruptor_stripe_0-0][StateMachineAdapter] 
> onStopFollowing: LeaderChangeContext [leaderId=itcwst_cews_2, term=1, 
> status=Status[EHIGHERTERMRESPONSE<10008>: Raft node receives higher term 
> RequestVoteRequest.]].
> [2024-03-13T04:56:03,171][INFO 
> ][%itcwst_cews_1%JRaft-Request-Processor-4][LocalRaftMetaStorage] Save raft 
> meta, 
> path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_1/8_part_0_itcwst_cews_1_0/meta,
>  term=2, votedFor=, cost time=1 ms
> [2024-03-13T04:56:03,172][INFO 
> ][%itcwst_cews_1%JRaft-Request-Processor-4][LocalRaftMetaStorage] Save raft 
> meta, 
> path=build/work/ItThinClientWorkerShutdownTest/colocatedExecutionWorkerShutdown_34532261536989015/itcwst_cews_1/8_part_0_itcwst_cews_1_0/meta,
>  term=2, votedFor=itcwst_cews_0, cost time=0 ms
> [2024-03-13T04:56:03,173][INFO 
> ][%itcwst_cews_0%JRaft-Response-Processor-9][NodeImpl] Node 
> <8_part_0/itcwst_cews_0> become leader of group, term=2, 
> conf=itcwst_cews_2,itcwst_cews_1,itcwst_cews_0, oldConf=.
> [2024-03-13T04:56:03,173][INFO 
> ][%itcwst_cews_0%JRaft-Response-Processor-9][Replicator] 
> Replicator=Replicator [state=Created, statInfo=<running=null, 
> firstLogIndex=0, lastLogIncluded=0, lastLogIndex=0, lastTermIncluded=0>, 
> peerId=itcwst_cews_2, type=Follower]@itcwst_cews_2 is started
> [2024-03-13T04:56:03,173][INFO 
> ][%itcwst_cews_0%JRaft-Response-Processor-9][Replicator] 
> Replicator=Replicator [state=Created, statInfo=<running=null, 
> firstLogIndex=0, lastLogIncluded=0, lastLogIndex=0, lastTermIncluded=0>, 
> peerId=itcwst_cews_1, type=Follower]@itcwst_cews_1 is started
> [2024-03-13T04:56:03,174][INFO 
> ][%itcwst_cews_2%Raft-Group-Client-11][RaftGroupService] Stop the 
> RaftGroupService successfully.
> [2024-03-13T04:56:03,175][INFO 
> ][%itcwst_cews_1%JRaft-FSMCaller-Disruptor_stripe_0-0][StateMachineAdapter] 
> onStartFollowing: LeaderChangeContext [leaderId=itcwst_cews_0, term=2, 
> status=Status[ENEWLEADER<10011>: Follower receives message from new leader 
> with the same term.]].
> [2024-03-13T04:56:01,427][WARN 
> ][%itcwst_cews_2%JRaft-Response-Processor-3][NodeImpl] Node 
> <8_part_0/itcwst_cews_2> received invalid PreVoteResponse from itcwst_cews_0, 
> state not in STATE_FOLLOWER but STATE_CANDIDATE.
> [2024-03-13T04:56:01,431][WARN 
> ][%itcwst_cews_2%JRaft-Response-Processor-5][NodeImpl] Node 
> <8_part_0/itcwst_cews_2> received invalid RequestVoteResponse from 
> itcwst_cews_1, state not in STATE_CANDIDATE but STATE_LEADER.
> [2024-03-13T04:56:01,455][WARN 
> ][%itcwst_cews_1%metastorage-watch-executor-3][WatchProcessor] Watch event 
> processing has been too long [duration=1619, keys=[catalog.update.3, 
> catalog.version]]
> [2024-03-13T04:56:01,456][WARN 
> ][%itcwst_cews_0%metastorage-watch-executor-0][WatchProcessor] Watch event 
> processing has been too long [duration=1615, keys=[catalog.update.3, 
> catalog.version]]
> [2024-03-13T04:56:01,457][WARN 
> ][%itcwst_cews_2%metastorage-watch-executor-3][WatchProcessor] Watch event 
> processing has been too long [duration=1617, keys=[catalog.update.3, 
> catalog.version]]
> [2024-03-13T04:56:02,979][WARN 
> ][%itcwst_cews_2%partition-operations-1][TrackableNetworkMessageHandler] 
> Message handling has been too long [duration=12ms, message=[class 
> org.apache.ignite.internal.tx.message.TxCleanupMessageImpl]]
> [2024-03-13T04:56:03,165][WARN 
> ][%itcwst_cews_2%JRaft-Common-Executor-2][Replicator] Replicator Replicator 
> [state=Replicate, statInfo=<running=IDLE, firstLogIndex=7, lastLogIncluded=0, 
> lastLogIndex=7, lastTermIncluded=0>, peerId=itcwst_cews_0, type=Follower] 
> stops sending entries.
> [2024-03-13T04:56:03,170][WARN 
> ][%itcwst_cews_2%JRaft-Request-Processor-1][NodeImpl] Node 
> <8_part_0/itcwst_cews_2> is not in active state, currTerm=1.
> [2024-03-13T04:56:03,171][WARN 
> ][%itcwst_cews_0%JRaft-Response-Processor-8][NodeImpl] Node 
> <8_part_0/itcwst_cews_0> RequestVote to itcwst_cews_2 error: 
> Status[EINVAL<1015>: Node <8_part_0/itcwst_cews_2> is not in active state, 
> state STATE_SHUTTING.].
> [2024-03-13T04:56:03,176][ERROR][%itcwst_cews_2%MessagingService-inbound-0-0][DefaultMessagingService]
>  onMessage() failed while processing InvokeRequestImpl [correlationId=268, 
> message=AppendEntriesRequestImpl [committedIndex=6, 
> data=org.apache.ignite.raft.jraft.util.ByteString@1, entriesList=null, 
> groupId=8_part_0, peerId=itcwst_cews_2, prevLogIndex=7, prevLogTerm=1, 
> serverId=itcwst_cews_0, term=2, timestampLong=112086598495305728]] from 
> ClusterNodeImpl [id=ac6fb201-b427-4497-a494-9d5a3d8060eb, name=itcwst_cews_0, 
> address=10.233.123.18:3344, nodeMetadata=null]
> java.lang.AssertionError: null
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.getOrCreatePeerRequestContext(AppendEntriesRequestProcessor.java:348)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor$PeerExecutorSelector.select(AppendEntriesRequestProcessor.java:72)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.onReceived(IgniteRpcServer.java:169)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.network.TrackableNetworkMessageHandler.onReceived(TrackableNetworkMessageHandler.java:49)
>  ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.network.DefaultMessagingService.handleStartingWithFirstHandler(DefaultMessagingService.java:494)
>  ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.network.DefaultMessagingService.lambda$handleMessageFromNetwork$5(DefaultMessagingService.java:401)
>  ~[ignite-network-3.0.0-SNAPSHOT.jar:?]
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>  [?:?]
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>  [?:?]
>   at java.base/java.lang.Thread.run(Thread.java:834) [?:?]
> [2024-03-13T04:56:03,190][WARN 
> ][%itcwst_cews_2%MessagingService-inbound-0-0][DefaultMessagingService] 
> Uncaught exception [thread=IgniteThread 
> [name=%itcwst_cews_2%MessagingService-inbound-0-0]]
> java.lang.AssertionError: null
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor.getOrCreatePeerRequestContext(AppendEntriesRequestProcessor.java:348)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.core.AppendEntriesRequestProcessor$PeerExecutorSelector.select(AppendEntriesRequestProcessor.java:72)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.raft.jraft.rpc.impl.IgniteRpcServer$RpcMessageHandler.onReceived(IgniteRpcServer.java:169)
>  ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
>   at 
> org.apache.ignite.internal.network.TrackableNetworkMessageHandler.onReceived(TrackableNetworkMessageHandler.java:49)
>  ~[ignite-network-api-3.0.0-SNAPSHOT.jar:?]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to