[
https://issues.apache.org/jira/browse/HDDS-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937031#comment-17937031
]
Ivan Andika commented on HDDS-10595:
------------------------------------
[~jyosin] Is this still reproducible?
> OM node is not shown in 'om roles' as follower node
This might be caused by a problem in the OM Ratis group configuration:
perhaps a mistaken decommissioning or a wrong configuration, or perhaps
om363 is still bootstrapping. I'm not entirely sure.
The logs with "IN_PROGRESS" are most likely there because om363 is
downloading the OM DB from the leader. om363 is missing Raft log entries
that the leader has already purged, so the leader cannot catch it up
through the log, and this triggers notifyInstallSnapshotFromLeader. While
the download is running, every AppendEntries request fails with these
messages. So this should be normal unless it lasts longer than expected.
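
One rough way to judge "longer than expected" is to measure how long the
in-progress messages have been appearing in a single node's log. The sketch
below is a hypothetical helper, not part of Ozone or Ratis: the class name,
the method name and the matched strings are assumptions taken from the log
excerpts in this issue. It does not decide what duration counts as too long;
that stays an operator judgment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Ozone or Ratis class.
class SnapshotInstallSpan {
  // Timestamp layout seen at the start of each log record in this issue.
  private static final DateTimeFormatter TS =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");
  // Not anchored to the line start, so mail-quoted lines ("> ...") also match.
  private static final Pattern TS_PATTERN =
      Pattern.compile("(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})");

  /**
   * Returns the time between the first and the last log record that mentions
   * IN_PROGRESS or "installation is in progress", or Duration.ZERO if no
   * such record exists. A record may wrap over several lines, so each line
   * inherits the most recent timestamp seen before it.
   */
  static Duration inProgressSpan(List<String> lines) {
    LocalDateTime current = null;
    LocalDateTime first = null;
    LocalDateTime last = null;
    for (String line : lines) {
      Matcher m = TS_PATTERN.matcher(line);
      if (m.find()) {
        current = LocalDateTime.parse(m.group(1), TS);
      }
      boolean hit = line.contains("IN_PROGRESS")
          || line.contains("installation is in progress");
      if (hit && current != null) {
        if (first == null) {
          first = current;
        }
        last = current;
      }
    }
    return first == null ? Duration.ZERO : Duration.between(first, last);
  }

  public static void main(String[] args) throws IOException {
    // args[0]: path to one OM log file (assumed to use the layout above).
    List<String> lines = Files.readAllLines(Paths.get(args[0]));
    System.out.println("in-progress span: " + inProgressSpan(lines));
  }
}
```

Run it against one node's log at a time: timestamps from the leader and the
follower come from different clocks, so mixing both files only gives an
approximate span.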
> [snapshot-LR] OM node is not shown in 'om roles' as follower node
> -----------------------------------------------------------------
>
> Key: HDDS-10595
> URL: https://issues.apache.org/jira/browse/HDDS-10595
> Project: Apache Ozone
> Issue Type: Bug
> Components: OM, Snapshot
> Reporter: Jyotirmoy Sinha
> Priority: Major
> Labels: ozone-snapshot
>
> OM node is not shown in 'om roles' as follower node
> OM roles - CLI -
> {code:java}
> [root@node~]# ozone admin om roles -id=ozone1709211738
> om233 : FOLLOWER (node1.domain.com)
> om232 : LEADER (node2.domain.com) {code}
> OM log output -
> {code:java}
> 2024-03-19 00:00:01,203 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: Failed appendEntries as snapshot (30450723)
> installation is in progress
> 2024-03-19 00:00:01,203 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: inconsistency entries.
> Reply:om232<-om363#14732148:FAIL-t54,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: receive installSnapshot:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: reply installSnapshot:
> om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastRequest:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastReply: null
> 2024-03-19 00:00:01,203 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: Failed appendEntries as snapshot (30450723)
> installation is in progress
> 2024-03-19 00:00:01,203 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: inconsistency entries.
> Reply:om232<-om363#14732149:FAIL-t54,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: receive installSnapshot:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: reply installSnapshot:
> om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastRequest:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,203 INFO
> [grpc-default-executor-6]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastReply: null
> 2024-03-19 00:00:01,204 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: Failed appendEntries as snapshot (30450723)
> installation is in progress
> 2024-03-19 00:00:01,204 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: inconsistency entries.
> Reply:om232<-om363#14732150:FAIL-t54,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1
> 2024-03-19 00:00:01,204 INFO
> [grpc-default-executor-7]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: receive installSnapshot:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,204 INFO
> [grpc-default-executor-7]-org.apache.ratis.server.impl.SnapshotInstallationHandler:
> om363@group-A826AF593A36: reply installSnapshot:
> om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:00:01,204 INFO
> [grpc-default-executor-7]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastRequest:
> om232->om234#0-t54,notify:(t:52, i:30450723)
> 2024-03-19 00:00:01,204 INFO
> [grpc-default-executor-7]-org.apache.ratis.grpc.server.GrpcServerProtocolService:
> om363: Completed INSTALL_SNAPSHOT, lastReply: null
> 2024-03-19 00:00:01,204 INFO
> [om363-server-thread17]-org.apache.ratis.server.RaftServer$Division:
> om363@group-A826AF593A36: Failed appendEntries as snapshot (30450723)
> installation is in progress {code}
> OM Leader log output pertaining to above OM Follower -
> {code:java}
> 2024-03-19 00:03:11,531 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,531 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,532 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,532 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,533 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,534 INFO
> [grpc-default-executor-26]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,534 INFO
> [grpc-default-executor-26]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS
> 2024-03-19 00:03:11,535 INFO
> [grpc-default-executor-75]-org.apache.ratis.grpc.server.GrpcLogAppender:
> om232@group-A826AF593A36->om234-InstallSnapshotResponseHandler: received a
> reply om232<-om363#0:FAIL-t54,IN_PROGRESS {code}
> From the logs, the snapshot installation is failing in the follower OM.
> This log is flooding the follower OM, and a similar error is present in
> the leader, indicating that the snapshot installation is failing or still
> in progress.
>