Bharat Viswanadham created HDDS-1031:
----------------------------------------

Summary: Update ratis version to fix a DN restart Bug
Key: HDDS-1031
URL: https://issues.apache.org/jira/browse/HDDS-1031
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Bharat Viswanadham

This is related to RATIS-460.

When a datanode is restarted after Ratis has taken a snapshot, we see the stack trace below, and the DN won't boot up.

For more information, refer to RATIS-460.

{code:java}
java.io.IOException: java.lang.IllegalStateException: lastEntry = 72856=72856: [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858, 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858, add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null, lastEntry.index >= logIndex = 0
    at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
    at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
    at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
    at org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:283)
    at org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:295)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:427)
    at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:149)
    at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.start(DatanodeStateMachine.java:165)
    at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$startDaemon$0(DatanodeStateMachine.java:334)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: lastEntry = 72856=72856: [77969640-aad9-4678-813b-8fb35bd5f568:172.27.37.0:9858, 7c6ae4fe-7db5-4e97-a407-0a9edff70c2c:172.27.35.192:9858, add14303-ecdf-4aed-84b7-abc3152177f6:172.27.37.128:9858], old=null, lastEntry.index >= logIndex = 0
    at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:72)
    at org.apache.ratis.server.impl.ConfigurationManager.addConfiguration(ConfigurationManager.java:54)
    at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:352)
    at org.apache.ratis.server.impl.ServerState.setRaftConf(ServerState.java:347)
    at org.apache.ratis.server.storage.RaftLog.lambda$open$6(RaftLog.java:237)
    at org.apache.ratis.server.storage.LogSegment.lambda$loadSegment$0(LogSegment.java:140)
    at org.apache.ratis.server.storage.LogSegment.readSegmentFile(LogSegment.java:121)
    at org.apache.ratis.server.storage.LogSegment.loadSegment(LogSegment.java:137)
    at org.apache.ratis.server.storage.RaftLogCache.loadSegment(RaftLogCache.java:272)
    at org.apache.ratis.server.storage.SegmentedRaftLog.loadLogSegments(SegmentedRaftLog.java:159)
    at org.apache.ratis.server.storage.SegmentedRaftLog.openImpl(SegmentedRaftLog.java:129)
    at org.apache.ratis.server.storage.RaftLog.open(RaftLog.java:233)
    at org.apache.ratis.server.impl.ServerState.initLog(ServerState.java:191)
    at org.apache.ratis.server.impl.ServerState.<init>(ServerState.java:114)
    at org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:103)
    at org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:207)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
2019-01-29 01:43:41,137 [main] ERROR - Exception in HddsDatanodeService.
java.lang.NullPointerException
    at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.join(DatanodeStateMachine.java:363)
    at org.apache.hadoop.ozone.HddsDatanodeService.join(HddsDatanodeService.java:270)
    at org.apache.hadoop.ozone.HddsDatanodeService.main(HddsDatanodeService.java:127)
{code}