J.Andreina created HDFS-7952:
--------------------------------

             Summary: On starting Standby with "rollback" option, lastPromisedEpoch gets updated and Active Namenode is shutting down.
                 Key: HDFS-7952
                 URL: https://issues.apache.org/jira/browse/HDFS-7952
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: J.Andreina
            Assignee: J.Andreina
            Priority: Critical

Step 1: Start NN1 as active and NN2 as standby.
Step 2: Run "hdfs dfsadmin -rollingUpgrade prepare".
Step 3: Start NN2 as active and NN1 as standby with the rolling upgrade "started" option.
Step 4: Restart the DN in upgrade mode as well, and write files to HDFS.
Step 5: Stop both Namenodes and the DN.
Step 6: Restart NN2 as active and NN1 as standby with the rolling upgrade "rollback" option.

Issue:
=====
On restarting NN1 as standby with the "rollback" option, lastPromisedEpoch gets updated and the active NN2 shuts down with the following exception.

{noformat}
15/03/18 16:25:56 FATAL namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [XXXXXXXXXXX:8485, YYYYYYYYYYY:8485], stream=QuorumOutputStream starting at txid 22))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/2. 2 exceptions thrown:
XXXXXXXXXXX:8485: IPC's epoch 5 is less than the last promised epoch 6
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:418)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:446)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:341)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
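
For readers unfamiliar with the epoch check behind the exception above, here is a minimal, self-contained sketch of how epoch-based fencing of a writer can behave. It is a simplified model, not the Hadoop source: the class names `EpochFencingSketch` and `JournalSketch`, the starting epoch of 4, and the reduced method signatures are assumptions made for illustration. Only the epoch values 5 and 6 and the wording of the rejection message are taken from the log.

```java
import java.io.IOException;

public class EpochFencingSketch {

    /** Hypothetical stand-in for a journal node; not the real Journal class. */
    static final class JournalSketch {
        private long lastPromisedEpoch;

        JournalSketch(long initialEpoch) {
            this.lastPromisedEpoch = initialEpoch;
        }

        /** A writer claims the journal by proposing a strictly higher epoch. */
        synchronized void newEpoch(long proposedEpoch) throws IOException {
            if (proposedEpoch <= lastPromisedEpoch) {
                throw new IOException("Proposed epoch " + proposedEpoch
                        + " is not greater than the last promised epoch "
                        + lastPromisedEpoch);
            }
            lastPromisedEpoch = proposedEpoch;
        }

        /** Every write request carries the writer's epoch and is checked here. */
        synchronized void checkRequest(long requestEpoch) throws IOException {
            if (requestEpoch < lastPromisedEpoch) {
                throw new IOException("IPC's epoch " + requestEpoch
                        + " is less than the last promised epoch "
                        + lastPromisedEpoch);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        JournalSketch journal = new JournalSketch(4); // assumed starting epoch

        journal.newEpoch(5);     // the active writer obtains epoch 5
        journal.checkRequest(5); // its writes are accepted

        journal.newEpoch(6);     // a second process bumps the promised epoch

        try {
            journal.checkRequest(5); // the active writer's next write is rejected
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, once any process raises the promised epoch to 6, every later request from the epoch-5 writer fails. That matches the symptom reported here, where the active NN2 can no longer flush to a quorum of journals after NN1 is started with the "rollback" option.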