Hello all, I have just had a problem with a NameNode restart, and someone on the mailing list kindly suggested that the edits file was corrupted. I have made a backup copy of the file and checked my /namesecondary/previous.checkpoint, but the edits file there is effectively empty: 4 KB with only ????? inside.
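For anyone who wants to check the same thing on their own copy, here is a minimal sketch of how the contents of such a file could be summarised. It is plain Python and does not use any Hadoop API; the function name dominant_byte and the file path passed on the command line are hypothetical, chosen only for illustration.

```python
from collections import Counter
from pathlib import Path
import sys


def dominant_byte(path):
    """Return (size, byte_value, fraction) for the most frequent byte in a file.

    byte_value is None and fraction is 0.0 when the file is empty.
    """
    data = Path(path).read_bytes()
    if not data:
        return 0, None, 0.0
    # Counter over a bytes object counts integer byte values (0..255).
    value, count = Counter(data).most_common(1)[0]
    return len(data), value, count / len(data)


if __name__ == "__main__":
    # Usage (hypothetical path): python check_edits.py /path/to/backup/edits
    for name in sys.argv[1:]:
        size, value, fraction = dominant_byte(name)
        if value is None:
            print(f"{name}: empty file")
        else:
            print(f"{name}: {size} bytes, {fraction:.1%} are 0x{value:02x}")
```

If nearly all of the 4 KB turns out to be one repeated byte value, the file most likely holds no usable edit records.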
This suggests to me that I cannot recover from the secondaryNameNode? How do you fix this problem? Thanks for your help.

Original error log:

STARTUP_MSG: build =https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2012-07-30 16:02:23,649 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=50001
2012-07-30 16:02:23,656 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:50001
2012-07-30 16:02:23,659 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2012-07-30 16:02:23,660 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=false
2012-07-30 16:02:23,721 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
2012-07-30 16:02:23,723 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
2012-07-30 16:02:23,756 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 533
2012-07-30 16:02:23,833 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
2012-07-30 16:02:23,835 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 55400 loaded in 0 seconds.
2012-07-30 16:02:23,844 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: "1343506"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:419)
    at java.lang.Long.parseLong(Long.java:468)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:775)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
2012-07-30 16:02:23,845 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

Mouradk