If the DataNode is killed after 'data.dir' is created, but before LAYOUTVERSION is written to the storage file, further restarts of the DataNode throw an EOFException while reading the storage file.
------------------------------------------------------------------------
                 Key: HDFS-1887
                 URL: https://issues.apache.org/jira/browse/HDFS-1887
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node
    Affects Versions: 0.21.0, 0.20.1, 0.23.0
         Environment: Linux
            Reporter: sravankorumilli
            Priority: Minor

Assume the DataNode is killed after 'data.dir' is created, but before LAYOUTVERSION is written to the storage file. On every further restart of the DataNode, an EOFException is thrown while reading the storage file. The DataNode cannot be restarted successfully until 'data.dir' is deleted.

These are the corresponding logs:

2011-05-02 19:12:19,389 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.EOFException
    at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.isConversionNeeded(DataStorage.java:203)
    at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:697)
    at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:62)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:476)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:116)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:260)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:237)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)

Our Hadoop cluster is managed by cluster management software that tries to eliminate manual intervention in setting up and managing the cluster. In the scenario above, however, manual intervention is required to recover the DataNode.
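
The failure mode in the stack trace can be reproduced outside Hadoop. The following is a minimal sketch, not the Hadoop code and not a proposed patch: the class StorageFileCheck and its two methods are hypothetical names introduced only for illustration. It assumes the storage file exists but is empty, and shows that RandomAccessFile.readInt() then throws EOFException, while a length check beforehand can detect the half-initialized file.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, for illustration only; not part of Hadoop.
public class StorageFileCheck {

    // Returns true if the file is long enough to hold a 4-byte int
    // (the size of the value that readInt() expects to find).
    public static boolean hasLayoutVersion(File storageFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(storageFile, "r")) {
            return raf.length() >= Integer.BYTES;
        }
    }

    // Returns true if readInt() on the file fails with EOFException,
    // which is the exception at the top of the reported stack trace.
    public static boolean readIntHitsEof(File storageFile) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(storageFile, "r")) {
            raf.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a storage file that was created but never written to.
        File empty = File.createTempFile("storage", null);
        empty.deleteOnExit();

        System.out.println("readInt hits EOF: " + readIntHitsEof(empty));
        System.out.println("has layout version: " + hasLayoutVersion(empty));
    }
}
```

A check of this kind would let a startup routine distinguish "file created but never written" from a valid storage file; how the DataNode should then recover is a separate design decision that this sketch does not address.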