MaoYuan Xian created HDFS-5745:
----------------------------------

             Summary: Unnecessary disk check triggered when socket operation has problem.
                 Key: HDFS-5745
                 URL: https://issues.apache.org/jira/browse/HDFS-5745
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode
    Affects Versions: 1.2.1
            Reporter: MaoYuan Xian

When a BlockReceiver data transfer fails, SocketOutputStream reports the failure as an IOException with the message "The stream is closed":

2014-01-06 11:48:04,716 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in BlockReceiver.run():
java.io.IOException: The stream is closed
        at org.apache.hadoop.net.SocketOutputStream.write
        at java.io.BufferedOutputStream.flushBuffer
        at java.io.BufferedOutputStream.flush
        at java.io.DataOutputStream.flush
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run
        at java.lang.Thread.run

This causes DataNode's checkDiskError method to be called, which triggers a disk scan even though the failure is on the network side.

Can we modify checkDiskError as shown below to avoid this unnecessary disk scan?

{code}
--- a/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
+++ b/src/hdfs/org/apache/hadoop/hdfs/server/datanode/DataNode.java
@@ -938,7 +938,8 @@ public class DataNode extends Configured
         || e.getMessage().startsWith("An established connection was aborted")
         || e.getMessage().startsWith("Broken pipe")
         || e.getMessage().startsWith("Connection reset")
-        || e.getMessage().contains("java.nio.channels.SocketChannel")) {
+        || e.getMessage().contains("java.nio.channels.SocketChannel")
+        || e.getMessage().startsWith("The stream is closed")) {
       LOG.info("Not checking disk as checkDiskError was called on a network" +
         " related exception");
       return;
{code}
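
For illustration only, the message check in the patch could be isolated in a small helper along the following lines. This is a minimal sketch, not code from Hadoop: the class name NetworkErrorFilter and the method isNetworkRelated are hypothetical, and treating a null message as "not network related" is an assumption. The message strings are the ones that appear in the diff above.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of Hadoop. */
public class NetworkErrorFilter {

    // Message prefixes taken from the diff above, including the proposed one.
    private static final String[] NETWORK_PREFIXES = {
        "An established connection was aborted",
        "Broken pipe",
        "Connection reset",
        "The stream is closed"
    };

    // Substring that the existing check looks for anywhere in the message.
    private static final String SOCKET_CHANNEL_MARKER =
        "java.nio.channels.SocketChannel";

    /** Returns true if the exception message looks network related. */
    public static boolean isNetworkRelated(Exception e) {
        String msg = e.getMessage();
        if (msg == null) {
            // Assumption: without a message we cannot classify the error,
            // so the caller would fall through to the disk check.
            return false;
        }
        for (String prefix : NETWORK_PREFIXES) {
            if (msg.startsWith(prefix)) {
                return true;
            }
        }
        return msg.contains(SOCKET_CHANNEL_MARKER);
    }

    public static void main(String[] args) {
        // The message from the stack trace in this report: skip the disk scan.
        System.out.println(isNetworkRelated(
            new IOException("The stream is closed")));   // prints "true"
        // An unrelated I/O failure: still run the disk check.
        System.out.println(isNetworkRelated(
            new IOException("Input/output error")));     // prints "false"
        // No message at all: treated as not network related (assumption).
        System.out.println(isNetworkRelated(
            new IOException()));                         // prints "false"
    }
}
```

Keeping the prefixes in one array would make later additions a one-line change. Matching on message text remains fragile, since the wording may differ between platforms and JVM versions, so a check like this is best treated as a heuristic.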