ludun created HDFS-15774:
----------------------------

             Summary: DataNode DiskChecker treats every IOException as a failed check
                 Key: HDFS-15774
                 URL: https://issues.apache.org/jira/browse/HDFS-15774
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 3.1.4
            Reporter: ludun
            Assignee: ludun


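The DataNode DiskChecker treats every IOException as a failed disk check. But
some IOExceptions, such as "Too many open files" or "can not create new thread",
say nothing about the health of the volume. In the log below, a volume is
removed as failed even though the underlying cause is "Too many open files".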

{code:java}
2021-01-11 19:17:10,751 | WARN  | Thread-121065 | Removing failed volume /srv/BigData/hadoop/data17/dn/current:  | FsVolumeList.java:247
org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /srv/BigData/hadoop/data17/dn/current/BP-197188276-xxxxxxxx-1525514126952/current/finalized
        at org.apache.hadoop.util.DiskChecker.checkAccessByRWFile(DiskChecker.java:235)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.checkDirs(BlockPoolSlice.java:346)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkDirs(FsVolumeImpl.java:938)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:245)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:2234)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3537)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.access$900(DataNode.java:254)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$8.run(DataNode.java:3571)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Too many open files
        at java.io.UnixFileSystem.createFileExclusively(Native Method)
        at java.io.File.createNewFile(File.java:1012)
        at org.apache.hadoop.util.DiskChecker.checkAccessByRWFile(DiskChecker.java:232)
        ... 8 more
{code}

We should classify IOExceptions more precisely, so that errors unrelated to
volume health do not cause the volume to be marked as failed.
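
One possible way to draw that distinction is sketched below. It is only an
illustration, not the Hadoop implementation and not a proposed patch: the
class IoFailureClassifier, its Kind enum, and the list of message fragments
are all hypothetical names introduced here. The sketch walks the cause chain
of an exception and treats messages that point at process-level resource
exhaustion as unrelated to volume health. Matching on message text is fragile
(wording differs across platforms and JVMs), so this should be read as a
starting point for discussion only.

```java
import java.io.IOException;
import java.util.Locale;

/** Hypothetical helper for illustration; this class does not exist in Hadoop. */
final class IoFailureClassifier {

    /** Coarse classification of a failed disk check. */
    enum Kind { RESOURCE_EXHAUSTION, POSSIBLE_DISK_FAILURE }

    // Assumed message fragments. The first two come from this report;
    // the third is the wording the JVM commonly uses for thread exhaustion.
    private static final String[] RESOURCE_HINTS = {
        "too many open files",
        "can not create new thread",
        "unable to create new native thread",
    };

    // Upper bound on how far the cause chain is followed.
    private static final int MAX_DEPTH = 16;

    private IoFailureClassifier() { }

    /** Walks the cause chain and looks for signs of resource exhaustion. */
    static Kind classify(Throwable error) {
        Throwable t = error;
        for (int depth = 0; t != null && depth < MAX_DEPTH; depth++) {
            String msg = t.getMessage();
            if (msg != null) {
                String lower = msg.toLowerCase(Locale.ROOT);
                for (String hint : RESOURCE_HINTS) {
                    if (lower.contains(hint)) {
                        return Kind.RESOURCE_EXHAUSTION;
                    }
                }
            }
            t = t.getCause();
        }
        // Anything unrecognized is still treated as a potential disk problem.
        return Kind.POSSIBLE_DISK_FAILURE;
    }

    public static void main(String[] args) {
        // Same shape as the exception in the log: a wrapper with an IOException cause.
        IOException wrapped = new IOException(
            "Directory is not writable: /some/volume",
            new IOException("Too many open files"));
        System.out.println(classify(wrapped));

        IOException diskError = new IOException("Input/output error");
        System.out.println(classify(diskError));
    }
}
```

With a classification like this, the caller could skip volume removal (and
perhaps retry the check later) when the result is RESOURCE_EXHAUSTION, and
keep the current behavior otherwise.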



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
