[ https://issues.apache.org/jira/browse/HDFS-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoyu Yao resolved HDFS-7982.
------------------------------
    Resolution: Duplicate
 Fix Version/s: 2.7.0

> huge non dfs space used
> -----------------------
>
>                 Key: HDFS-7982
>                 URL: https://issues.apache.org/jira/browse/HDFS-7982
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: regis le bretonnic
>             Fix For: 2.7.0
>
>
> Hi...
> I'm trying to load an external textfile table into an internal ORC table using
> Hive. My process failed with the following error:
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
> /tmp/hive/blablabla.... could only be replicated to 0 nodes instead of
> minReplication (=1). There are 3 datanode(s) running and no node(s) are
> excluded in this operation.
> After investigation, I saw that the amount of "non DFS used" space keeps
> growing until the job fails.
> Just before failing, the "non dfs used space" reaches 54.GB on each datanode.
> I still have space in "remaining DFS".
> Here is the dfsadmin report just before the issue:
> [hdfs@hadoop-01 data]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 475193597952 (442.56 GB)
> Present Capacity: 290358095182 (270.42 GB)
> DFS Remaining: 228619903369 (212.92 GB)
> DFS Used: 61738191813 (57.50 GB)
> DFS Used%: 21.26%
> Under replicated blocks: 38
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -------------------------------------------------
> Live datanodes (3):
>
> Name: 192.168.3.36:50010 (hadoop-04.XXXXX.local)
> Hostname: hadoop-04.XXXXX.local
> Decommission Status : Normal
> Configured Capacity: 158397865984 (147.52 GB)
> DFS Used: 20591481196 (19.18 GB)
> Non DFS Used: 61522602976 (57.30 GB)
> DFS Remaining: 76283781812 (71.04 GB)
> DFS Used%: 13.00%
> DFS Remaining%: 48.16%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 182
> Last contact: Tue Mar 24 10:56:05 CET 2015
>
> Name: 192.168.3.35:50010 (hadoop-03.XXXXX.local)
> Hostname: hadoop-03.XXXXX.local
> Decommission Status : Normal
> Configured Capacity: 158397865984 (147.52 GB)
> DFS Used: 20555853589 (19.14 GB)
> Non DFS Used: 61790296136 (57.55 GB)
> DFS Remaining: 76051716259 (70.83 GB)
> DFS Used%: 12.98%
> DFS Remaining%: 48.01%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 184
> Last contact: Tue Mar 24 10:56:05 CET 2015
>
> Name: 192.168.3.37:50010 (hadoop-05.XXXXX.local)
> Hostname: hadoop-05.XXXXX.local
> Decommission Status : Normal
> Configured Capacity: 158397865984 (147.52 GB)
> DFS Used: 20590857028 (19.18 GB)
> Non DFS Used: 61522603658 (57.30 GB)
> DFS Remaining: 76284405298 (71.05 GB)
> DFS Used%: 13.00%
> DFS Remaining%: 48.16%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 182
> Last contact: Tue Mar 24 10:56:05 CET 2015
>
> I was expecting to find temporary space used on my filesystem (i.e. /data).
> I found the DFS usage under /data/hadoop/hdfs/data (19GB) but no trace of
> 57GB for non DFS...
> [root@hadoop-05 hadoop]# df -h /data
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdb1       148G   20G  121G  14% /data
> I also checked dfs.datanode.du.reserved, which is set to zero.
> [root@hadoop-05 hadoop]# hdfs getconf -confkey dfs.datanode.du.reserved
> 0
> Did I miss something? Where is the non DFS space on Linux? Why did I get the
> message "could only be replicated to 0 nodes instead of minReplication (=1).
> There are 3 datanode(s) running and no node(s) are excluded in this
> operation." when the datanodes were up and running and still had DFS space
> remaining?
> This error is blocking us.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
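
A note on the figures in the dfsadmin report quoted above: for each of the three datanodes, "Non DFS Used" equals "Configured Capacity" minus "DFS Used" minus "DFS Remaining", to the byte. That suggests the value is derived from the other three counters rather than measured on disk, which would be consistent with df showing no matching files. The Python sketch below only re-checks that arithmetic. The formula is inferred from the numbers in this report (an assumption, not quoted from the Hadoop source), and the names REPORT and derived_non_dfs_used are illustrative.

```python
# Per-datanode byte counts copied from the dfsadmin report in this message.
REPORT = {
    "hadoop-04": {"capacity": 158397865984, "dfs_used": 20591481196,
                  "remaining": 76283781812, "non_dfs_used": 61522602976},
    "hadoop-03": {"capacity": 158397865984, "dfs_used": 20555853589,
                  "remaining": 76051716259, "non_dfs_used": 61790296136},
    "hadoop-05": {"capacity": 158397865984, "dfs_used": 20590857028,
                  "remaining": 76284405298, "non_dfs_used": 61522603658},
}


def derived_non_dfs_used(capacity, dfs_used, remaining):
    """Assumed relationship: whatever is neither DFS used nor DFS remaining."""
    return max(capacity - dfs_used - remaining, 0)


def check_report(report):
    """Map each host to True when the reported value matches the derived one."""
    return {
        host: derived_non_dfs_used(n["capacity"], n["dfs_used"], n["remaining"])
        == n["non_dfs_used"]
        for host, n in report.items()
    }


if __name__ == "__main__":
    for host, matches in check_report(REPORT).items():
        print(host, "reported value matches derived value:", matches)
```

If that relationship holds, a growing "Non DFS Used" with no corresponding files on disk would point at a shrinking "DFS Remaining" value, so that counter may be the more useful one to investigate.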