Hi Rahman,

These are a few lines from hadoop fsck / -blocks -files -locations:

/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1 block(s):  OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]

/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1 block(s):  OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]

Biswa may have guessed the replication factor from the fsck summary that I posted earlier. I am posting it again for today's run:

Status: HEALTHY
 Total size:                    58143055251 B
 Total dirs:                    307
 Total files:                   5093
 Total blocks (validated):      3903 (avg. block size 14897016 B)
 Minimally replicated blocks:   3903 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       92 (2.357161 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     3.1401486
 Corrupt blocks:                0
 Missing replicas:              92 (0.75065273 %)
 Number of data-nodes:          9
 Number of racks:               1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds

The filesystem under path '/' is HEALTHY

I have not overridden dfs.datanode.du.reserved; it defaults to 0:

$ less $HADOOP_HOME/conf/hdfs-site.xml | grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml | grep -A3 'dfs.datanode.du.reserved'
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
</description>

Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data and all hadoop/hive logs are dumped in various directories under /mnt/logs. All machines have 400GB for /mnt.

$ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
225G    /mnt/hadoop
224G    /mnt/hadoop/dfs/data
61M     /mnt/logs
281G    /mnt/hadoop
281G    /mnt/hadoop/dfs/data
63M     /mnt/logs
139G    /mnt/hadoop
139G    /mnt/hadoop/dfs/data
68M     /mnt/logs
135G    /mnt/hadoop
134G    /mnt/hadoop/dfs/data
92M     /mnt/logs
165G    /mnt/hadoop
164G    /mnt/hadoop/dfs/data
75M     /mnt/logs
137G    /mnt/hadoop
137G    /mnt/hadoop/dfs/data
95M     /mnt/logs
160G    /mnt/hadoop
160G    /mnt/hadoop/dfs/data
74M     /mnt/logs
180G    /mnt/hadoop
122G    /mnt/hadoop/dfs/data
23M     /mnt/logs
139G    /mnt/hadoop
138G    /mnt/hadoop/dfs/data
76M     /mnt/logs

All these numbers are for today and may differ a bit from yesterday's. Today, hadoop dfs -dus reports 58GB while the namenode reports DFS Used as 1.46TB.
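As a rough cross-check: 58GB of data at the average block replication of ~3.14 should come to only about 180GB of physical block files, yet the data directories above add up to roughly 1.5TB, which matches the DFS Used figure on the namenode. To narrow down where that extra space sits, I can run something like the sketch below (same pattern as the du loop above, just one level deeper into dfs.data.dir; in Hadoop 1.x the finalized block files normally live under current/), plus a quick tally of the repl= field from fsck to answer the replication question for all files at once:

# per-subdirectory usage of each DataNode's storage directory (dfs.data.dir)
$ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop/dfs/data/*'; done

# how many blocks sit at each replication factor
$ hadoop fsck / -files -blocks | grep -o 'repl=[0-9]*' | sort | uniq -c

If the block files under current/ account for far more than ~180GB, that should at least show where the gap between hadoop dfs -dus and DFS Used is coming from.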
Pardon me for cluttering the mail with so many copy-pastes; I hope it is still readable.

--
Saumitra S. Shahapure

On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <ashet...@hortonworks.com> wrote:

> Hi Biswa,
>
> Are you sure that the replication factor of the files is three? Please
> run a 'hadoop fsck / -blocks -files -locations' and see the replication
> factor for each file. Also, post the configuration of
> <name>dfs.datanode.du.reserved</name> and please check the real space
> presented by a DataNode by running 'du -h'.
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <saumitra.offic...@gmail.com> wrote:
>
> Hello,
>
> Biswanath, looks like we have a confusion in the calculation: 1TB would be
> equal to 1024GB, not 114GB.
>
> Sandeep, I checked the log directory sizes as well. The log directories are
> hardly a few GBs; I have configured the log4j properties so that logs won't
> be too large.
>
> On our slave machines, we have a 450GB disk partition for hadoop logs and
> DFS. There, the logs directory is < 10GB and the rest of the space is
> occupied by DFS. A 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>>> Basically I wanted to point out the discrepancy between the name node
>>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>>> as 1TB and the latter reports it to be 35GB. What are the factors that can
>>> cause this difference? And why is just 35GB of data causing DFS to hit its
>>> limits?
>
> I am talking about the name node status page on port 50070. Here is the
> screenshot of my name node status page:
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used' is the space taken by DFS, and non-DFS used is
> the space taken by non-DFS data like logs or other local files from users.
> The namenode shows that DFS used is ~1TB, but hadoop dfs -dus shows it to be
> ~38GB.
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nhsande...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <biswajit.na...@inmobi.com> wrote:
>
>> What's the replication factor you have? I believe it should be 3. hadoop
>> dus shows the disk usage without replication, while the name node UI page
>> gives it with replication.
>>
>> 38gb * 3 = 114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <saumitra.offic...@gmail.com> wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-dfs usage is ~100GB over the cluster, but the numbers are still
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out the discrepancy between the name node
>>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>>> as 1TB and the latter reports it to be 35GB. What are the factors that can
>>> cause this difference? And why is just 35GB of data causing DFS to hit its
>>> limits?
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <biswajit.na...@inmobi.com> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage? It also contributes to
>>> filling up the disk space.
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <saumitra.offic...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on a 9-node hadoop cluster, hadoop version 1.2.1.
>>>> We are using the default HDFS block size.
>>>>
>>>> We have noticed that the disks of the slaves are almost full. From the
>>>> name node's status page (namenode:50070), we could see that the disks of
>>>> live nodes are 90% full and DFS Used in the cluster summary page is ~1TB.
>>>>
>>>> However, hadoop dfs -dus / shows that the file system size is merely
>>>> 38GB. The 38GB number looks correct because we keep only a few Hive tables
>>>> and hadoop's /tmp (distributed cache and job outputs) in HDFS. All other
>>>> data is cleaned up. I cross-checked this with hadoop dfs -ls. Also, I think
>>>> there is no internal fragmentation, because the files in our Hive tables
>>>> are well-chopped into ~50MB chunks. Here are the last few lines of
>>>> hadoop fsck / -files -blocks:
>>>>
>>>> Status: HEALTHY
>>>>  Total size:                    38086441332 B
>>>>  Total dirs:                    232
>>>>  Total files:                   802
>>>>  Total blocks (validated):      796 (avg. block size 47847288 B)
>>>>  Minimally replicated blocks:   796 (100.0 %)
>>>>  Over-replicated blocks:        0 (0.0 %)
>>>>  Under-replicated blocks:       6 (0.75376886 %)
>>>>  Mis-replicated blocks:         0 (0.0 %)
>>>>  Default replication factor:    2
>>>>  Average block replication:     3.0439699
>>>>  Corrupt blocks:                0
>>>>  Missing replicas:              6 (0.24762692 %)
>>>>  Number of data-nodes:          9
>>>>  Number of racks:               1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>> My question is: why are the disks of the slaves getting full even though
>>>> there are only a few files in DFS?
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you have received this communication in error, please notify
>>> us immediately by responding to this email and then delete it from your
>>> system. The firm is neither liable for the proper and complete transmission
>>> of the information contained in this communication nor for any delay in its
>>> receipt.
>
> --
> --Regards
> Sandeep Nemuri
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.