Hi Rahman,

These are a few lines from hadoop fsck / -blocks -files -locations:

/mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1 block(s):  OK
0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]

/mnt/hadoop/hive/warehouse/user.db/table1/000256_0 44566965 bytes, 1 block(s):  OK
0. blk_-5768999994812882540_446288 len=44566965 repl=3 [ip1:50010, ip2:50010, ip4:50010]

Biswa may have guessed the replication factor from the fsck summary that I posted earlier. I am posting it again for today's run:

Status: HEALTHY
 Total size:                    58143055251 B
 Total dirs:                    307
 Total files:                   5093
 Total blocks (validated):      3903 (avg. block size 14897016 B)
 Minimally replicated blocks:   3903 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       92 (2.357161 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     3.1401486
 Corrupt blocks:                0
 Missing replicas:              92 (0.75065273 %)
 Number of data-nodes:          9
 Number of racks:               1
FSCK ended at Tue Apr 15 13:20:25 UTC 2014 in 655 milliseconds

The filesystem under path '/' is HEALTHY

I have not overridden dfs.datanode.du.reserved; it defaults to 0:

$ less $HADOOP_HOME/conf/hdfs-site.xml | grep -A3 'dfs.datanode.du.reserved'
$ less $HADOOP_HOME/src/hdfs/hdfs-default.xml | grep -A3 'dfs.datanode.du.reserved'
<name>dfs.datanode.du.reserved</name>
<value>0</value>
<description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
</description>

Below is du -h on every node. FYI, my dfs.data.dir is /mnt/hadoop/dfs/data and all hadoop/hive logs are dumped in various directories under /mnt/logs. All machines have 400GB for /mnt.

$ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop; du -sh /mnt/hadoop/dfs/data; du -sh /mnt/logs;'; done
225G    /mnt/hadoop
224G    /mnt/hadoop/dfs/data
61M     /mnt/logs
281G    /mnt/hadoop
281G    /mnt/hadoop/dfs/data
63M     /mnt/logs
139G    /mnt/hadoop
139G    /mnt/hadoop/dfs/data
68M     /mnt/logs
135G    /mnt/hadoop
134G    /mnt/hadoop/dfs/data
92M     /mnt/logs
165G    /mnt/hadoop
164G    /mnt/hadoop/dfs/data
75M     /mnt/logs
137G    /mnt/hadoop
137G    /mnt/hadoop/dfs/data
95M     /mnt/logs
160G    /mnt/hadoop
160G    /mnt/hadoop/dfs/data
74M     /mnt/logs
180G    /mnt/hadoop
122G    /mnt/hadoop/dfs/data
23M     /mnt/logs
139G    /mnt/hadoop
138G    /mnt/hadoop/dfs/data
76M     /mnt/logs

All these numbers are for today and may differ a bit from yesterday's. Today, hadoop dfs -dus reports 58GB while the namenode reports DFS Used as 1.46TB.
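As a rough cross-check: 58GB of data at the average block replication of ~3.14 should come to only about 180GB of physical block files, yet the data directories above add up to roughly 1.5TB, which matches the DFS Used figure on the namenode. To narrow down where that extra space sits, I can run something like the sketch below (same pattern as the du loop above, just one level deeper into dfs.data.dir; in Hadoop 1.x the finalized block files normally live under current/), plus a quick tally of the repl= field from fsck to answer the replication question for all files at once:

# per-subdirectory usage of each DataNode's storage directory (dfs.data.dir)
$ for i in `echo $dfs_slaves`; do ssh $i 'du -sh /mnt/hadoop/dfs/data/*'; done

# how many blocks sit at each replication factor
$ hadoop fsck / -files -blocks | grep -o 'repl=[0-9]*' | sort | uniq -c

If the block files under current/ account for far more than ~180GB, that should at least show where the gap between hadoop dfs -dus and DFS Used is coming from.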
Pardon me for cluttering the mail with so many copy-pastes; I hope it is still readable.

--
Saumitra S. Shahapure

On Tue, Apr 15, 2014 at 2:57 AM, Abdelrahman Shettia <ashet...@hortonworks.com> wrote:

> Hi Biswa,
>
> Are you sure that the replication factor of the files is three? Please
> run a 'hadoop fsck / -blocks -files -locations' and see the replication
> factor for each file. Also, post the configuration of
> <name>dfs.datanode.du.reserved</name> and please check the real space
> presented by a DataNode by running 'du -h'.
>
> Thanks,
> Rahman
>
> On Apr 14, 2014, at 2:07 PM, Saumitra <saumitra.offic...@gmail.com> wrote:
>
> Hello,
>
> Biswanath, looks like we have a confusion in the calculation: 1TB would be
> equal to 1024GB, not 114GB.
>
> Sandeep, I checked the log directory sizes as well. The log directories are
> hardly a few GBs; I have configured the log4j properties so that logs won't
> be too large.
>
> On our slave machines, we have a 450GB disk partition for hadoop logs and
> DFS. There, the logs directory is < 10GB and the rest of the space is
> occupied by DFS. A 10GB partition is for /.
>
> Let me quote my confusion point once again:
>
>>> Basically I wanted to point out the discrepancy between the name node
>>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>>> as 1TB and the latter reports it to be 35GB. What are the factors that can
>>> cause this difference? And why is just 35GB of data causing DFS to hit its
>>> limits?
>
> I am talking about the name node status page on port 50070. Here is the
> screenshot of my name node status page:
>
> <Screen Shot 2014-04-15 at 2.07.19 am.png>
>
> As I understand, 'DFS used' is the space taken by DFS, and non-DFS used is
> the space taken by non-DFS data like logs or other local files from users.
> The namenode shows that DFS used is ~1TB, but hadoop dfs -dus shows it to be
> ~38GB.
>
> On 14-Apr-2014, at 12:33 pm, Sandeep Nemuri <nhsande...@gmail.com> wrote:
>
> Please check your logs directory usage.
>
> On Mon, Apr 14, 2014 at 12:08 PM, Biswajit Nayak <biswajit.na...@inmobi.com> wrote:
>
>> What's the replication factor you have? I believe it should be 3. hadoop
>> dus shows the disk usage without replication, while the name node UI page
>> gives it with replication.
>>
>> 38gb * 3 = 114gb ~ 1TB
>>
>> ~Biswa
>> -----oThe important thing is not to stop questioning o-----
>>
>> On Mon, Apr 14, 2014 at 9:38 AM, Saumitra <saumitra.offic...@gmail.com> wrote:
>>
>>> Hi Biswajeet,
>>>
>>> Non-dfs usage is ~100GB over the cluster, but the numbers are still
>>> nowhere near 1TB.
>>>
>>> Basically I wanted to point out the discrepancy between the name node
>>> status page and hadoop dfs -dus. In my case, the former reports DFS usage
>>> as 1TB and the latter reports it to be 35GB. What are the factors that can
>>> cause this difference? And why is just 35GB of data causing DFS to hit its
>>> limits?
>>>
>>> On 14-Apr-2014, at 8:31 am, Biswajit Nayak <biswajit.na...@inmobi.com> wrote:
>>>
>>> Hi Saumitra,
>>>
>>> Could you please check the non-dfs usage? It also contributes to
>>> filling up the disk space.
>>>
>>> ~Biswa
>>> -----oThe important thing is not to stop questioning o-----
>>>
>>> On Mon, Apr 14, 2014 at 1:24 AM, Saumitra <saumitra.offic...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are running HDFS on a 9-node hadoop cluster, hadoop version 1.2.1.
>>>> We are using the default HDFS block size.
>>>>
>>>> We have noticed that the disks of the slaves are almost full. From the
>>>> name node's status page (namenode:50070), we could see that the disks of
>>>> live nodes are 90% full and DFS Used in the cluster summary page is ~1TB.
>>>>
>>>> However, hadoop dfs -dus / shows that the file system size is merely
>>>> 38GB. The 38GB number looks correct because we keep only a few Hive tables
>>>> and hadoop's /tmp (distributed cache and job outputs) in HDFS. All other
>>>> data is cleaned up. I cross-checked this with hadoop dfs -ls. Also, I think
>>>> there is no internal fragmentation, because the files in our Hive tables
>>>> are well-chopped into ~50MB chunks. Here are the last few lines of
>>>> hadoop fsck / -files -blocks:
>>>>
>>>> Status: HEALTHY
>>>>  Total size:                    38086441332 B
>>>>  Total dirs:                    232
>>>>  Total files:                   802
>>>>  Total blocks (validated):      796 (avg. block size 47847288 B)
>>>>  Minimally replicated blocks:   796 (100.0 %)
>>>>  Over-replicated blocks:        0 (0.0 %)
>>>>  Under-replicated blocks:       6 (0.75376886 %)
>>>>  Mis-replicated blocks:         0 (0.0 %)
>>>>  Default replication factor:    2
>>>>  Average block replication:     3.0439699
>>>>  Corrupt blocks:                0
>>>>  Missing replicas:              6 (0.24762692 %)
>>>>  Number of data-nodes:          9
>>>>  Number of racks:               1
>>>> FSCK ended at Sun Apr 13 19:49:23 UTC 2014 in 135 milliseconds
>>>>
>>>> My question is: why are the disks of the slaves getting full even though
>>>> there are only a few files in DFS?
>>>
>>> _____________________________________________________________
>>> The information contained in this communication is intended solely for
>>> the use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you have received this communication in error, please notify
>>> us immediately by responding to this email and then delete it from your
>>> system. The firm is neither liable for the proper and complete transmission
>>> of the information contained in this communication nor for any delay in its
>>> receipt.
>
> --
> --Regards
> Sandeep Nemuri
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.