Alex, Output of the $ bin/hadoop fsck / command after running an HBase data insert command on a table is:
..... ..... ..... ..... .....
/hbase/test/903188508/tags/info/4897652949308499876: Under replicated blk_-5193695109439554521_3133. Target Replicas is 3 but found 1 replica(s).
/hbase/test/903188508/tags/mapfiles/4897652949308499876/data: Under replicated blk_-1213602857020415242_3132. Target Replicas is 3 but found 1 replica(s).
/hbase/test/903188508/tags/mapfiles/4897652949308499876/index: Under replicated blk_3934493034551838567_3132. Target Replicas is 3 but found 1 replica(s).
/user/HadoopAdmin/hbase table.doc: Under replicated blk_4339521803948458144_1031. Target Replicas is 3 but found 2 replica(s).
/user/HadoopAdmin/input/bin.doc: Under replicated blk_-3661765932004150973_1030. Target Replicas is 3 but found 2 replica(s).
/user/HadoopAdmin/input/file01.txt: Under replicated blk_2744169131466786624_1001. Target Replicas is 3 but found 2 replica(s).
/user/HadoopAdmin/input/file02.txt: Under replicated blk_2021956984317789924_1002. Target Replicas is 3 but found 2 replica(s).
/user/HadoopAdmin/input/test.txt: Under replicated blk_-3062256167060082648_1004. Target Replicas is 3 but found 2 replica(s).
/user/HadoopAdmin/output/part-00000: Under replicated blk_8908973033976428484_1010. Target Replicas is 3 but found 2 replica(s).

Status: HEALTHY
 Total size: 48510226 B
 Total dirs: 492
 Total files: 439 (Files currently being written: 2)
 Total blocks (validated): 401 (avg. block size 120973 B)
 (Total open file blocks (not validated): 2)
 Minimally replicated blocks: 401 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 399 (99.50124 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 2
 Average block replication: 1.3117207
 Corrupt blocks: 0
 Missing replicas: 675 (128.327 %)
 Number of data-nodes: 2
 Number of racks: 1

The filesystem under path '/' is HEALTHY

Please tell what is wrong.
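For readers landing on this thread later, a minimal sketch of the usual remedy. HDFS records the replication factor per file at write time, so files created while dfs.replication was 3 keep "Target Replicas is 3" even after the setting is lowered to 2; the fix is to rewrite the factor on the existing files. The sketch assumes `hadoop` is on the PATH (the guard is only so it is harmless on machines without Hadoop installed), and the target of 2 matches the "Default replication factor: 2" in the fsck summary above:

```shell
# Re-check replication, then force files already in HDFS down to factor 2.
# dfs.replication in hadoop-site.xml only affects files written afterwards.
if command -v hadoop >/dev/null 2>&1; then
  # Per-file block report, including which datanodes hold each replica
  hadoop fsck / -files -blocks -locations
  # Recursively set replication to 2 on everything under /, waiting
  # (-w) until the namenode reports the new factor is satisfied
  hadoop dfs -setrep -R -w 2 /
  ran=yes
else
  echo "hadoop not found on PATH; commands shown for illustration only"
  ran=no
fi
```

With only 2 datanodes in the cluster, a target of 3 can never be met, so the "found 2 replica(s)" lines above are expected until either the factor is lowered or a third node is added.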
Aseem

-----Original Message-----
From: Alex Loddengaard [mailto:[email protected]]
Sent: Friday, April 10, 2009 11:04 PM
To: [email protected]
Subject: Re: More Replication on dfs

Aseem,

How are you verifying that blocks are not being replicated? Have you run fsck?

*bin/hadoop fsck /*

I'd be surprised if replication really wasn't happening. Can you run fsck and pay attention to "Under-replicated blocks" and "Mis-replicated blocks"? In fact, can you just copy-paste the output of fsck?

Alex

On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem <[email protected]> wrote:

> Hi
>         I also tried the command $ bin/hadoop balancer, but I still have
> the same problem.
>
> Aseem
>
> -----Original Message-----
> From: Puri, Aseem [mailto:[email protected]]
> Sent: Friday, April 10, 2009 11:18 AM
> To: [email protected]
> Subject: RE: More Replication on dfs
>
> Hi Alex,
>
>         Thanks for sharing your knowledge. So far I have three machines,
> and I want to check the behavior of Hadoop with a replication factor
> of 2. I started my Hadoop server with replication factor 3. After that
> I uploaded 3 files to run a word count program. But as all my files are
> stored on one machine and replicated to the other datanodes as well, my
> MapReduce program takes input from one datanode only. I want my files
> to be on different datanodes so I can check the functionality of
> MapReduce properly.
>
>         Also, before starting my Hadoop server again with replication
> factor 2, I formatted all datanodes and deleted all old data manually.
>
> Please suggest what I should do now.
>
> Regards,
> Aseem Puri
>
> -----Original Message-----
> From: Mithila Nagendra [mailto:[email protected]]
> Sent: Friday, April 10, 2009 10:56 AM
> To: [email protected]
> Subject: Re: More Replication on dfs
>
> To add to the question: how does one decide the optimal replication
> factor for a cluster? For instance, what would be the appropriate
> replication factor for a cluster consisting of 5 nodes?
> Mithila
>
> On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard <[email protected]> wrote:
>
> > Did you load any files when replication was set to 3? If so, you'll
> > have to rebalance:
> >
> > <http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer>
> > <http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer>
> >
> > Note that most people run HDFS with a replication factor of 3. There
> > have been cases when clusters running with a replication factor of 2
> > discovered new bugs, because replication is so often set to 3. That
> > said, if you can do it, it's probably advisable to run with a
> > replication factor of 3 instead of 2.
> >
> > Alex
> >
> > On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem <[email protected]> wrote:
> >
> > > Hi
> > >
> > > I am a new Hadoop user. I have a small cluster with 3 datanodes.
> > > In hadoop-site.xml the value of the dfs.replication property is 2,
> > > but it is still replicating data on 3 machines.
> > >
> > > Please tell me why this is happening.
> > >
> > > Regards,
> > > Aseem Puri
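For completeness, a sketch of the setting under discussion as it would appear in conf/hadoop-site.xml in the Hadoop 0.19 era. The value 2 mirrors Aseem's setup and is illustrative; note again that this is only the default applied to newly written files, which is why files loaded while the factor was 3 still show a target of 3 in fsck:

```xml
<!-- conf/hadoop-site.xml -- illustrative fragment, not a full config -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication for files created after this
  setting takes effect. Files already in HDFS keep the replication
  factor recorded when they were written, until it is changed with
  the setrep shell command.</description>
</property>
```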
