Hi
My problem is not that my data is under-replicated. I have 3
data nodes, and in my hadoop-site.xml I have set:
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
But even after this, data is still replicated on 3 nodes instead of 2.
Please tell me what the problem could be.
Thanks & Regards
Aseem Puri
-----Original Message-----
From: Raghu Angadi [mailto:[email protected]]
Sent: Wednesday, April 15, 2009 2:58 AM
To: [email protected]
Subject: Re: More Replication on dfs
Aseem,
Regarding over-replication, it is mostly an application-related issue, as Alex mentioned.
But if you are concerned about the under-replicated blocks in the fsck output:
These blocks should not stay under-replicated if you have enough nodes
and enough space on them (check NameNode webui).
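A rough sketch of how the replication of existing files could be inspected and changed, in case it helps. dfs.replication is only the default a client applies when it creates a file, so files written earlier, or written by a client that reads a different configuration (possibly HBase here), keep the factor they were created with. Also note that the fsck summary quoted below reports 2 datanodes, so a target of 3 replicas cannot be met until a third datanode is live or the target is lowered. The function names and the HADOOP variable are made up for illustration; only the fsck and fs -setrep subcommands are Hadoop's own.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Hadoop).
# HADOOP points at the hadoop launcher script; bin/hadoop assumes the
# commands are run from the install directory, as elsewhere in this thread.
HADOOP="${HADOOP:-bin/hadoop}"

# List the files, blocks, and datanode locations of each replica under a path.
# Usage: show_replicas <path>
show_replicas() {
    "$HADOOP" fsck "$1" -files -blocks -locations
}

# Set the replication factor of every file under a path.
# Usage: set_replication <factor> <path>
set_replication() {
    "$HADOOP" fs -setrep -R "$1" "$2"
}
```

For example, set_replication 2 /user/HadoopAdmin followed by show_replicas /user/HadoopAdmin should show whether the target has changed for the files that were uploaded while the factor was 3.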
Try grepping for one of the blocks in the NameNode log (and the datanode logs as
well, since you have just 3 nodes).
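A minimal sketch of that search, assuming the logs sit in a single directory (by default the logs directory under the Hadoop install, but that location is an assumption about your setup). The function name is made up. It may be worth searching for the block ID without the trailing generation stamp that fsck prints, for example blk_-1213602857020415242 rather than blk_-1213602857020415242_3132, since log lines do not always include it.

```shell
#!/bin/sh
# Hypothetical helper: print every log line that mentions a block ID.
# Usage: find_block <block-id> <log-dir>
find_block() {
    # -r searches all files under the directory, -F treats the ID as a
    # fixed string, and "--" ends option parsing before the ID.
    grep -rF -- "$1" "$2"
}
```

Running find_block blk_-1213602857020415242 logs on the NameNode machine and on each datanode should show where replicas of that block were allocated, received, or deleted.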
Raghu.
Puri, Aseem wrote:
> Alex,
>
> Output of the $ bin/hadoop fsck / command after running an HBase data insert
> command on a table is:
>
> .....
> .....
> .....
> .....
> .....
> /hbase/test/903188508/tags/info/4897652949308499876: Under replicated blk_-5193695109439554521_3133. Target Replicas is 3 but found 1 replica(s).
> .
> /hbase/test/903188508/tags/mapfiles/4897652949308499876/data: Under replicated blk_-1213602857020415242_3132. Target Replicas is 3 but found 1 replica(s).
> .
> /hbase/test/903188508/tags/mapfiles/4897652949308499876/index: Under replicated blk_3934493034551838567_3132. Target Replicas is 3 but found 1 replica(s).
> .
> /user/HadoopAdmin/hbase table.doc: Under replicated blk_4339521803948458144_1031. Target Replicas is 3 but found 2 replica(s).
> .
> /user/HadoopAdmin/input/bin.doc: Under replicated blk_-3661765932004150973_1030. Target Replicas is 3 but found 2 replica(s).
> .
> /user/HadoopAdmin/input/file01.txt: Under replicated blk_2744169131466786624_1001. Target Replicas is 3 but found 2 replica(s).
> .
> /user/HadoopAdmin/input/file02.txt: Under replicated blk_2021956984317789924_1002. Target Replicas is 3 but found 2 replica(s).
> .
> /user/HadoopAdmin/input/test.txt: Under replicated blk_-3062256167060082648_1004. Target Replicas is 3 but found 2 replica(s).
> ...
> /user/HadoopAdmin/output/part-00000: Under replicated blk_8908973033976428484_1010. Target Replicas is 3 but found 2 replica(s).
> Status: HEALTHY
> Total size: 48510226 B
> Total dirs: 492
> Total files: 439 (Files currently being written: 2)
> Total blocks (validated): 401 (avg. block size 120973 B) (Total open file blocks (not validated): 2)
> Minimally replicated blocks: 401 (100.0 %)
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 399 (99.50124 %)
> Mis-replicated blocks: 0 (0.0 %)
> Default replication factor: 2
> Average block replication: 1.3117207
> Corrupt blocks: 0
> Missing replicas: 675 (128.327 %)
> Number of data-nodes: 2
> Number of racks: 1
>
>
> The filesystem under path '/' is HEALTHY
> Please tell me what is wrong.
>
> Aseem
>
> -----Original Message-----
> From: Alex Loddengaard [mailto:[email protected]]
> Sent: Friday, April 10, 2009 11:04 PM
> To: [email protected]
> Subject: Re: More Replication on dfs
>
> Aseem,
>
> How are you verifying that blocks are not being replicated? Have you run
> fsck? *bin/hadoop fsck /*
>
> I'd be surprised if replication really wasn't happening. Can you run fsck
> and pay attention to "Under-replicated blocks" and "Mis-replicated blocks"?
> In fact, can you just copy-paste the output of fsck?
>
> Alex
>
> On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem <[email protected]> wrote:
>
>> Hi
>> I also tried the command $ bin/hadoop balancer, but I still have the
>> same problem.
>>
>> Aseem
>>
>> -----Original Message-----
>> From: Puri, Aseem [mailto:[email protected]]
>> Sent: Friday, April 10, 2009 11:18 AM
>> To: [email protected]
>> Subject: RE: More Replication on dfs
>>
>> Hi Alex,
>>
>> Thanks for sharing your knowledge. So far I have three
>> machines, and I need to check the behavior of Hadoop, so I want the
>> replication factor to be 2. I started my Hadoop server with
>> replication factor 3. After that I uploaded 3 files to run a word
>> count program. But as all my files are stored on one machine and
>> also replicated to the other datanodes, my map reduce program takes input
>> from one datanode only. I want my files to be on different datanodes so
>> that I can check the functionality of map reduce properly.
>>
>> Also, before starting my Hadoop server again with replication
>> factor 2, I formatted all datanodes and deleted all old data manually.
>>
>> Please suggest what I should do now.
>>
>> Regards,
>> Aseem Puri
>>
>>
>> -----Original Message-----
>> From: Mithila Nagendra [mailto:[email protected]]
>> Sent: Friday, April 10, 2009 10:56 AM
>> To: [email protected]
>> Subject: Re: More Replication on dfs
>>
>> To add to the question, how does one decide what the optimal replication
>> factor for a cluster is? For instance, what would be the appropriate
>> replication factor for a cluster consisting of 5 nodes?
>> Mithila
>>
>> On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard <[email protected]>
>> wrote:
>>
>>> Did you load any files when replication was set to 3? If so, you'll have to
>>> rebalance:
>>>
>>> <http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer>
>>> <http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer>
>>>
>>> Note that most people run HDFS with a replication factor of 3. There have
>>> been cases when clusters running with a replication of 2 discovered new
>>> bugs, because replication is so often set to 3. That said, if you can do
>>> it, it's probably advisable to run with a replication factor of 3 instead
>>> of 2.
>>>
>>> Alex
>>>
>>> On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem <[email protected]> wrote:
>>>> Hi
>>>>
>>>> I am a new Hadoop user. I have a small cluster with 3
>>>> Datanodes. In hadoop-site.xml the value of the dfs.replication property is 2,
>>>> but it is still replicating data on 3 machines.
>>>>
>>>> Please tell me why this is happening.
>>>>
>>>> Regards,
>>>>
>>>> Aseem Puri