Hello,

I'm trying out HDFS on a small test cluster and I need to clarify some doubts about Hadoop's behaviour.

Some details of my cluster:
Hadoop version: 0.20.2
Two racks (rack1, rack2), with three datanodes in each rack.
Replication factor: 3.

The HDFS documentation says: "HDFS's placement policy is to put one replica on one node in the local rack, another on a node in a different (remote) rack, and the last on a different node in the same remote rack." However, I noticed that sometimes a few blocks of a file are stored differently: two replicas in the local rack and one replica in a remote rack. Are there exceptions that cause behaviour different from the default placement policy? Likewise, some blocks are sometimes read from nodes in the remote rack instead of from nodes in the local rack. Why does that happen?
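For reference, this is how I have been checking where the replicas of each block actually land (just a sketch; the file path is an example, not a real one on my cluster):

```shell
# Hypothetical check: print every block of a file together with the
# datanodes (and their racks) holding its replicas. The path
# /user/gianni/testfile is only an example.
hadoop fsck /user/gianni/testfile -files -blocks -locations -racks
```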

Another thing: if I have two datacenters, each with two racks (so a hierarchical network topology), where are the two remote replicas stored? Does Hadoop consider the hierarchy and store one replica in the local datacenter and two replicas in the other datacenter? Or are the two replicas stored in a totally random rack?
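In case it matters, I was planning to describe that hierarchy to Hadoop with a rack-awareness script (configured via topology.script.file.name). A minimal sketch, assuming made-up subnets for each datacenter/rack pair:

```shell
#!/bin/sh
# Hypothetical topology script: maps each datanode IP passed as an
# argument to a two-level /datacenter/rack path. The subnets below
# are invented for illustration.
resolve_rack() {
    case "$1" in
        10.1.1.*) echo "/dc1/rack1" ;;
        10.1.2.*) echo "/dc1/rack2" ;;
        10.2.1.*) echo "/dc2/rack1" ;;
        10.2.2.*) echo "/dc2/rack2" ;;
        *)        echo "/default-rack" ;;
    esac
}

# Hadoop invokes the script with one or more node addresses and reads
# back one rack path per node.
for node in "$@"; do
    resolve_rack "$node"
done
```

My question is whether a two-level path like /dc1/rack1 actually influences the placement of the two remote replicas, or whether only the last (rack) level is considered.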

Thanks
Gianni
