Yes. With a replication factor of 1, the whole file will be kept on the machine where you issue the "dfs -put" command, provided that machine is running a datanode. Otherwise, a random datanode will be chosen to store the data blocks.
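
To illustrate the difference, here is a toy simulation in Python. It is only a sketch of the behavior described in this thread, not the actual HDFS placement code. The function name place_blocks, the node names, and the seed parameter are made up for the example. It assumes the random choice is made per block, and it ignores rack awareness and free disk space.

```python
import random
from collections import Counter


def place_blocks(num_blocks, datanodes, writer=None, replication=1, seed=0):
    """Toy model of replica placement (hypothetical helper, not HDFS code).

    Returns a Counter mapping each datanode to the number of block
    replicas it ends up holding.
    """
    rng = random.Random(seed)
    counts = Counter({node: 0 for node in datanodes})
    for _ in range(num_blocks):
        # First replica: the local datanode if the writer runs one,
        # otherwise a randomly chosen datanode (assumed: chosen per block).
        first = writer if writer in datanodes else rng.choice(datanodes)
        targets = [first]
        # Remaining replicas: distinct random nodes. Real HDFS also
        # considers racks and node load; that is ignored here.
        others = [n for n in datanodes if n != first]
        targets += rng.sample(others, min(replication - 1, len(others)))
        for node in targets:
            counts[node] += 1
    return counts


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(10)]
    # Writer is a datanode, replication 1: every block lands on node0.
    print(place_blocks(100, nodes, writer="node0", replication=1))
    # Writer is off-cluster: blocks are spread over the datanodes.
    print(place_blocks(100, nodes, writer="client", replication=1))
```

With replication=3 and the writer on node0, node0 still receives one replica of every block in this model, which matches the uneven loading reported on the master node.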

On Fri, Jun 19, 2009 at 10:41 AM, Rajeev Gupta <[email protected]> wrote:

> "If you're inserting into HDFS from a machine running a DataNode, the local
> datanode will always be chosen as one of the three replica targets."
>
> Does that mean that if replication factor is 1, whole file will be kept on
> one node only?
>
> Thanks and regards.
> -Rajeev Gupta
>
> From: Aaron Kimball <[email protected]om>
> To: [email protected]
> Date: 06/19/2009 01:56 AM
> Subject: Re: HDFS is not loading evenly across all nodes.
> Please respond to: core-u...@hadoop.apache.org
>
> Did you run the dfs put commands from the master node? If you're inserting
> into HDFS from a machine running a DataNode, the local datanode will always
> be chosen as one of the three replica targets. For more balanced loading,
> you should use an off-cluster machine as the point of origin.
>
> If you experience uneven block distribution, you should also periodically
> rebalance your cluster by running bin/start-balancer.sh every so often. It
> will work in the background to move blocks from heavily-laden nodes to
> underutilized ones.
>
> - Aaron
>
> On Thu, Jun 18, 2009 at 12:57 PM, openresearch <[email protected]> wrote:
>
> > Hi all
> >
> > I "dfs put" a large dataset onto a 10-node cluster.
> >
> > When I observe the Hadoop progress (via web:50070) and each local file
> > system (via df -k),
> > I notice that my master node is hit 5-10 times harder than others, so hard
> > drive is get full quicker than others. Last night load, it actually crash
> > when hard drive was full.
> >
> > To my understand, data should wrap around all nodes evenly (in a
> > round-robin fashion using 64M as a unit).
> >
> > Is it expected behavior of Hadoop? Can anyone suggest a good
> > troubleshooting way?
> >
> > Thanks
> >
> > --
> > View this message in context:
> > http://www.nabble.com/HDFS-is-not-loading-evenly-across-all-nodes.-tp24099585p24099585.html
> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
