Adam Kawa created HDFS-6075:
-------------------------------
Summary: Introducing "non-replication mode"
Key: HDFS-6075
URL: https://issues.apache.org/jira/browse/HDFS-6075
Project: Hadoop HDFS
Issue Type: New Feature
Components: datanode, namenode
Reporter: Adam Kawa
Priority: Minor
AFAIK, HDFS does not provide an easy way to temporarily disable the replication
of missing blocks.
To temporarily disable replication today, you would have to either
* set dfs.namenode.replication.interval (_The periodicity in seconds with which
the namenode computes replication work for datanodes_, default 3) to something
very high. *Disadvantage*: you have to restart the NN
* enter safe mode. *Disadvantage*: all write operations will fail
We need to replace the top-of-rack switch in each of our racks. Replacing a
switch should take around 30 minutes, and each rack holds around 0.6 PB of
data. We would like to avoid an expensive replication, since we know that the
rack will be back online quickly. To avoid both downtime and excessive network
transfer, we think that temporarily disabling replication would suit us.
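To give a sense of scale, here is a back-of-the-envelope calculation using only
the two figures above (0.6 PB, 30 minutes). It assumes decimal petabytes and
asks what aggregate copy rate would be needed to re-create every replica of one
rack within the outage window. It is an illustration, not a measurement from
our cluster.

```python
# Rough estimate only; both inputs are the approximate figures quoted above.
RACK_DATA_BYTES = 0.6e15   # ~0.6 PB per rack (assumption: 1 PB = 10**15 bytes)
OUTAGE_SECONDS = 30 * 60   # ~30 minutes to replace one switch


def required_throughput_gb_per_s(data_bytes, window_seconds):
    """Aggregate rate in GB/s needed to copy data_bytes within window_seconds."""
    return data_bytes / window_seconds / 1e9


rate = required_throughput_gb_per_s(RACK_DATA_BYTES, OUTAGE_SECONDS)
print(f"{rate:.0f} GB/s")  # prints "333 GB/s"
```

Whatever fraction of this the cluster actually manages to copy before the rack
returns becomes excess replicas that have to be deleted again.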
The default block placement policy places the replicas of each block on two
racks, so when one rack temporarily goes offline, we still have access to at
least one replica of each block. Of course, if we lose that replica, we would
have to wait until the rack comes back online. This is something the
administrator should be aware of.
This feature could disable replication
* globally - for the whole cluster
* partially - e.g. only for missing blocks that come from a specified set of
DataNodes. A file like "we_will_be_back_soon" :) could be introduced,
similar to the include and exclude files.
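A minimal sketch of how the partial mode could decide whether to schedule
replication for a block. Everything here is hypothetical: the function names,
the one-host-per-line file format and the '#' comment handling are assumptions
made for illustration, and none of this exists in HDFS.

```python
# Hypothetical sketch of the proposed "partial" mode; not an HDFS API.

def load_back_soon_hosts(lines):
    """Parse a "we_will_be_back_soon" host list, one host per line."""
    hosts = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if entry:
            hosts.add(entry)
    return hosts


def should_schedule_replication(missing_replica_hosts, back_soon_hosts):
    """Replicate only if some lost replica lived on a host NOT expected back."""
    return any(host not in back_soon_hosts for host in missing_replica_hosts)


back_soon = load_back_soon_hosts(
    ["dn101.example.com", "dn102.example.com  # rack 7", ""]
)
# Replica lost only on a host that will return: skip replication.
print(should_schedule_replication(["dn101.example.com"], back_soon))  # False
# One replica lost on a host outside the list: replicate as usual.
print(should_schedule_replication(
    ["dn101.example.com", "dn300.example.com"], back_soon))  # True
```

The global mode would be the degenerate case in which every DataNode is treated
as being in the list.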