Uma Maheswara Rao G created HDFS-10285:
------------------------------------------
Summary: Storage Policy Satisfier in Namenode
Key: HDFS-10285
URL: https://issues.apache.org/jira/browse/HDFS-10285
Project: Hadoop HDFS
Issue Type: New Feature
Components: datanode, namenode
Affects Versions: 2.7.2
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Heterogeneous storage in HDFS introduced the concept of storage policies. A
policy can be set on a directory or file to express the user's preference for
where the physical blocks should be stored. When the user sets the storage
policy before writing data, the blocks are placed according to that policy as
they are written.
If the user sets the storage policy after the file has been written and
completed, the blocks will already have been written with the default storage
policy (that is, on DISK). The user then has to run the ‘Mover tool’ explicitly,
specifying all such file names as a list. In some distributed system scenarios
(e.g. HBase) it is difficult to collect all the files and run the tool, because
different nodes can write files separately and the files can have different
paths.
Another scenario is rename. When the user renames a file from a location with
one effective storage policy (inherited from its parent directory) into a
directory with a different effective storage policy, the inherited policy is
not copied from the source. The file instead takes its policy from the
destination file/dir parent. The rename operation is only a metadata change in
the Namenode, so the physical blocks remain placed according to the source
storage policy.
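The rename behaviour described above can be illustrated with a small toy model. This is only a sketch and not HDFS code: the class `RenamePolicySketch` and all of its methods are hypothetical, and the policy names HOT and COLD are used purely as labels. It assumes the effective policy of a file is resolved from its nearest ancestor directory at lookup time, while the placement of its blocks is fixed at write time.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy namespace model (hypothetical, not HDFS code). */
class RenamePolicySketch {
    // Policies set explicitly on directories, keyed by absolute path.
    private final Map<String, String> dirPolicy = new HashMap<>();
    // Policy under which each file's blocks were physically placed.
    private final Map<String, String> placedUnder = new HashMap<>();

    void setPolicy(String dir, String policy) {
        dirPolicy.put(dir, policy);
    }

    /** Writing a file places its blocks according to the policy in effect now. */
    void writeFile(String path) {
        placedUnder.put(path, effectivePolicy(path));
    }

    /** Resolve the policy from the nearest ancestor directory that has one. */
    String effectivePolicy(String path) {
        String p = path;
        int idx;
        while ((idx = p.lastIndexOf('/')) > 0) {
            p = p.substring(0, idx);
            String policy = dirPolicy.get(p);
            if (policy != null) {
                return policy;
            }
        }
        return "HOT"; // assumed default for this sketch
    }

    /** Rename is metadata only: the recorded block placement is untouched. */
    void rename(String src, String dst) {
        placedUnder.put(dst, placedUnder.remove(src));
    }

    String placedUnder(String path) {
        return placedUnder.get(path);
    }

    /** True when the blocks already match the policy now in effect. */
    boolean isSatisfied(String path) {
        return effectivePolicy(path).equals(placedUnder.get(path));
    }
}
```

In this model, a file written under a HOT directory and then renamed into a COLD directory reports COLD as its effective policy while its blocks are still recorded as placed under HOT, so `isSatisfied` returns false until something moves the blocks.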
So, tracking all such files, whose names depend on business logic, across
distributed nodes (e.g. region servers) and running the Mover tool on them
could be difficult for admins.
The proposal here is to provide an API in the Namenode itself to trigger
storage policy satisfaction. A daemon thread inside the Namenode should track
such calls and send them to the DNs as block movement commands.
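As a rough illustration of the proposed shape (an API call that only records the request, plus a daemon thread that drains the requests and turns them into movement commands), here is a minimal sketch. Everything in it is an assumption made for illustration: the class `SatisfierSketch`, the method `requestPolicySatisfaction`, and the `MOVE_BLOCKS` command string are hypothetical names, and the commands are simply collected in a list instead of being sent to datanodes. The actual design is left to the design document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of a satisfier: queue of paths plus a daemon worker. */
class SatisfierSketch {
    // Paths whose blocks should be brought in line with their policy.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    // Stand-in for commands that would be handed to datanodes.
    private final List<String> issued = new CopyOnWriteArrayList<>();
    private final Thread worker =
        new Thread(this::processLoop, "storage-policy-satisfier-sketch");

    void start() {
        worker.setDaemon(true); // must not keep the process alive
        worker.start();
    }

    /** The API call: records the path and returns without moving anything. */
    void requestPolicySatisfaction(String path) {
        pending.add(path);
    }

    /** Worker loop: take each tracked path and emit a movement command. */
    private void processLoop() {
        try {
            while (true) {
                String path = pending.take(); // blocks until a request arrives
                issued.add("MOVE_BLOCKS " + path);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit quietly on shutdown
        }
    }

    /** Snapshot of the commands issued so far. */
    List<String> issuedCommands() {
        return new ArrayList<>(issued);
    }

    /** Poll until at least count commands exist or the timeout elapses. */
    boolean awaitIssued(int count, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (issued.size() < count) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The point of the queue is that the caller (for example a region server) does not wait for any block movement. It only registers the path, and the tracking and scheduling happen in one place instead of on every client node.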
I will post a detailed design document soon.