Zhe Zhang created HDFS-10967:
--------------------------------
Summary: Add configuration for BlockPlacementPolicy to deprioritize near-full DataNodes
Key: HDFS-10967
URL: https://issues.apache.org/jira/browse/HDFS-10967
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Zhe Zhang

Large production clusters are likely to have heterogeneous nodes in terms of
storage capacity, memory, and CPU cores. It is not always possible to ingest
data into DataNodes in proportion to their remaining storage capacity, so a
subset of DataNodes can end up much closer to full capacity than the rest.

Note that this heterogeneity is most likely rack-by-rack, i.e. _m_ whole
racks with low-storage nodes and _n_ whole racks with high-storage nodes. It
would therefore be very useful to be able to deprioritize those near-full
DataNodes as destinations for the 2nd and 3rd replicas.
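
To make the request concrete, here is a rough sketch of the kind of ordering
this could produce. It is only an illustration under assumptions, not the
existing BlockPlacementPolicy code: the class NearFullDeprioritizer, the Node
type, and the nearFullThreshold parameter are all hypothetical names, and a
real change would presumably apply this only when choosing targets for the
2nd and 3rd replicas.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; does not use or mirror the real HDFS placement API. */
public class NearFullDeprioritizer {

    /** Hypothetical stand-in for a DataNode's storage report. */
    public static final class Node {
        public final String name;
        public final long capacityBytes;
        public final long remainingBytes;

        public Node(String name, long capacityBytes, long remainingBytes) {
            this.name = name;
            this.capacityBytes = capacityBytes;
            this.remainingBytes = remainingBytes;
        }

        /** Fraction of capacity in use; a zero-capacity node counts as full. */
        public double usedFraction() {
            if (capacityBytes == 0) {
                return 1.0;
            }
            return 1.0 - (double) remainingBytes / capacityBytes;
        }
    }

    /**
     * Returns the candidates reordered so that nodes whose used fraction is at
     * or above nearFullThreshold (an assumed config value, e.g. 0.9) come last.
     * Near-full nodes are moved back, not excluded, and the relative order
     * inside each group is preserved.
     */
    public static List<Node> deprioritize(List<Node> candidates,
                                          double nearFullThreshold) {
        List<Node> preferred = new ArrayList<>();
        List<Node> nearFull = new ArrayList<>();
        for (Node n : candidates) {
            if (n.usedFraction() >= nearFullThreshold) {
                nearFull.add(n);
            } else {
                preferred.add(n);
            }
        }
        preferred.addAll(nearFull);
        return preferred;
    }
}
```

Near-full nodes stay in the candidate list on purpose, so that placement can
still succeed when only near-full nodes are available.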