Hi,

When the NameNode allocates a datanode for a DFSClient write with load consideration enabled, it checks whether a candidate datanode is overloaded by comparing it against the average xceiver count of all in-service datanodes. But if a datanode has been decommissioned and then becomes dead, it is still counted as in service, which skews the computed average well away from the real load on the live datanodes, especially when the number of decommissioned datanodes is large. In our cluster of 180 datanodes, 100 are decommissioned, and the computed average load is 17. This causes every datanode allocation to fail, because each live node's real load exceeds twice that artificially low average.
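For context, the overload check in BlockPlacementPolicyDefault is roughly the following (a simplified sketch from memory, not the exact source; isOverloaded is my own name for it, and 2.0 is the default load factor):

  boolean isOverloaded(DatanodeDescriptor node, FSClusterStats stats) {
    // A candidate is rejected when its xceiver count exceeds twice the
    // cluster-wide average; that average is derived from the in-service
    // counters maintained by the heartbeat statistics.
    final double maxLoad = 2.0 * stats.getInServiceXceiverAverage();
    return maxLoad > 0 && node.getXceiverCount() > maxLoad;
  }

So whether a node gets rejected depends directly on how nodesInService and nodesInServiceXceiverCount are maintained, which is why the accounting below matters.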
I have filed HDFS-12820 for this, but I am not very sure about the solution. My question is: why aren't decommissioned datanodes excluded from nodesInService in the first place? If we simply remove them, could there be any side effect?

  private void subtract(final DatanodeDescriptor node) {
    capacityUsed -= node.getDfsUsed();
    blockPoolUsed -= node.getBlockPoolUsed();
    xceiverCount -= node.getXceiverCount();
    // Decommissioning/decommissioned nodes do not touch the in-service
    // counters on this subtract path.
    if (!(node.isDecommissionInProgress() || node.isDecommissioned())) {
      nodesInService--;
      nodesInServiceXceiverCount -= node.getXceiverCount();
      capacityTotal -= node.getCapacity();
      capacityRemaining -= node.getRemaining();
    } else {
      capacityTotal -= node.getDfsUsed();
    }
    cacheCapacity -= node.getCacheCapacity();
    cacheUsed -= node.getCacheUsed();
  }
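To make the question concrete, one direction I can imagine (just a sketch, not a tested patch; countsAsInService is my own hypothetical helper) is to use a single predicate on both the add() and subtract() paths, so that a datanode that is decommissioning, decommissioned, or dead never lingers in the in-service counters with a stale xceiver count:

  // Hypothetical helper, not in the current code base. isAlive() is assumed
  // here; in some branches it is the public isAlive field instead.
  private static boolean countsAsInService(final DatanodeDescriptor node) {
    return node.isAlive()
        && !node.isDecommissionInProgress()
        && !node.isDecommissioned();
  }

--
Xie Gang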