Shuyan Zhang created HDFS-17227:
-----------------------------------

             Summary: EC: Fix bug in choosing targets when racks is not enough.
                 Key: HDFS-17227
                 URL: https://issues.apache.org/jira/browse/HDFS-17227
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Shuyan Zhang


*Bug description*
If all of the following hold:
1. There is a striped block blockinfo1, which has an excess replica on 
datanodeA.
2. blockinfo1 has an internal block that needs to be reconstructed.
3. The number of racks is less than the number of internal blocks of blockinfo1.
Then the NN may choose datanodeA to reconstruct the internal block, resulting in 
two internal blocks of blockinfo1 on datanodeA, causing confusion. 

*Root cause and solution*
When we use `BlockPlacementPolicyRackFaultTolerant` for choosing targets and 
the number of racks is insufficient, `chooseEvenlyFromRemainingRacks` is called. 
Currently, `chooseEvenlyFromRemainingRacks` calls `chooseOnce`, and `chooseOnce` 
uses `newExcludeNodes` as its parameter instead of `excludedNodes`. When we choose 
targets for reconstructing internal blocks, `newExcludeNodes` only includes 
those datanodes that contain live replicas, and does not include datanodes that 
have excess replicas. As a result, a datanode with an excess replica may be 
chosen.
I don't think we need to use `newExcludeNodes` here; we can just pass 
`excludedNodes` to `chooseOnce`.
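
To illustrate the effect, here is a minimal sketch. It is not the actual HDFS code: `chooseTarget`, the node names, and the two sets are hypothetical stand-ins for `chooseOnce`, `newExcludeNodes`, and `excludedNodes`. It assumes a toy chooser that simply returns the first candidate not in the exclude set.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class ExcludeSetSketch {

    // Hypothetical stand-in for target selection: returns the first
    // candidate that is not in the exclude set, if there is one.
    static Optional<String> chooseTarget(List<String> candidates, Set<String> excluded) {
        return candidates.stream()
                .filter(node -> !excluded.contains(node))
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("datanodeA", "datanodeB", "datanodeC");
        Set<String> liveReplicaNodes = Set.of("datanodeB");   // holds a live internal block
        Set<String> excessReplicaNodes = Set.of("datanodeA"); // holds an excess replica

        // Plays the role of newExcludeNodes: live replicas only.
        Set<String> narrowExcludes = new LinkedHashSet<>(liveReplicaNodes);

        // Plays the role of excludedNodes: live and excess replicas.
        Set<String> fullExcludes = new LinkedHashSet<>(liveReplicaNodes);
        fullExcludes.addAll(excessReplicaNodes);

        // With the narrow set, the node holding the excess replica can be picked.
        System.out.println(chooseTarget(candidates, narrowExcludes)); // Optional[datanodeA]

        // With the full set, it is skipped.
        System.out.println(chooseTarget(candidates, fullExcludes));   // Optional[datanodeC]
    }
}
```

In this toy model, the only difference between the two calls is which exclude set is passed, which mirrors the proposed change of passing `excludedNodes` to `chooseOnce`.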




