Jing Zhao created HDFS-10858:
--------------------------------
Summary: FBR processing may generate incorrect reportedBlock-blockGroup mapping
Key: HDFS-10858
URL: https://issues.apache.org/jira/browse/HDFS-10858
Project: Hadoop HDFS
Issue Type: Sub-task
Components: erasure-coding
Affects Versions: 3.0.0-alpha1
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Blocker
In BlockManager#reportDiffSorted:
{code}
} else if (reportedState == ReplicaState.FINALIZED &&
    (storedBlock.findStorageInfo(storageInfo) == -1 ||
        corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
  // Add replica if appropriate. If the replica was previously corrupt
  // but now okay, it might need to be updated.
  toAdd.add(new BlockInfoToAdd(storedBlock, replica));
}
{code}
"new BlockInfoToAdd(storedBlock, replica)" is wrong because "replica" (i.e.,
the reported block) is a single object that the iterator returned by
BlockListAsLongs#iterator reuses: on each later step the iterator overwrites
the object's ID/GS in place. The entries queued in {{toAdd}} therefore keep a
reference to a block whose contents change afterwards, so {{addStoredBlock}}
can receive a wrong (reportedBlock, stored BlockInfo) mapping. For EC, the
reported block is used to calculate the internal block index, so this bug can
completely corrupt the internal state of an EC block group.
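
The aliasing problem can be reproduced outside HDFS with a minimal sketch. Everything below is a hypothetical stand-in and not the real code: {{MutableBlock}} plays the role of the reported block, {{reusingIterable}} imitates an iterator that hands back one reused object, and the two collect methods imitate queuing entries into {{toAdd}}. It assumes that copying the reported block before queuing it is an acceptable remedy; it does not claim to be the actual patch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-ins for illustration only; these are not HDFS classes.
final class ReusedIteratorDemo {

  /** A mutable block holding an ID and a generation stamp (GS). */
  static final class MutableBlock {
    long id;
    long genStamp;

    MutableBlock(long id, long genStamp) {
      this.id = id;
      this.genStamp = genStamp;
    }

    /** Copy constructor: takes a snapshot of the current ID/GS. */
    MutableBlock(MutableBlock other) {
      this(other.id, other.genStamp);
    }
  }

  /** Returns an Iterable whose iterator reuses ONE MutableBlock instance. */
  static Iterable<MutableBlock> reusingIterable(long[] ids) {
    return () -> new Iterator<MutableBlock>() {
      private final MutableBlock shared = new MutableBlock(0, 0);
      private int pos = 0;

      @Override
      public boolean hasNext() {
        return pos < ids.length;
      }

      @Override
      public MutableBlock next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // Overwrite the shared object in place instead of allocating.
        shared.id = ids[pos];
        shared.genStamp = 1000 + pos;
        pos++;
        return shared;
      }
    };
  }

  /** Buggy pattern: queues the iterator's object itself. */
  static List<Long> collectByReference(long[] ids) {
    List<MutableBlock> queued = new ArrayList<>();
    for (MutableBlock b : reusingIterable(ids)) {
      queued.add(b); // every entry aliases the same object
    }
    return idsOf(queued);
  }

  /** Defensive pattern: queues a snapshot of the current block. */
  static List<Long> collectByCopy(long[] ids) {
    List<MutableBlock> queued = new ArrayList<>();
    for (MutableBlock b : reusingIterable(ids)) {
      queued.add(new MutableBlock(b)); // independent copy per entry
    }
    return idsOf(queued);
  }

  /** Reads the IDs after iteration has finished, as a later consumer would. */
  private static List<Long> idsOf(List<MutableBlock> blocks) {
    List<Long> out = new ArrayList<>();
    for (MutableBlock b : blocks) {
      out.add(b.id);
    }
    return out;
  }
}
```

With the input IDs 1, 2, 3, {{collectByReference}} yields [3, 3, 3] because all queued entries point at the one shared object, while {{collectByCopy}} yields [1, 2, 3]. In the sketch the consumer reads the entries only after iteration ends, which mirrors how a deferred list such as {{toAdd}} would be processed.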