André Frimberger created HDFS-11311:
---------------------------------------

             Summary: HDFS fsck continues to report all blocks present when 
DataNode is restarted with empty data directories
                 Key: HDFS-11311
                 URL: https://issues.apache.org/jira/browse/HDFS-11311
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 3.0.0-alpha1, 2.7.3
            Reporter: André Frimberger


During cluster maintenance, we had to change parameters of the underlying disk 
filesystem. We stopped the DataNode, reformatted all of its data directories, 
and started the DataNode again in under 10 minutes with no data and only the 
{{VERSION}} file present. Running {{fsck}} afterwards reports that all blocks 
are fully replicated, which does not reflect the true state of HDFS. If an 
administrator trusts {{fsck}} and continues to replace further DataNodes, *data 
will be lost!*

Steps to reproduce:
1. Shut down the DataNode
2. Remove all BlockPools from all data directories (so that only the 
{{VERSION}} file is present)
3. Start the DataNode again within 10.5 minutes
4. Run {{hdfs fsck /}}

*Actual result:* Average replication factor is falsely shown as 3.0
*Expected result:* Average replication factor is < 3.0
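
The 10.5-minute window in step 3 appears to match the interval after which the NameNode declares a DataNode dead. A minimal sketch of that arithmetic follows, assuming the stock defaults of {{dfs.namenode.heartbeat.recheck-interval}} (300000 ms) and {{dfs.heartbeat.interval}} (3 s). The class and method names are illustrative only and are not part of Hadoop.

```java
import java.time.Duration;

public class DeadNodeTimeout {
    // Heartbeat expiry as commonly documented for HDFS:
    // 2 * recheck interval + 10 * heartbeat interval.
    static Duration expireInterval(long recheckIntervalMs, long heartbeatIntervalSec) {
        return Duration.ofMillis(2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec);
    }

    public static void main(String[] args) {
        // Assumed default values; a cluster may override both settings.
        Duration d = expireInterval(300_000L, 3L);
        System.out.println(d.toMillis() + " ms = " + d.toMillis() / 60000.0 + " min");
    }
}
```

With other settings the window changes accordingly, so the reproduction only requires that the DataNode come back before it is marked dead.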

*Workaround:* Trigger a block report with {{hdfs dfsadmin -triggerBlockReport 
$dn_host:$ipc_port}}

*Cause:* The NameNode handles the first block report of a storage differently: 
it only adds the reported blocks and does not remove blocks that are missing 
from the report. This behaviour was introduced in HDFS-7980 for performance 
reasons, but it is applied too widely, and in our case data can be lost.
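
To illustrate the cause, here is a toy model of the two code paths. It is not the actual {{BlockManager}} code: {{StorageView}} and both method bodies are hypothetical simplifications, and the sketch assumes that the report counter is back at zero after the DataNode re-registers.

```java
import java.util.HashSet;
import java.util.Set;

public class FirstBlockReportSketch {
    /** Hypothetical stand-in for the NameNode's view of one DataNode storage. */
    static class StorageView {
        final Set<Long> blocks = new HashSet<>();
        int blockReportCount = 0;
    }

    /** Add-only path: blocks missing from the report are never removed. */
    static void processFirstBlockReport(StorageView s, Set<Long> report) {
        s.blocks.addAll(report);
        s.blockReportCount++;
    }

    /** Full path: drops blocks absent from the report and returns them. */
    static Set<Long> processReport(StorageView s, Set<Long> report) {
        Set<Long> removed = new HashSet<>(s.blocks);
        removed.removeAll(report);
        s.blocks.retainAll(report);
        s.blocks.addAll(report);
        s.blockReportCount++;
        return removed;
    }

    public static void main(String[] args) {
        StorageView s = new StorageView();
        s.blocks.add(1L);
        s.blocks.add(2L);
        s.blocks.add(3L);

        // DataNode comes back with empty directories (counter reset is assumed).
        s.blockReportCount = 0;
        Set<Long> emptyReport = new HashSet<>();

        processFirstBlockReport(s, emptyReport);
        System.out.println("add-only path: " + s.blocks.size() + " blocks still recorded");

        Set<Long> removed = processReport(s, emptyReport);
        System.out.println("full path: " + s.blocks.size() + " blocks recorded, "
                + removed.size() + " removed");
    }
}
```

In this model the stale entries survive the add-only path and disappear only once a full report is processed, which is consistent with the workaround of triggering another block report.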

*Fix:* We suggest using stricter conditions on applying 
{{processFirstBlockReport}} in {{BlockManager:processReport()}}:
Change
{code}
if (storageInfo.getBlockReportCount() == 0) {
    // The first block report can be processed a lot more efficiently than
    // ordinary block reports.  This shortens restart times.
    processFirstBlockReport(storageInfo, newReport);
} else {
    invalidatedBlocks = processReport(storageInfo, newReport);
}
{code}

to

{code}
if (storageInfo.getBlockReportCount() == 0
        && storageInfo.getState() != State.FAILED
        && storageInfo.numBlocks() > 0) {
    // The first block report can be processed a lot more efficiently than
    // ordinary block reports.  This shortens restart times.
    processFirstBlockReport(storageInfo, newReport);
} else {
    invalidatedBlocks = processReport(storageInfo, newReport);
}
{code}

If the DataNode reports no blocks for a data directory, it might be a new 
DataNode, or the data directory may have been emptied for some reason 
(offline replacement of storage, reformatting of the data disk, etc.). In 
either case, the change should be reflected in the output of {{fsck}} in less 
than 6 hours to prevent data loss due to misleading output.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
