Hui Fei created HDFS-15875:
------------------------------

             Summary: Check whether file is being truncated before truncate
                 Key: HDFS-15875
                 URL: https://issues.apache.org/jira/browse/HDFS-15875
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 3.2.2, 3.1.4, 3.3.0
            Reporter: Hui Fei
            Assignee: Hui Fei


We ran into the following problem.
 * Job A sends a truncate request to the NameNode, and block recovery starts.
 * DataNode D times out (60s) while connecting to another DataNode, so block 
recovery takes more than 60s.
 * Job A fails, and job B starts and sends a truncate request for the same file 
to the NameNode. A new recoveryId is generated during lease recovery.
 * DataNode D calls commitBlockSynchronization and gets the error "does not 
match current recovery id".

As a result, the truncate never completes. DataNode D has a replica with the 
new length, while the two other DataNodes have replicas with the old length.
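
The sequence above can be replayed with a small toy model. This is only a sketch under stated assumptions: the class TruncateRaceSketch, its fields, the starting counter value, and the guard message are hypothetical and are not the real NameNode code. It shows how a second truncate invalidates the recoveryId that DataNode D still holds, and how rejecting a truncate while one is already in progress (the check named in the summary) would avoid that.

```java
import java.io.IOException;

/** Toy model of the race described above; NOT the real NameNode API. */
class TruncateRaceSketch {
    private long recoveryId = 1000;          // hypothetical starting value
    private boolean underRecovery = false;   // true while a truncate recovery is pending
    private final boolean guardEnabled;      // the proposed "already truncating?" check

    TruncateRaceSketch(boolean guardEnabled) {
        this.guardEnabled = guardEnabled;
    }

    /** Starts truncate recovery and returns the recoveryId given to the DataNode. */
    long truncate() throws IOException {
        if (guardEnabled && underRecovery) {
            throw new IOException("file is already being truncated");
        }
        underRecovery = true;
        return ++recoveryId;
    }

    /** The DataNode reports the outcome of block recovery. */
    void commitBlockSynchronization(long reportedRecoveryId) throws IOException {
        if (reportedRecoveryId != recoveryId) {
            throw new IOException("recovery id " + reportedRecoveryId
                + " does not match current recovery id " + recoveryId);
        }
        underRecovery = false;
    }

    /** Replays: job A truncates, job B truncates, then DataNode D commits for job A. */
    static String replay(boolean guardEnabled) {
        TruncateRaceSketch nn = new TruncateRaceSketch(guardEnabled);
        try {
            long idA = nn.truncate();            // job A; DataNode D starts its slow recovery
            try {
                nn.truncate();                   // job B arrives before D has reported back
            } catch (IOException e) {
                // Guard rejected job B, so the recoveryId is unchanged.
            }
            nn.commitBlockSynchronization(idA);  // D finally reports with job A's id
            return "committed";
        } catch (IOException e) {
            return "stuck: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + replay(false));
        System.out.println("with guard:    " + replay(true));
    }
}
```

Without the guard the commit is rejected because job B bumped the recoveryId; with the guard job B is turned away and job A's recovery can finish.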

The DataNode logs the error message "Inconsistent size of finalized replicas".

The related code is in BlockRecoveryWorker.java:

{code}

for (BlockRecord r : syncList) {
  assert r.rInfo.getNumBytes() > 0 : "zero length replica";
  ReplicaState rState = r.rInfo.getOriginalReplicaState();
  if (rState.getValue() < bestState.getValue()) {
    bestState = rState;
  }
  if(rState == ReplicaState.FINALIZED) {
    if (finalizedLength > 0 && finalizedLength != r.rInfo.getNumBytes()) {
      throw new IOException("Inconsistent size of finalized replicas. " +
          "Replica " + r.rInfo + " expected size: " + finalizedLength);
    }
    finalizedLength = r.rInfo.getNumBytes();
  }
}

{code}
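
To illustrate why later recovery attempts keep failing once the replicas diverge, the check quoted above can be reduced to a standalone sketch. The assumptions here are mine: all three replicas are taken to be FINALIZED, plain lengths stand in for BlockRecord objects, the initial value of -1 is assumed, and the lengths 512 and 1024 are made up for the example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Standalone reduction of the finalized-length check; names are hypothetical. */
class FinalizedLengthCheckSketch {

    /** Returns the common length, or throws if finalized replicas disagree. */
    static long checkFinalizedLengths(List<Long> finalizedReplicaLengths)
            throws IOException {
        long finalizedLength = -1;  // assumed initial value: "no finalized replica seen yet"
        for (long numBytes : finalizedReplicaLengths) {
            if (finalizedLength > 0 && finalizedLength != numBytes) {
                throw new IOException("Inconsistent size of finalized replicas. "
                    + "Replica length " + numBytes
                    + " expected size: " + finalizedLength);
            }
            finalizedLength = numBytes;
        }
        return finalizedLength;
    }

    /** Convenience wrapper: true if the check passes, false if it throws. */
    static boolean isConsistent(List<Long> finalizedReplicaLengths) {
        try {
            checkFinalizedLengths(finalizedReplicaLengths);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // DataNode D already truncated to 512 bytes; the other two still hold 1024.
        System.out.println(isConsistent(Arrays.asList(512L, 1024L, 1024L)));
        // All replicas at the same length pass the check.
        System.out.println(isConsistent(Arrays.asList(512L, 512L, 512L)));
    }
}
```

With one replica at the new length and two at the old length, every recovery attempt hits the exception, which matches the DataNode error message reported above.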

 

 
 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

