Shangshu Qian created HDFS-17780:
------------------------------------
Summary: The retry logic in IncrementalBlockReport may bypass the
configured IBR interval, causing contention on NameNode
Key: HDFS-17780
URL: https://issues.apache.org/jira/browse/HDFS-17780
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode, namenode
Affects Versions: 3.4.1, 2.10.2
Reporter: Shangshu Qian
In the current IncrementalBlockReportManager.sendIBRs(), the IBR is retried if
the RPC (blockReceivedAndDeleted) to the NN fails.
{code:java}
void sendIBRs(DatanodeProtocol namenode, DatanodeRegistration registration,
    String bpid) throws IOException {
  // Generate a list of the pending reports for each storage under the lock
  final StorageReceivedDeletedBlocks[] reports = generateIBRs();
  if (reports.length == 0) {
    // Nothing new to report.
    return;
  }

  // Send incremental block reports to the Namenode outside the lock
  if (LOG.isDebugEnabled()) {
    LOG.debug("call blockReceivedAndDeleted: " + Arrays.toString(reports));
  }
  boolean success = false;
  final long startTime = monotonicNow();
  try {
    namenode.blockReceivedAndDeleted(registration, bpid, reports);
    success = true;
  } finally {
    if (success) {
      dnMetrics.addIncrementalBlockReport(monotonicNow() - startTime);
      lastIBR = startTime;
    } else {
      // If we didn't succeed in sending the report, put all of the
      // blocks back onto our queue, but only in the case where we
      // didn't put something newer in the meantime.
      putMissing(reports);
    }
  }
}
{code}
When the RPC fails, the reports are put back on the queue and `lastIBR` is not
updated, so the failed IBRs will be retried. However, because `lastIBR` still
holds the start time of the last successful report, the retry bypasses the
configured `dfs.blockreport.incremental.intervalMsec` and is sent on the next
heartbeat.
If `blockReceivedAndDeleted` fails because of high load on the NameNode, these
retries only make the contention worse, resulting in a feedback loop.
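
One possible direction, shown only as a minimal sketch and not as a patch
against the actual Hadoop code, is to record the time of every attempt rather
than only the successful ones, so that a failed report also has to wait for
the configured interval. The class `IbrThrottleSketch` and all of its members
are hypothetical names made up for this illustration; timestamps are passed in
explicitly to keep the example self-contained.

```java
/**
 * Hypothetical, simplified model of IBR scheduling. It is not the Hadoop
 * implementation; it only illustrates throttling retries by the interval.
 */
public class IbrThrottleSketch {
  private final long intervalMs; // stands in for the configured IBR interval
  private long lastAttempt = 0;  // time of the last attempt, success or not
  private boolean pending = false;

  public IbrThrottleSketch(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  /** Called when there are blocks to report. */
  public void markPending() {
    pending = true;
  }

  /** True if a report is pending and the interval has elapsed. */
  public boolean shouldSend(long nowMs) {
    return pending && nowMs - lastAttempt >= intervalMs;
  }

  /**
   * Record the outcome of an attempt. The timestamp is updated even on
   * failure, so a retry cannot be sent before the interval has elapsed.
   */
  public void onAttempt(long startTimeMs, boolean success) {
    lastAttempt = startTimeMs;
    pending = !success; // failed reports stay queued for a later retry
  }
}
```

With a 1000 ms interval, an attempt that fails at t=1000 is not retried at
t=1500 but becomes eligible again at t=2000. A real fix might prefer a separate
backoff for failures instead of reusing the interval; that design choice is left
open here.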