bhji123 created HDFS-15419:
------------------------------
Summary: router retry with configurable time interval when cluster
is unavailable
Key: HDFS-15419
URL: https://issues.apache.org/jira/browse/HDFS-15419
Project: Hadoop HDFS
Issue Type: Improvement
Components: configuration, hdfs-client, rbf
Reporter: bhji123
When a cluster is unavailable, router -> namenode communication retries only
once, with no time interval between attempts, which is not reasonable.
For example, my company runs several HDFS clusters with more than 1000 nodes,
and we have encountered this problem. In some cases a cluster becomes
unavailable briefly, for about 10 or 30 seconds. During that window almost all
RPC requests to the router fail, because the router retries only once and
without any time interval.
It would be better to enhance the router retry strategy so that it retries
with a configurable time interval and a configurable maximum number of retries.
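
To make the proposal concrete, here is a minimal sketch of a retry loop with a
fixed, configurable interval and a configurable retry limit. It is not router
code and not an existing Hadoop API: the class name FixedIntervalRetry and the
parameters maxRetries and intervalMs are hypothetical, chosen only for
illustration. It also assumes that an unavailable cluster surfaces as an
IOException.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Hadoop or the router. */
public class FixedIntervalRetry {
  private final int maxRetries;   // retries allowed after the first attempt
  private final long intervalMs;  // pause between attempts, in milliseconds

  public FixedIntervalRetry(int maxRetries, long intervalMs) {
    if (maxRetries < 0 || intervalMs < 0) {
      throw new IllegalArgumentException("maxRetries and intervalMs must be >= 0");
    }
    this.maxRetries = maxRetries;
    this.intervalMs = intervalMs;
  }

  /**
   * Runs op, retrying on IOException (assumed here to signal an unavailable
   * cluster). Other exceptions propagate immediately without a retry.
   */
  public <T> T call(Callable<T> op) throws Exception {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.call();
      } catch (IOException e) {
        last = e;
        if (attempt < maxRetries) {
          // Wait before the next attempt instead of retrying immediately.
          Thread.sleep(intervalMs);
        }
      }
    }
    // All attempts failed; surface the most recent failure to the caller.
    throw last;
  }
}
```

With maxRetries = 6 and intervalMs = 5000, for instance, such a loop would keep
trying for roughly 30 seconds before giving up, which would cover the brief
outages described above. In the router both values would come from
configuration; the property names are left open here.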