Fengnan Li created HDFS-14914:
---------------------------------

Summary: Observer should throw StandbyException in Safemode
Key: HDFS-14914
URL: https://issues.apache.org/jira/browse/HDFS-14914
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Fengnan Li
Assignee: Fengnan Li
Attachments: HDFS-14914-001.patch

When an Observer is in safemode, calling getBlockLocations makes it throw a RetriableException, as in [HDFS-13898|https://issues.apache.org/jira/browse/HDFS-13898]. However, during startup safemode can last a very long time, so retrying against the same Observer does not help much.

It gets worse when Routers talk to Observers. Because the Router distinguishes StandbyException from RetriableException, it retries the call (3 times by default) and then returns a RetriableException to the client. The client then retries on the same Router, which goes to the same Observer, 10 times by default, resulting in 3 * 10 = 30 retries per call.

The change is to have the Observer throw StandbyException instead, which triggers a failover, so that the Router can immediately try another Observer or the Active namenode (depending on the design).

The current ObserverReadProxyProvider is not affected, since both RetriableException and StandbyException make it fail over.
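
The retry arithmetic above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Router or client code from Hadoop: the classes Namenode and Router and the functions client_call and simulate are hypothetical stand-ins, the two exception classes are plain Python exceptions named after the Hadoop ones, and the defaults of 3 and 10 are taken from the description and treated here as total attempts.

```python
class RetriableException(Exception):
    """Stand-in for Hadoop's RetriableException (illustrative only)."""

class StandbyException(Exception):
    """Stand-in for Hadoop's StandbyException (illustrative only)."""

class Namenode:
    """Hypothetical namenode that counts how many calls it receives."""
    def __init__(self, name, in_safemode=False, safemode_exc=RetriableException):
        self.name = name
        self.in_safemode = in_safemode
        self.safemode_exc = safemode_exc
        self.calls = 0

    def get_block_locations(self, path):
        self.calls += 1
        if self.in_safemode:
            raise self.safemode_exc(f"{self.name} is in safemode")
        return f"blocks of {path} from {self.name}"

class Router:
    """Hypothetical Router: retries on RetriableException, fails over on StandbyException."""
    def __init__(self, namenodes, max_attempts=3):
        self.namenodes = namenodes
        self.max_attempts = max_attempts

    def invoke(self, path):
        last_error = None
        for nn in self.namenodes:
            for _ in range(self.max_attempts):
                try:
                    return nn.get_block_locations(path)
                except StandbyException as e:
                    last_error = e
                    break  # fail over to the next namenode
                except RetriableException as e:
                    last_error = e  # retry the same namenode
            else:
                # Attempts exhausted without failover: surface the error to the client.
                raise last_error
        raise last_error

def client_call(router, path, max_attempts=10):
    """Hypothetical client that retries the same Router on RetriableException."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return router.invoke(path)
        except RetriableException as e:
            last_error = e
    raise last_error

def simulate(safemode_exc):
    """Return (observer calls, active calls, result) for one client call."""
    observer = Namenode("observer", in_safemode=True, safemode_exc=safemode_exc)
    active = Namenode("active")
    router = Router([observer, active])
    try:
        result = client_call(router, "/tmp/file")
    except RetriableException:
        result = None
    return observer.calls, active.calls, result

if __name__ == "__main__":
    print(simulate(RetriableException))
    print(simulate(StandbyException))
```

In this model, a safemode Observer that throws RetriableException is hit 3 * 10 = 30 times and the call still fails, while one that throws StandbyException is hit once before the Router moves on to the next namenode.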