@Xiaoqiao He Thanks very much for responding.

1. Yes, this proposal is related to the RBF and ARR features.
2. Usually, slow nameservices do not affect the performance of other, normal nameservices, thanks to the async handler thread pool. But there is an extremely rare situation in which the invoke is retried (on an async responder thread) and the target nameservice performs poorly. In that case, retries against the slow nameservice can occupy all threads in the shared async responder thread pool.
Best Regards,
- Zhang Haobo

---- Replied Message ----
| From | Xiaoqiao He<hexiaoq...@apache.org> |
| Date | 04/23/2025 20:57 |
| To | Steve Loughran<ste...@cloudera.com> |
| Cc | Zhanghaobo<hfutzhan...@163.com> , common-dev@hadoop.apache.org<common-dev@hadoop.apache.org> , Hdfs-dev<hdfs-...@hadoop.apache.org> |
| Subject | Re: [DISCUSS] Optimize the invoke retry of async repsonder when some nameservices are slow |

+ hdfs-dev.

Thanks Haobo for your proposal. As you mentioned above, this may be related to the RBF and ARR features, right? IMO, it is necessary to improve responder performance, but I am a little confused about how a slow nameservice would slow down the whole system. At first glance, I would expect the client or package size to be more affected. Anyway, improving responder performance would be a good point, especially for the HDFS Router. Thanks again.

Best Regards,
- He Xiaoqiao

On Wed, Apr 23, 2025 at 6:52 PM Steve Loughran <ste...@cloudera.com.invalid> wrote:

think you meant to send this to hdfs-dev, not the -subscribe list.

On Wed, 16 Apr 2025 at 14:09, Zhanghaobo <hfutzhan...@163.com> wrote:

Hello everyone,

Sorry to bother you. I would like to discuss how to optimize the invoke retry logic of the async responder when some nameservices are slow.

Currently, all nameservices share one async responder thread pool, which lives in the class AsyncRpcProtocolPBUtil. Consider the following situation: we have two nameservices, ns1 and ns2, and the performance of ns2 is very poor. If we hit the invoke retry logic in handlerInvokeException, we use a thread from the async responder thread pool to perform the retry. Because ns2 is slow, in the extreme case every thread in the async responder thread pool ends up occupied by retries against the ns2 cluster.

This is a critical problem, and my idea is to separate the async responder thread pool by nameservice, just like the async handler thread pool.

I would appreciate hearing your thoughts.
Best Regards,
- Zhang Haobo
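
To make the proposal in the original message concrete, here is a minimal sketch of what "one async responder pool per nameservice" could look like. It is not taken from the Hadoop code base: the class name PerNameserviceResponderPools, the methods responderFor and submitRetry, and the fixed pool size are all hypothetical and only illustrate the isolation idea. The demo blocks every ns2 responder thread and checks that a retry for ns1 still runs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not Hadoop's actual code: one responder pool per nameservice. */
class PerNameserviceResponderPools {
  private final int threadsPerNameservice;
  private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();

  PerNameserviceResponderPools(int threadsPerNameservice) {
    this.threadsPerNameservice = threadsPerNameservice;
  }

  /** Lazily creates the pool for nsId, so each nameservice gets its own threads. */
  ExecutorService responderFor(String nsId) {
    return pools.computeIfAbsent(nsId,
        id -> Executors.newFixedThreadPool(threadsPerNameservice, r -> {
          Thread t = new Thread(r, "async-responder-" + id);
          t.setDaemon(true); // do not keep the JVM alive for stuck retries
          return t;
        }));
  }

  /** Runs a retry on the pool that belongs to the target nameservice only. */
  void submitRetry(String nsId, Runnable retry) {
    responderFor(nsId).execute(retry);
  }

  void shutdown() {
    pools.values().forEach(ExecutorService::shutdownNow);
  }

  public static void main(String[] args) throws Exception {
    PerNameserviceResponderPools pools = new PerNameserviceResponderPools(2);
    CountDownLatch release = new CountDownLatch(1);
    // Occupy every ns2 responder thread with a "slow" retry.
    for (int i = 0; i < 2; i++) {
      pools.submitRetry("ns2", () -> {
        try {
          release.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    // A retry for ns1 uses a different pool and is not queued behind ns2.
    CountDownLatch ns1Done = new CountDownLatch(1);
    pools.submitRetry("ns1", ns1Done::countDown);
    System.out.println("ns1 retry completed: " + ns1Done.await(5, TimeUnit.SECONDS));
    release.countDown();
    pools.shutdown();
  }
}
```

The trade-off of this layout is more threads overall (pool size times the number of nameservices) in exchange for isolation, so the per-nameservice size would probably need to be configurable in a real patch.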