Hi Jian Zhang,

Thanks for your great work. Please resolve the merge conflict first; everything else makes sense to me. I will give my +1 once it is ready. One more thing: before checking in, we need to start a separate official vote thread. Good luck.
BTW, Happy Lunar New Year!

Best Regards,
- He Xiaoqiao

On Thu, Feb 6, 2025 at 5:30 PM jian zhang <keeprom...@apache.org> wrote:

> Hi, all,
> This feature has now been developed and has passed the pipeline. Please
> continue to help review it.
>
> Best Regards,
> Jian Zhang
>
> Zhanghaobo <hfutzhan...@163.com> wrote on Wed, Jan 22, 2025 at 18:22:
>
>> @Hui Fei Hi, Sir:
>> Regarding the first point, I have created an umbrella JIRA,
>> https://issues.apache.org/jira/browse/HDFS-17716
>> and moved the non-core JIRAs under it.
>>
>> Best Wishes
>> Haobo Zhang
>>
>> ---- Replied Message ----
>> From: Hui Fei <feihui.u...@gmail.com>
>> Date: 01/22/2025 17:37
>> To: jian zhang <keeprom...@apache.org>
>> Cc: Hdfs-dev <hdfs-dev@hadoop.apache.org>, <priv...@hadoop.apache.org>,
>> Xiaoqiao He <hexiaoq...@apache.org>, <common-...@hadoop.apache.org>
>> Subject: Re: [DISCUSS] Request to merge branch HDFS-17531 into trunk.
>>
>> Got your idea. Thank you!
>> - How about removing the unfinished tasks and placing them under a new
>> task as subtasks, like ARR improvements? If this feature is completed
>> but there are still some open tasks, it looks strange.
>> - Will it take a long time to add documentation? The discussion may last
>> for several days. If the documentation takes a long time, I think it may
>> block the trunk release, and all community members will need to remember
>> that documentation is still required. That doesn't look good. That's my
>> thought; we can wait for others' opinions.
>>
>> jian zhang <keeprom...@apache.org> wrote on Wed, Jan 22, 2025 at 16:53:
>>
>> Hi, Hui Fei,
>> - The remaining 3 sub tasks are not related to the core functions of
>> the asynchronous router and have little impact on the trunk branch. We
>> can wait until HDFS-17531 is merged into trunk and then submit the
>> remaining PRs directly to trunk.
>> - It is indeed necessary to add documentation to
>> "HDFSRouterFederation.md". How about submitting a PR for this after
>> merging HDFS-17531 into the trunk branch?
>>
>> Best Regards,
>> Jian Zhang
>>
>> Hui Fei <feihui.u...@gmail.com> wrote on Wed, Jan 22, 2025 at 16:24:
>>
>> Thanks for your great work, looking forward to this feature.
>>
>> Some comments from me.
>> - I checked and found that there are still 3 sub tasks under this
>> feature JIRA ticket. Do they need to be resolved?
>> - I didn't find the documentation for this feature. It's a key feature.
>> Is it necessary to add documentation to HDFSRouterFederation.md?
>>
>> jian zhang <zjkeeprom...@gmail.com> wrote on Wed, Jan 22, 2025 at 10:29:
>>
>> Hi, all, the development of the asynchronous router functionality has
>> been completed. The development branch is HDFS-17531, and it is ready
>> to be merged into the trunk branch.
>>
>> JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
>> PR: https://github.com/apache/hadoop/pull/7308
>>
>> Here is the functionality introduction of the asynchronous router for
>> everyone to review:
>>
>> I. Overview
>>
>> The asynchronous router aims to address the performance bottleneck of
>> the synchronous router in high-concurrency and multi-nameservice
>> scenarios. By introducing an asynchronous processing mechanism, it
>> optimizes the request handling process and improves the system's
>> concurrency and resource utilization. It is particularly suitable for
>> federated scenarios where multiple downstream services (NS) need to be
>> processed.
>>
>> II. Problems of the Synchronous Router
>>
>> - Performance Bottleneck: The performance of the synchronous router
>> is limited by the number of handler threads. Even if the connection
>> thread can still forward requests to the downstream namenode, the
>> handler must wait for each request to complete before processing the
>> next one, resulting in limited processing capacity.
>> - Thread Resource Waste: Increasing the number of handler threads to
>> improve performance leads to more thread switches, which instead
>> reduces system efficiency. At the same time, a large number of handler
>> threads sit in a blocked state, wasting thread resources.
>> - Poor Isolation in Multi-ns: If one downstream nameservice performs
>> poorly, the handler waits on it for a long time. This delays the
>> forwarding of requests to the other, normally performing ns, so the
>> client perceives a drop in the overall performance of the downstream
>> ns services.
>> - Ineffective Utilization of Federation Multi-ns Performance: In
>> high-concurrency scenarios, a large number of requests may back up in
>> the router's request queue while the queues of the downstream services
>> are not fully utilized, leading to poor resource allocation.
>>
>> III. Design and Improvements of the Asynchronous Router
>>
>> The asynchronous router solves the above problems by redesigning the
>> request handling process and introducing an asynchronous processing
>> mechanism. Its core improvements include:
>>
>> - Handler: Retrieves requests from the request queue for preliminary
>> processing. If the request is invalid (for example, the mount point
>> does not exist), it puts the response directly into the response
>> queue; otherwise, it sends the request to the asynchronous handler
>> thread pool.
>> - Async Handler: Puts the request into the call queue
>> (connection.calls) of the connection thread and returns immediately
>> without blocking or waiting.
>> - Async Responder: Is responsible for processing the responses
>> received by the connection thread.
>> If the request needs to be re-initiated (for example, the downstream
>> service returns a standby exception), it re-adds the request to the
>> asynchronous handler thread pool; otherwise, it puts the response into
>> the response queue.
>> - Responder: Retrieves the response from the response queue and
>> returns it to the client.
>>
>> IV. Advantages of the Asynchronous Router
>>
>> - High-Concurrency Performance: Through the asynchronous processing
>> mechanism, the asynchronous router can handle a large number of
>> requests simultaneously, significantly improving the system's
>> concurrent processing ability.
>> - High Resource Utilization: It avoids thread blocking and frequent
>> switching, reduces thread resource waste, and improves the overall
>> efficiency of the system.
>> - Isolation: Different ns are processed by different async handler
>> thread pools, which isolates the downstream services from one another.
>> Even if one service performs poorly, it does not affect the processing
>> ability of the other services.
>>
>> V. Summary
>>
>> The asynchronous router solves the performance bottleneck of the
>> traditional synchronous router in high-concurrency scenarios by
>> introducing an asynchronous processing mechanism. It not only improves
>> the system's concurrency and resource utilization but also isolates
>> downstream services through the queue mechanism, enhancing the system's
>> stability and adaptability. In federated scenarios where multiple
>> downstream services need to be processed, the asynchronous router is a
>> more efficient and reliable solution.
>>
>> VI. Performance Testing
>>
>> https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1
>>
>> VII. JIRA & PRs
>>
>> For more information, please refer to JIRA:
>> JIRA: RBF: Asynchronous router RPC:
>> https://issues.apache.org/jira/browse/HDFS-17531
>> PRs:
>> HDFS-17543.
[ARR] AsyncUtil makes asynchronous code more concise and easier.
>> HADOOP-19235. IPC client uses CompletableFuture to support
>> asynchronous operations.
>> HDFS-17544. [ARR] The router client rpc protocol PB supports asynchrony.
>> HDFS-17545. [ARR] router async rpc client.
>> HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
>> HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
>> HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
>> HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
>> HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
>> HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol
>> supports asynchronous rpc.
>> HDFS-17659. [ARR] Router Quota supports asynchronous rpc.
>> HDFS-17672. [ARR] Move asynchronous related classes to the async
>> package.
>> HADOOP-19361. RPC DeferredMetrics bugfix.
>> HDFS-17640. [ARR] RouterClientProtocol supports asynchronous rpc.
>> HDFS-17650. [ARR] The router server-side rpc protocol PB supports
>> asynchrony.
>> HDFS-17651. [ARR] Async handler executor isolation.
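
For readers who want a concrete picture of the four stages described in section III (Handler, Async Handler, Async Responder, Responder), here is a minimal toy sketch. It is written in Python for brevity rather than in Hadoop's Java, and it is not the HDFS-17531 implementation: the names `AsyncRouter`, `FakeNameservice`, `StandbyError`, and `demo`, the mount table, and the pool size are all hypothetical, chosen only to illustrate the request flow and the one-pool-per-nameservice isolation.

```python
import queue
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class StandbyError(Exception):
    """Hypothetical stand-in for a downstream 'namenode is in standby' reply."""


class FakeNameservice:
    """Toy downstream: answers 'standby' a fixed number of times, then succeeds."""

    def __init__(self, name, standby_replies=0):
        self.name = name
        self._standby_replies = standby_replies
        self._lock = threading.Lock()

    def call(self, path):
        # Hand back a Future right away, mimicking a non-blocking connection.
        fut = Future()
        with self._lock:
            standby = self._standby_replies > 0
            if standby:
                self._standby_replies -= 1
        if standby:
            fut.set_exception(StandbyError(self.name))
        else:
            fut.set_result(f"{self.name}:{path}")
        return fut


class AsyncRouter:
    def __init__(self, mounts, nameservices):
        self.mounts = mounts              # path prefix -> nameservice id
        self.nameservices = nameservices  # nameservice id -> FakeNameservice
        self.responses = queue.Queue()    # the response queue
        # One async handler pool per nameservice, so a slow one cannot
        # occupy the threads that serve the others.
        self.pools = {ns: ThreadPoolExecutor(max_workers=2) for ns in nameservices}

    def handle(self, call_id, path):
        """Handler: resolve the mount point, then fail fast or hand off."""
        ns = next((n for p, n in self.mounts.items() if path.startswith(p)), None)
        if ns is None:
            self.responses.put((call_id, "error: no mount point"))
            return
        self.pools[ns].submit(self.async_handle, call_id, ns, path)

    def async_handle(self, call_id, ns, path):
        """Async handler: issue the call and return without waiting for it."""
        fut = self.nameservices[ns].call(path)
        fut.add_done_callback(lambda f: self.async_respond(call_id, ns, path, f))

    def async_respond(self, call_id, ns, path, fut):
        """Async responder: retry on standby, otherwise queue the response."""
        if isinstance(fut.exception(), StandbyError):
            self.pools[ns].submit(self.async_handle, call_id, ns, path)
        else:
            self.responses.put((call_id, fut.result()))

    def respond(self, expected):
        """Responder: drain `expected` responses, then stop the pools."""
        out = dict(self.responses.get(timeout=5) for _ in range(expected))
        for pool in self.pools.values():
            pool.shutdown()
        return out


def demo():
    router = AsyncRouter(
        mounts={"/a": "ns0", "/b": "ns1"},
        nameservices={
            "ns0": FakeNameservice("ns0"),
            "ns1": FakeNameservice("ns1", standby_replies=1),
        },
    )
    for call_id, path in enumerate(["/a/x", "/b/y", "/missing"]):
        router.handle(call_id, path)
    return router.respond(expected=3)
```

In this sketch, `demo()` sends one request to each of two nameservices plus one request with no mount point. The request to `ns1` gets one standby reply and is resubmitted by the async responder, while the unmounted path is answered directly by the handler without touching any pool.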