Hi Hui Fei,

Thank you for your suggestion. Adding the documentation will not take much time. I can add a subtask for it and work on it soon, so you can focus on the main functionality first. Once the documentation is complete, I will open a PR for everyone to review.

Best Regards,
Jian Zhang

Hui Fei <feihui.u...@gmail.com> wrote on Wed, Jan 22, 2025 at 17:39:

> Got your idea. Thank you!
> - How about moving the unfinished tasks under a new task as subtasks,
> like ARR improvements? If this feature is completed but there are
> still some open tasks, it looks strange.
> - Will it take a long time to add documentation? Discussion may last for
> several days. If it takes a long time, I think it may block the trunk
> release, and all community members would need to remember that
> documentation is still required. It doesn't look good. That's my thought,
> and we can wait for others' opinions.
>
> jian zhang <keeprom...@apache.org> wrote on Wed, Jan 22, 2025 at 16:53:
>
>> Hi, Hui Fei,
>> - The remaining 3 sub tasks are not related to the core functions of
>> the asynchronous router and have little impact on the trunk branch.
>> We can wait until HDFS-17531 is merged into the trunk, and then submit
>> the remaining PRs directly to the trunk.
>> - It is indeed necessary to add documentation to
>> "HDFSRouterFederation.md". How about submitting a PR to do this after
>> merging HDFS-17531 into the trunk branch?
>>
>> Best Regards,
>> Jian Zhang
>>
>> Hui Fei <feihui.u...@gmail.com> wrote on Wed, Jan 22, 2025 at 16:24:
>>
>>> Thanks for your great work, looking forward to this feature.
>>>
>>> Some comments from me.
>>> - I checked and found that there are still 3 sub tasks under this
>>> feature jira ticket. Do they need to be resolved?
>>> - I didn't find the documentation for this feature. It's a key feature.
>>> Is it necessary to add documentation to HDFSRouterFederation.md?
>>>
>>> jian zhang <zjkeeprom...@gmail.com> wrote on Wed, Jan 22, 2025 at 10:29:
>>>
>>>> Hi, all, the development of the asynchronous router functionality has
>>>> been completed. The development branch is HDFS-17531, and it is ready to be
>>>> merged into the trunk branch.
>>>>
>>>> JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
>>>> PR: https://github.com/apache/hadoop/pull/7308
>>>>
>>>> Here is the functionality introduction of the asynchronous router for
>>>> everyone to review:
>>>> I. Overview
>>>>
>>>> The asynchronous router aims to address the performance bottlenecks
>>>> of the synchronous router in high-concurrency and multi-nameservice
>>>> scenarios. By introducing an asynchronous processing mechanism, it
>>>> optimizes the request handling process and improves the system's
>>>> concurrency and resource utilization. It is particularly suitable
>>>> for federated scenarios where multiple downstream services
>>>> (nameservices, NS) need to be served.
>>>>
>>>> II. Problems of the Synchronous Router
>>>>
>>>> - Performance Bottleneck: The performance of the synchronous router
>>>> is limited by the number of handler threads. Even if the connection thread
>>>> can still forward requests to the downstream namenode, the handler must
>>>> wait for each request to complete before processing the next one, resulting
>>>> in limited processing capacity.
>>>> - Thread Resource Waste: Increasing the number of handler threads to
>>>> improve performance leads to more thread context switches, which
>>>> instead reduces system efficiency. At the same time, a large number of
>>>> handler threads sit in a blocked state, wasting thread resources.
>>>> - Poor Isolation in Multi-ns: If one downstream nameservice performs
>>>> poorly, handlers wait on it for a long time. This delays the forwarding
>>>> of requests to the other, healthy ns and lowers the overall downstream
>>>> performance perceived by the client.
>>>> - Ineffective Utilization of Federation Multi-ns Performance: In
>>>> high-concurrency scenarios, a large number of requests may be backlogged
>>>> in the router's request queue while the queues of downstream services are
>>>> not fully utilized, so resources are poorly allocated.
>>>>
>>>> III. Design and Improvements of the Asynchronous Router
>>>>
>>>> The asynchronous router solves the above problems by redesigning
>>>> the request handling process and introducing an asynchronous processing
>>>> mechanism. Its core improvements include:
>>>>
>>>> - Handler: Retrieves requests from the request queue for
>>>> preliminary processing. If the request is invalid (for example, the
>>>> mount point does not exist), it puts the response directly into the
>>>> response queue; otherwise, it sends the request to the asynchronous
>>>> handler thread pool.
>>>> - Async Handler: Puts the request into the call queue
>>>> (connection.calls) of the connection thread and returns immediately
>>>> without blocking to wait for the result.
>>>> - Async Responder: Processes the responses received by the connection
>>>> thread. If the request needs to be re-initiated (for example, the
>>>> downstream service returns a standby exception), it re-adds the request
>>>> to the asynchronous handler thread pool; otherwise, it puts the
>>>> response into the response queue.
>>>> - Responder: Retrieves the response from the response queue and
>>>> returns it to the client.
>>>>
>>>> IV. Advantages of the Asynchronous Router
>>>>
>>>> - High-Concurrency Performance: Through the asynchronous
>>>> processing mechanism, the asynchronous router can handle a large number of
>>>> requests simultaneously, significantly improving the system's concurrent
>>>> processing ability.
>>>> - High Resource Utilization: It avoids thread blocking and frequent
>>>> context switching, reduces thread resource waste, and improves the
>>>> overall efficiency of the system.
>>>> - Isolation: Different ns are processed by different async handler
>>>> thread pools, which isolates the downstream services from each other.
>>>> Even if one service performs poorly, it does not affect the
>>>> processing ability of the other services.
>>>>
>>>> V. Summary
>>>>
>>>> The asynchronous router solves the performance bottleneck problem
>>>> of the traditional synchronous router in high-concurrency scenarios by
>>>> introducing an asynchronous processing mechanism. It not only improves the
>>>> system's concurrency and resource utilization but also achieves
>>>> isolation of downstream services through the queue mechanism, enhancing the
>>>> system's stability and adaptability. In federated scenarios where
>>>> multiple downstream services need to be served, the asynchronous router
>>>> is a more efficient and reliable solution.
>>>>
>>>> VI. Performance Testing
>>>>
>>>> https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1
>>>>
>>>> VII. JIRA & PRs
>>>>
>>>> For more information, please refer to JIRA:
>>>> JIRA: RBF: Asynchronous router RPC:
>>>> https://issues.apache.org/jira/browse/HDFS-17531
>>>> PRs:
>>>> HDFS-17543. [ARR] AsyncUtil makes asynchronous code more concise
>>>> and easier.
>>>> HADOOP-19235. IPC client uses CompletableFuture to support
>>>> asynchronous operations.
>>>> HDFS-17544. [ARR] The router client rpc protocol PB supports
>>>> asynchrony.
>>>> HDFS-17545. [ARR] router async rpc client.
>>>> HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
>>>> HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
>>>> HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
>>>> HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
>>>> HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
>>>> HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol
>>>> supports asynchronous rpc.
>>>> HDFS-17659. [ARR] Router Quota supports asynchronous rpc.
>>>> HDFS-17672. [ARR] Move asynchronous related classes to the async
>>>> package.
>>>> HADOOP-19361. RPC DeferredMetrics bugfix.
>>>> HDFS-17640. [ARR] RouterClientProtocol supports asynchronous rpc.
>>>> HDFS-17650. [ARR] The router server-side rpc protocol PB supports
>>>> asynchrony.
>>>> HDFS-17651. [ARR] Async handler executor isolation.
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>>>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
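
For anyone reading the design in sections III and IV of the original mail, the request flow can be sketched roughly as follows. This is only a toy model built on java.util.concurrent, not the code in HDFS-17531: the class AsyncRouterSketch, its methods, the "STANDBY" reply string, the retry budget, and the pool size are all hypothetical names and values chosen for illustration. It assumes a plain map as the mount table and a function standing in for the downstream namenode call.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

/** Illustrative only: every name here is hypothetical, not a Hadoop class. */
final class AsyncRouterSketch {
  private static final int MAX_RETRIES = 3;  // assumed retry budget

  // Stand-in for the mount table: path -> nameservice id.
  private final Map<String, String> mountTable;
  // Stand-in for the downstream namenode call: (ns, path) -> reply.
  private final BiFunction<String, String, String> downstream;
  // One async handler pool per nameservice, so a slow ns cannot starve others.
  private final Map<String, ExecutorService> asyncHandlers = new ConcurrentHashMap<>();

  AsyncRouterSketch(Map<String, String> mountTable,
                    BiFunction<String, String, String> downstream) {
    this.mountTable = mountTable;
    this.downstream = downstream;
  }

  /** Handler step: validate the request, then hand it off without blocking. */
  CompletableFuture<String> handle(String path) {
    String ns = mountTable.get(path);
    if (ns == null) {
      // Invalid request: answer directly, no async pool involved.
      return CompletableFuture.completedFuture("ERROR: no mount point for " + path);
    }
    return submit(ns, path, MAX_RETRIES);
  }

  /** Async handler step: run the downstream call on the pool owned by this ns. */
  private CompletableFuture<String> submit(String ns, String path, int retriesLeft) {
    ExecutorService pool = asyncHandlers.computeIfAbsent(ns, AsyncRouterSketch::newPool);
    return CompletableFuture
        .supplyAsync(() -> downstream.apply(ns, path), pool)
        // Async responder step: re-submit on a standby reply, else pass it on.
        .thenCompose(reply -> "STANDBY".equals(reply) && retriesLeft > 0
            ? submit(ns, path, retriesLeft - 1)
            : CompletableFuture.completedFuture(reply));
  }

  private static ExecutorService newPool(String ns) {
    return Executors.newFixedThreadPool(2, task -> {
      Thread t = new Thread(task, "async-handler-" + ns);
      t.setDaemon(true);  // let the JVM exit without an explicit shutdown
      return t;
    });
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    AsyncRouterSketch router = new AsyncRouterSketch(
        Map.of("/data", "ns1", "/logs", "ns2"),
        // First call pretends the namenode is in standby; later calls succeed.
        (ns, path) -> calls.getAndIncrement() == 0 ? "STANDBY" : ns + " served " + path);
    System.out.println(router.handle("/data").join());
    System.out.println(router.handle("/missing").join());
  }
}
```

In this sketch the returned CompletableFuture plays the role of the response queue, and the caller that joins on it plays the role of the responder. The real router uses explicit queues and connection threads, which the sketch leaves out.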