Thanks for your great work; I'm looking forward to this feature.

A few comments from me:
 - I checked and found that there are still 3 sub-tasks under this feature's
JIRA ticket. Do they need to be resolved?
 - I didn't find any documentation for this feature. Since it is a key
feature, should documentation be added to HDFSRouterFederation.md?

On Wed, Jan 22, 2025 at 10:29, jian zhang <zjkeeprom...@gmail.com> wrote:

> Hi, all, the development of the asynchronous router functionality has been
> completed. The development branch is HDFS-17531, and it is ready to be
> merged into the trunk branch.
>
> JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
> PR: https://github.com/apache/hadoop/pull/7308
>
> Here is an introduction to the asynchronous router for everyone to
> review:
>
> I. Overview
>
>     The asynchronous router aims to address the performance bottlenecks
> of the synchronous router in high-concurrency and multi-nameservice
> scenarios. By introducing an asynchronous processing mechanism, it
> optimizes request handling and improves the system's concurrency and
> resource utilization. It is particularly suitable for federated scenarios
> in which requests must be forwarded to multiple downstream nameservices
> (NS).
>
> II. Problems of the Synchronous Router
>
>     - Performance Bottleneck: The performance of the synchronous router is
> limited by the number of handler threads. Even if the connection thread
> could still forward requests to the downstream namenode, each handler must
> wait for its current request to complete before processing the next one,
> which limits processing capacity.
>     - Thread Resource Waste: Increasing the number of handler threads to
> improve performance leads to more thread context switches, which in turn
> reduces system efficiency. At the same time, a large number of handler
> threads sit in a blocked state, wasting thread resources.
>     - Poor Isolation Across Multiple NS: If one downstream nameservice
> performs poorly, handlers wait on it for a long time. This delays the
> forwarding of requests to the other, normally performing nameservices, so
> the client perceives degraded performance across all downstream
> nameservices.
>     - Ineffective Use of Federation Multi-NS Capacity: In high-concurrency
> scenarios, a large number of requests may back up in the router's request
> queue while the queues of the downstream services are not fully utilized,
> so resources are allocated poorly.
>
> III. Design and Improvements of the Asynchronous Router
>
>     The asynchronous router solves the above problems by redesigning the
> request handling process and introducing an asynchronous processing
> mechanism. Its core improvements include:
>
>     - Handler: Retrieves requests from the request queue for preliminary
> processing. If the request fails at this stage (for example, the mount
> point does not exist), it puts the response directly into the response
> queue; otherwise, it sends the request to the asynchronous handler thread
> pool.
>     - Async Handler: Puts the request into the call queue
> (connection.calls) of the connection thread and returns immediately
> without blocking to wait for the result.
>     - Async Responder: Processes the responses received by the connection
> thread. If the request needs to be retried (for example, the downstream
> service returns a standby exception), it re-adds the request to the
> asynchronous handler thread pool; otherwise, it puts the response into the
> response queue.
>     - Responder: Retrieves the response from the response queue and
> returns it to the client.
>
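
A minimal Java sketch of the four-stage flow described above, for illustration only. It does not use the actual router classes from the HDFS-17531 branch: MOUNTS, callNamenode, ASYNC_HANDLER, and ASYNC_RESPONDER are hypothetical stand-ins, and plain CompletableFuture chaining is assumed in place of the real IPC call queue.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncRouterSketch {
    // Hypothetical mount table: path prefix -> nameservice id.
    static final Map<String, String> MOUNTS = Map.of("/data", "ns0", "/logs", "ns1");

    // Hypothetical stand-ins for the async handler and async responder pools.
    static final ExecutorService ASYNC_HANDLER = daemonPool(2);
    static final ExecutorService ASYNC_RESPONDER = daemonPool(2);

    // Fake downstream: the first call to ns1 fails as if the namenode were standby.
    static final AtomicInteger NS1_ATTEMPTS = new AtomicInteger();

    static ExecutorService daemonPool(int size) {
        return Executors.newFixedThreadPool(size, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // do not keep the JVM alive for this demo
            return t;
        });
    }

    static String callNamenode(String ns, String path) {
        if (ns.equals("ns1") && NS1_ATTEMPTS.getAndIncrement() == 0) {
            throw new IllegalStateException("standby");
        }
        return ns + ":" + path;
    }

    // Handler: preliminary processing. A missing mount point is answered
    // directly; anything else is handed to the async handler pool.
    static CompletableFuture<String> handle(String path) {
        String ns = MOUNTS.entrySet().stream()
            .filter(e -> path.startsWith(e.getKey()))
            .map(Map.Entry::getValue)
            .findFirst()
            .orElse(null);
        if (ns == null) {
            return CompletableFuture.completedFuture("ERROR: no mount point for " + path);
        }
        return submit(ns, path, 1);
    }

    // Async handler issues the call without blocking the caller; the async
    // responder either completes the response or re-submits the request.
    static CompletableFuture<String> submit(String ns, String path, int retriesLeft) {
        return CompletableFuture
            .supplyAsync(() -> callNamenode(ns, path), ASYNC_HANDLER)
            .<CompletableFuture<String>>handleAsync((result, error) -> {
                if (error == null) {
                    return CompletableFuture.completedFuture(result);
                }
                if (retriesLeft > 0) {
                    return submit(ns, path, retriesLeft - 1); // retry after failure
                }
                return CompletableFuture.completedFuture("ERROR: " + error.getMessage());
            }, ASYNC_RESPONDER)
            .thenCompose(f -> f);
    }

    public static void main(String[] args) {
        // join() plays the role of the responder returning the result to the client.
        System.out.println(handle("/data/a").join());
        System.out.println(handle("/logs/b").join());
        System.out.println(handle("/tmp/c").join());
    }
}
```

The point of the sketch is that handle() returns a future immediately, so the calling thread is free to pick up the next request while the downstream call is still in flight.
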
> IV. Advantages of the Asynchronous Router
>
>     - High-Concurrency Performance: Through the asynchronous processing
> mechanism, the asynchronous router can handle a large number of requests
> simultaneously, significantly improving the system's concurrent processing
> capacity.
>     - High Resource Utilization: It avoids thread blocking and frequent
> context switching, reduces thread resource waste, and improves the overall
> efficiency of the system.
>     - Isolation: Different nameservices are processed by different async
> handler thread pools, which isolates the downstream services from one
> another. Even if one service performs poorly, it does not affect the
> processing capacity of the others.
>
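
The isolation point can be sketched as one executor per nameservice. This is only an illustration under assumptions: poolFor, invoke, and awaitQuietly are hypothetical names, each pool has a single thread to make the effect obvious, and the real implementation (HDFS-17651) may be structured differently.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class NsIsolationSketch {
    // One async handler pool per nameservice, created on first use.
    static final Map<String, ExecutorService> POOLS = new ConcurrentHashMap<>();

    static ExecutorService poolFor(String ns) {
        return POOLS.computeIfAbsent(ns, k -> Executors.newFixedThreadPool(1, r -> {
            Thread t = new Thread(r, "async-handler-" + k);
            t.setDaemon(true); // do not keep the JVM alive for this demo
            return t;
        }));
    }

    // Run a downstream call on the pool that belongs to its nameservice.
    static <T> CompletableFuture<T> invoke(String ns, Supplier<T> call) {
        return CompletableFuture.supplyAsync(call, poolFor(ns));
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        CountDownLatch release = new CountDownLatch(1);

        // ns-slow stays blocked until the latch is released.
        CompletableFuture<String> slow = invoke("ns-slow", () -> {
            awaitQuietly(release);
            return "slow done";
        });
        CompletableFuture<String> fast = invoke("ns-fast", () -> "fast done");

        // Completes even though ns-slow is still blocked on its own pool.
        System.out.println(fast.join());

        release.countDown();
        System.out.println(slow.join());
    }
}
```

If both nameservices shared one single-thread pool, fast.join() would wait behind the blocked ns-slow call; with separate pools it completes independently.
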
> V. Summary
>
>     The asynchronous router solves the performance bottleneck of the
> traditional synchronous router in high-concurrency scenarios by
> introducing an asynchronous processing mechanism. It not only improves the
> system's concurrency and resource utilization but also isolates downstream
> services through the queue mechanism, enhancing the system's stability and
> adaptability. In federated scenarios where multiple downstream services
> must be served, the asynchronous router is a more efficient and reliable
> solution.
>
> VI. Performance Testing
>
>
> https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1
>
> VII. JIRA & PRs
>
>     For more information, please refer to the JIRA:
>     RBF: Asynchronous router RPC:
> https://issues.apache.org/jira/browse/HDFS-17531
>     PRs:
>     HDFS-17543. [ARR] AsyncUtil makes asynchronous code more concise and
> easier.
>     HADOOP-19235. IPC client uses CompletableFuture to support
> asynchronous operations.
>     HDFS-17544. [ARR] The router client rpc protocol PB supports
> asynchrony.
>     HDFS-17545. [ARR] router async rpc client.
>     HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
>     HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
>     HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
>     HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
>     HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
>     HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol
> supports asynchronous rpc.
>     HDFS-17659. [ARR] Router Quota supports asynchronous rpc.
>     HDFS-17672. [ARR] Move asynchronous related classes to the async
> package.
>     HADOOP-19361. RPC DeferredMetrics bugfix.
>     HDFS-17640. [ARR] RouterClientProtocol supports asynchronous rpc.
>     HDFS-17650. [ARR] The router server-side rpc protocol PB supports
> asynchrony.
>     HDFS-17651. [ARR] Async handler executor isolation.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>
