+1    

- Zhang Haobo


---- Replied Message ----
| From | Xiaoqiao He<hexiaoq...@apache.org> |
| Date | 02/14/2025 19:38 |
| To | Ayush Saxena<ayush...@gmail.com> |
| Cc | Hui Fei<feihui.u...@gmail.com> ,
jian zhang<keeprom...@apache.org> ,
Hdfs-dev<hdfs-dev@hadoop.apache.org> ,
<common-...@hadoop.apache.org> ,
<hfutzhan...@163.com> |
| Subject | Re: [VOTE] Request to merge branch HDFS-17531 into trunk. |
+1

Best Regards,
- He Xiaoqiao

On Fri, Feb 14, 2025 at 3:53 PM Ayush Saxena <ayush...@gmail.com> wrote:

+1

-Ayush

On 14 Feb 2025, at 1:20 PM, Hui Fei <feihui.u...@gmail.com> wrote:

Did some testing following the documentation.

- Compiled the source code and built a local cluster; the async
  feature worked fine.
- It is disabled by default.
- The thread count can be increased or decreased by changing the
  related configurations.

+1

On Fri, Feb 14, 2025 at 15:21, jian zhang <keeprom...@apache.org> wrote:

Hi all, the development of the asynchronous router functionality has
been completed. Thanks to all the contributors who participated in this
feature. The development branch is HDFS-17531, and it is ready to be
merged into the trunk branch.

JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
PR: https://github.com/apache/hadoop/pull/7308

DISCUSS:
https://lists.apache.org/thread/02y3dtpfxt21bxjgmyl3kxnv4m1vwz44

Here is an introduction to the asynchronous router for everyone to
review:

I. Overview

The asynchronous router aims to address the performance bottlenecks of
the synchronous router in high-concurrency, multi-nameservice
scenarios. By introducing an asynchronous processing mechanism, it
optimizes the request handling process and improves the system's
concurrency and resource utilization. It is particularly suitable for
federated scenarios where requests must be forwarded to multiple
downstream nameservices (NS).

II. Problems of the Synchronous Router

- Performance Bottleneck: The throughput of the synchronous router is
  limited by the number of handler threads. Even when the connection
  thread could still forward requests to the downstream namenode, each
  handler must wait for its current request to complete before
  processing the next one, which caps processing capacity.
- Thread Resource Waste: Adding handler threads to improve performance
  causes more thread context switches, which in turn reduces system
  efficiency. Meanwhile, a large number of handler threads sit in a
  blocked state, wasting thread resources.
- Poor Isolation in Multi-ns: If one downstream nameservice performs
  poorly, handlers wait on it for a long time. This delays the
  forwarding of requests to the other, healthy nameservices, so clients
  perceive a drop in the performance of all downstream ns.
- Ineffective Utilization of Federation Multi-ns Performance: In
  high-concurrency scenarios, a large number of requests may back up in
  the router's request queue while the queues of the downstream
  services are not fully utilized, so capacity is poorly allocated.

III. Design and Improvements of the Asynchronous Router

The asynchronous router solves the above problems by redesigning the
request handling process and introducing an asynchronous processing
mechanism. Its core improvements include:

- Handler: Retrieves requests from the request queue for preliminary
  processing. If the request fails with an exception (for example, the
  mount point does not exist), it puts the response directly into the
  response queue; otherwise, it sends the request to the asynchronous
  handler thread pool.
- Async Handler: Puts the request into the call queue
  (connection.calls) of the connection thread and returns immediately
  without blocking.
- Async Responder: Processes the responses received by the connection
  thread. If the request needs to be retried (for example, the
  downstream service returns a standby exception), it re-adds the
  request to the asynchronous handler thread pool; otherwise, it puts
  the response into the response queue.
- Responder: Retrieves the response from the response queue and
  returns it to the client.
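
To make the four stages above concrete, here is a minimal Java sketch
of the flow. It is not the code from the HDFS-17531 branch: the class
AsyncRouterSketch, its fields and methods, the mount-point set, and
the simulated namenode (which answers STANDBY once and then succeeds)
are all hypothetical stand-ins, and plain executors play the role of
the real handler, connection, and responder threads.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: every name here is hypothetical, not a Hadoop API. */
public class AsyncRouterSketch {
  // Assumed mount table; the real router resolves paths differently.
  static final Set<String> MOUNT_POINTS = Set.of("/ns0", "/ns1");

  final BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>();
  final ExecutorService asyncHandlers = Executors.newFixedThreadPool(2);
  final ExecutorService asyncResponders = Executors.newFixedThreadPool(2);
  // Stand-in for the connection thread and its call queue.
  final ExecutorService connection = Executors.newSingleThreadExecutor();
  final AtomicInteger downstreamCalls = new AtomicInteger();

  /** Handler: preliminary checks, then hand off without waiting. */
  void handle(String path) {
    if (!MOUNT_POINTS.contains(path)) {
      responseQueue.add("ERROR no mount point: " + path);
      return;
    }
    asyncHandlers.execute(() -> dispatch(path));
  }

  /** Async Handler: enqueue the call on the connection, return at once. */
  void dispatch(String path) {
    CompletableFuture
        .supplyAsync(() -> callNamenode(path), connection)
        .thenAcceptAsync(reply -> respond(path, reply), asyncResponders);
  }

  /** Async Responder: retry on standby, otherwise queue the response. */
  void respond(String path, String reply) {
    if ("STANDBY".equals(reply)) {
      asyncHandlers.execute(() -> dispatch(path));
    } else {
      responseQueue.add(reply);
    }
  }

  /** Simulated downstream: the first call ever made reports standby. */
  String callNamenode(String path) {
    return downstreamCalls.getAndIncrement() == 0 ? "STANDBY" : "OK " + path;
  }

  /** Responder stand-in: take the next response, waiting up to 5 seconds. */
  String takeResponse() {
    try {
      String response = responseQueue.poll(5, TimeUnit.SECONDS);
      if (response == null) {
        throw new IllegalStateException("timed out waiting for a response");
      }
      return response;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  void shutdown() {
    asyncHandlers.shutdown();
    asyncResponders.shutdown();
    connection.shutdown();
  }

  public static void main(String[] args) {
    AsyncRouterSketch router = new AsyncRouterSketch();
    router.handle("/missing");                 // rejected by the Handler
    System.out.println(router.takeResponse());
    router.handle("/ns0");                     // one standby retry, then OK
    System.out.println(router.takeResponse());
    router.shutdown();
  }
}
```

In this sketch no thread waits on a downstream reply: each stage only
schedules the next one, which is the property the design above relies
on.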

IV. Advantages of the Asynchronous Router

- High-Concurrency Performance: Through the asynchronous processing
  mechanism, the asynchronous router can handle a large number of
  requests simultaneously, significantly improving the system's
  concurrent processing ability.
- High Resource Utilization: It avoids thread blocking and frequent
  context switching, reduces thread resource waste, and improves the
  overall efficiency of the system.
- Isolation: Different ns are processed by different async handler
  thread pools, which isolates the downstream services from each other.
  Even if one service performs poorly, it does not affect the
  processing ability of the other services.
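
The isolation point can be sketched in Java as one executor per
nameservice. This is an illustration under assumptions, not the
implementation from HDFS-17651: NsIsolationSketch, poolFor, submit,
and the nameservice names are hypothetical, and a latch is used only
to simulate a slow nameservice that keeps its own pool busy.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Illustrative only: every name here is hypothetical, not a Hadoop API. */
public class NsIsolationSketch {
  private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
  private final int threadsPerNs;

  NsIsolationSketch(int threadsPerNs) {
    this.threadsPerNs = threadsPerNs;
  }

  /** Lazily creates a dedicated pool for each nameservice. */
  ExecutorService poolFor(String ns) {
    return pools.computeIfAbsent(
        ns, key -> Executors.newFixedThreadPool(threadsPerNs));
  }

  /** Runs the call on the pool that belongs to the given nameservice. */
  CompletableFuture<String> submit(String ns, Supplier<String> call) {
    return CompletableFuture.supplyAsync(call, poolFor(ns));
  }

  void shutdown() {
    pools.values().forEach(ExecutorService::shutdown);
  }

  /** A stuck "ns-slow" does not delay a request sent to "ns-fast". */
  static String demo() {
    NsIsolationSketch router = new NsIsolationSketch(1);
    CountDownLatch slowNsGate = new CountDownLatch(1);

    // Occupies the only thread of the ns-slow pool until the gate opens.
    CompletableFuture<String> slow = router.submit("ns-slow", () -> {
      try {
        slowNsGate.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "slow-ok";
    });
    CompletableFuture<String> fast = router.submit("ns-fast", () -> "fast-ok");

    String fastReply = fast.join();         // completes on its own pool
    boolean slowPending = !slow.isDone();   // gate is still closed here
    slowNsGate.countDown();
    String slowReply = slow.join();
    router.shutdown();
    return fastReply + " (slow pending: " + slowPending + "), then " + slowReply;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

With a single shared pool of one thread, the fast request would have
queued behind the slow one; separate pools are what keep it
unaffected in this sketch.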

V. Summary

The asynchronous router solves the performance bottleneck of the
traditional synchronous router in high-concurrency scenarios by
introducing an asynchronous processing mechanism. It not only improves
the system's concurrency and resource utilization but also achieves
isolation of downstream services through the queue mechanism, enhancing
the system's stability and adaptability. In federated scenarios where
requests must be forwarded to multiple downstream services, the
asynchronous router is a more efficient and reliable solution.

VI. Performance Testing

https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1

VII. JIRA & PRs

For more information, please refer to the JIRA:
RBF: Asynchronous router RPC:
https://issues.apache.org/jira/browse/HDFS-17531
PRs:
HDFS-17543. [ARR] AsyncUtil makes asynchronous code more concise and easier.
HADOOP-19235. IPC client uses CompletableFuture to support asynchronous operations.
HDFS-17544. [ARR] The router client rpc protocol PB supports asynchrony.
HDFS-17545. [ARR] router async rpc client.
HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol supports asynchronous rpc.
HDFS-17659. [ARR] Router Quota supports asynchronous rpc.
HDFS-17672. [ARR] Move asynchronous related classes to the async package.
HADOOP-19361. RPC DeferredMetrics bugfix.
HDFS-17640. [ARR] RouterClientProtocol supports asynchronous rpc.
HDFS-17650. [ARR] The router server-side rpc protocol PB supports asynchrony.
HDFS-17651. [ARR] Async handler executor isolation.
HDFS-17715. [ARR] Add documentation for asynchronous router.


---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

