Got your idea. Thank you!
- How about moving the unfinished tasks out and placing them under a new
task as subtasks, for example "ARR improvements"? If this feature is marked
as completed while some tasks are still open, it looks strange.
- Will it take a long time to add the documentation? The discussion may
last for several days. If writing it takes a long time, I think it may
block the trunk release, and all community members would need to remember
that the documentation is still required. That doesn't look good. That's my
thought, and we can wait for others' opinions.

jian zhang <keeprom...@apache.org> wrote on Wed, Jan 22, 2025 at 16:53:

> Hi, Hui Fei,
>     - The remaining 3 subtasks are not related to the core functions of
> the asynchronous router and have little impact on the trunk branch. We
> can wait until HDFS-17531 is merged into trunk and then submit the
> remaining PRs directly to trunk.
>     - It is indeed necessary to add documentation to
> "HDFSRouterFederation.md". How about submitting a PR for this after
> merging HDFS-17531 into the trunk branch?
>
> Best Regards,
> Jian Zhang
>
> Hui Fei <feihui.u...@gmail.com> wrote on Wed, Jan 22, 2025 at 16:24:
>
>> Thanks for your great work, looking forward to this feature.
>>
>> Some comments from me.
>>  - I checked and found that there are still 3 subtasks open under this
>> feature's JIRA ticket. Do they need to be resolved?
>>  - I didn't find any documentation for this feature. It's a key feature.
>> Is it necessary to add documentation to HDFSRouterFederation.md?
>>
>> jian zhang <zjkeeprom...@gmail.com> wrote on Wed, Jan 22, 2025 at 10:29:
>>
>>> Hi, all, the development of the asynchronous router functionality has
>>> been completed. The development branch is HDFS-17531, and it is ready to be
>>> merged into the trunk branch.
>>>
>>> JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
>>> PR: https://github.com/apache/hadoop/pull/7308
>>>
>>> Here is the functionality introduction of the asynchronous router for
>>> everyone to review:
>>> I. Overview
>>>
>>>     The asynchronous router aims to address the performance bottlenecks
>>> of the synchronous router in high-concurrency, multi-nameservice
>>> scenarios. By introducing an asynchronous processing mechanism, it
>>> optimizes request handling and improves the system's concurrency and
>>> resource utilization. It is particularly suitable for federated
>>> scenarios where multiple downstream services (NS) need to be served.
>>>
>>> II. Problems of the Synchronous Router
>>>
>>>     - Performance Bottleneck: The performance of the synchronous router
>>> is limited by the number of handler threads. Even if the connection thread
>>> can still forward requests to the downstream namenode, each handler must
>>> wait for its current request to complete before processing the next one,
>>> which limits processing capacity.
>>>     - Thread Resource Waste: Increasing the number of handler threads to
>>> improve performance leads to more thread switches, which in turn reduces
>>> system efficiency. At the same time, a large number of handler threads
>>> sit in a blocked state, wasting thread resources.
>>>     - Poor Isolation Across Nameservices: If one downstream nameservice
>>> performs poorly, handlers wait on it for a long time. This delays the
>>> forwarding of requests to the other, normally performing nameservices
>>> and lowers the overall downstream performance perceived by the client.
>>>     - Ineffective Use of Federated Multi-NS Capacity: In high-concurrency
>>> scenarios, a large number of requests may be backlogged in the router's
>>> request queue while the queues of downstream services are not fully
>>> utilized, so resources are allocated poorly.
>>>
>>> III. Design and Improvements of the Asynchronous Router
>>>
>>>     The asynchronous router solves the above problems by redesigning the
>>> request handling process and introducing an asynchronous processing
>>> mechanism. Its core improvements include:
>>>
>>>     - Handler: Retrieves requests from the request queue for preliminary
>>> processing. If the request fails this step (for example, the mount point
>>> does not exist), it puts the response directly into the response queue;
>>> otherwise, it sends the request to the asynchronous handler thread pool.
>>>     - Async Handler: Puts the request into the call queue
>>> (connection.calls) of the connection thread and returns immediately
>>> without blocking to wait for the result.
>>>     - Async Responder: Processes the responses received by the connection
>>> thread. If the request needs to be retried (for example, the downstream
>>> service returns a standby exception), it re-adds the request to the
>>> asynchronous handler thread pool; otherwise, it puts the response into
>>> the response queue.
>>>     - Responder: Retrieves the response from the response queue and
>>> returns it to the client.
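
To make the four stages above concrete, here is a minimal Java sketch of how
such a non-blocking flow could be wired with CompletableFuture. It is only an
illustration under my own assumptions, not the code in HDFS-17531: the class
name, the "/mnt/" mount check, the "STANDBY" marker, the retry limit, and the
pool sizes are all hypothetical, and the downstream call is simulated inside
the pool instead of being queued on a real connection thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of the handler -> async handler -> async responder flow. */
public class AsyncRouterFlowSketch {
  private static final int MAX_RETRIES = 3; // assumed limit, for illustration only

  // Daemon threads so the sketch never keeps the JVM alive.
  private static final ThreadFactory DAEMON = r -> {
    Thread t = new Thread(r);
    t.setDaemon(true);
    return t;
  };

  // Stand-ins for the asynchronous handler and async responder thread pools.
  private final ExecutorService asyncHandlerPool = Executors.newFixedThreadPool(2, DAEMON);
  private final ExecutorService asyncResponderPool = Executors.newFixedThreadPool(2, DAEMON);
  private final AtomicInteger downstreamCalls = new AtomicInteger();

  /** Handler stage: preliminary check, then hand off without blocking. */
  public CompletableFuture<String> handle(String path) {
    if (!path.startsWith("/mnt/")) {
      // Assumed mount check failed: respond directly, no downstream call.
      return CompletableFuture.completedFuture("ERROR: no mount point for " + path);
    }
    return submit(path, 0);
  }

  /** Async handler stage issues the call; async responder stage inspects the result. */
  private CompletableFuture<String> submit(String path, int attempt) {
    return CompletableFuture
        .supplyAsync(() -> callDownstream(path, attempt), asyncHandlerPool)
        .thenComposeAsync(response -> {
          if (response.equals("STANDBY") && attempt < MAX_RETRIES) {
            // Retry: re-add the request to the async handler pool.
            return submit(path, attempt + 1);
          }
          // Final response: this future plays the role of the response queue.
          return CompletableFuture.completedFuture(response);
        }, asyncResponderPool);
  }

  /** Simulated downstream namenode: the first attempt hits a standby node. */
  private String callDownstream(String path, int attempt) {
    downstreamCalls.incrementAndGet();
    return attempt == 0 ? "STANDBY" : "OK: " + path;
  }

  public int downstreamCalls() {
    return downstreamCalls.get();
  }

  public static void main(String[] args) {
    AsyncRouterFlowSketch router = new AsyncRouterFlowSketch();
    System.out.println(router.handle("/mnt/data/file").join());
    System.out.println(router.handle("/unknown").join());
  }
}
```

In this sketch the only place a thread waits is the join() in main, which
stands in for the Responder returning the result to the client. Neither pool
ever blocks on a downstream result.
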
>>>
>>> IV. Advantages of the Asynchronous Router
>>>
>>>     - High-Concurrency Performance: Through the asynchronous
>>> processing mechanism, the asynchronous router can handle a large number of
>>> requests simultaneously, significantly improving the system's concurrent
>>> processing ability.
>>>     - High Resource Utilization: It avoids thread blocking and frequent
>>> switching, reduces thread resource waste, and improves the overall
>>> efficiency of the system.
>>>     - Isolation: Different nameservices are handled by different async
>>> handler thread pools, which isolates the downstream services from each
>>> other. Even if one service performs poorly, it does not affect the
>>> processing of requests to the other services.
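
The isolation point could be sketched roughly as one executor per
nameservice. Again, this is a simplified illustration and not the actual
implementation (see HDFS-17651 for that): the class and method names, the
fixed pool size, and the thread naming are assumptions of mine.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical sketch: one async handler pool per nameservice. */
public class NsIsolationSketch {
  private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
  private final int threadsPerNs; // assumed fixed size per nameservice

  public NsIsolationSketch(int threadsPerNs) {
    this.threadsPerNs = threadsPerNs;
  }

  /** Lazily creates a dedicated pool the first time a nameservice is seen. */
  private ExecutorService poolFor(String ns) {
    return pools.computeIfAbsent(ns, key -> Executors.newFixedThreadPool(threadsPerNs, r -> {
      Thread t = new Thread(r, "async-handler-" + key);
      t.setDaemon(true); // do not keep the JVM alive in this sketch
      return t;
    }));
  }

  /** Runs the call on the pool that belongs to the target nameservice. */
  public <T> CompletableFuture<T> submit(String ns, Supplier<T> call) {
    return CompletableFuture.supplyAsync(call, poolFor(ns));
  }

  public int poolCount() {
    return pools.size();
  }
}
```

With this layout, a call that hangs on "ns-slow" occupies only that
nameservice's threads, so a request submitted for "ns-fast" still completes
on its own pool.
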
>>>
>>> V. Summary
>>>
>>>     The asynchronous router solves the performance bottleneck of the
>>> traditional synchronous router in high-concurrency scenarios by
>>> introducing an asynchronous processing mechanism. It not only improves the
>>> system's concurrency and resource utilization but also isolates
>>> downstream services through the queue mechanism, enhancing the system's
>>> stability and adaptability. In federated scenarios where multiple
>>> downstream services need to be served, the asynchronous router is a more
>>> efficient and reliable solution.
>>>
>>> VI. Performance Testing
>>>
>>> https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1
>>>
>>> VII. JIRA & PRs
>>>
>>>     For more information, please refer to JIRA:
>>>     JIRA: RBF: Asynchronous router RPC:
>>> https://issues.apache.org/jira/browse/HDFS-17531
>>>     PRs:
>>>     HDFS-17543. [ARR] AsyncUtil makes asynchronous code more concise and
>>> easier.
>>>     HADOOP-19235. IPC client uses CompletableFuture to support
>>> asynchronous operations.
>>>     HDFS-17544. [ARR] The router client rpc protocol PB supports
>>> asynchrony.
>>>     HDFS-17545. [ARR] router async rpc client.
>>>     HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
>>>     HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
>>>     HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
>>>     HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
>>>     HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
>>>     HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol
>>> supports asynchronous rpc.
>>>     HDFS-17659. [ARR] Router Quota supports asynchronous rpc.
>>>     HDFS-17672. [ARR] Move asynchronous related classes to the async
>>> package.
>>>     HADOOP-19361. RPC DeferredMetrics bugfix.
>>>     HDFS-17640. [ARR] RouterClientProtocol supports asynchronous rpc.
>>>     HDFS-17650. [ARR] The router server-side rpc protocol PB supports
>>> asynchrony.
>>>     HDFS-17651. [ARR] Async handler executor isolation.