Great. +1 from my side. Thanks.

Best Regards,
- He Xiaoqiao

On Thu, Feb 13, 2025 at 10:15 AM jian zhang <keeprom...@apache.org> wrote:

> Hi, He Xiaoqiao
>
> I have rebased HDFS-17531 again and resolved the conflicts. The current
> pipeline failure is unrelated to the ARR feature and was introduced by
> slfan1989's PR: HADOOP-19415. [JDK17] Upgrade JUnit from 4 to 5 in
> hadoop-common Part 1. (#7339). slfan1989 will fix it later.
>
> Best Regards,
> - Jian Zhang
>
> On Tue, Feb 11, 2025 at 11:15 AM Xiaoqiao He <hexiaoq...@apache.org> wrote:
>
> > Hi Jian Zhang, Thanks for your great work. Please fix the conflicts first;
> > everything else makes sense to me.
> > I will give my +1 once it is ready.
> > One more thing: before check-in, we need to launch a separate official vote
> > thread. Good luck.
> >
> > BTW, Happy lunar new year!
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > On Thu, Feb 6, 2025 at 5:30 PM jian zhang <keeprom...@apache.org> wrote:
> >
> >> Hi, all,
> >> This feature has now been developed and has passed the pipeline. Please
> >> continue to help review it.
> >>
> >> Best Regards,
> >> Jian Zhang
> >>
> >> On Wed, Jan 22, 2025 at 6:22 PM Zhanghaobo <hfutzhan...@163.com> wrote:
> >>
> >>> @Hui Fei  Hi, Sir:
> >>>   Regarding your first point, I have created an umbrella JIRA,
> >>> https://issues.apache.org/jira/browse/HDFS-17716,
> >>> and moved the non-core JIRAs under it.
> >>>
> >>> Best Wishes
> >>> Haobo Zhang
> >>>
> >>> ---- Replied Message ----
> >>> From: Hui Fei <feihui.u...@gmail.com>
> >>> Date: 01/22/2025 17:37
> >>> To: jian zhang <keeprom...@apache.org>
> >>> Cc: Hdfs-dev <hdfs-dev@hadoop.apache.org>, <priv...@hadoop.apache.org>,
> >>>     Xiaoqiao He <hexiaoq...@apache.org>, <common-...@hadoop.apache.org>
> >>> Subject: Re: [DISCUSS] Request to merge branch HDFS-17531 into trunk.
> >>>
> >>> Got your idea. Thank you!
> >>> - How about removing the unfinished tasks and placing them under a new
> >>> task as subtasks, for example "ARR improvements"? If this feature is
> >>> marked complete while some tasks are still open, it looks strange.
> >>> - Will it take a long time to add documentation? The discussion may last
> >>> for several days. If the documentation takes a long time, I think it may
> >>> block the trunk release, and all community members would need to
> >>> remember that documentation is still required. That doesn't look good.
> >>> That's my thought; we can wait for others' opinions.
> >>>
> >>> On Wed, Jan 22, 2025 at 4:53 PM jian zhang <keeprom...@apache.org> wrote:
> >>>
> >>> Hi, Hui Fei,
> >>> - The remaining 3 subtasks are not related to the core functions of
> >>> the asynchronous router and have little impact on the trunk branch.
> >>> We can wait until HDFS-17531 is merged into trunk and then submit the
> >>> remaining PRs directly to trunk.
> >>> - It is indeed necessary to add documentation to
> >>> "HDFSRouterFederation.md". How about submitting a PR for this after
> >>> merging HDFS-17531 into the trunk branch?
> >>>
> >>> Best Regards,
> >>> Jian Zhang
> >>>
> >>> On Wed, Jan 22, 2025 at 4:24 PM Hui Fei <feihui.u...@gmail.com> wrote:
> >>>
> >>> Thanks for your great work, looking forward to this feature.
> >>>
> >>> Some comments from me.
> >>> - I checked and found that there are still 3 subtasks open under this
> >>> feature's JIRA ticket. Do they need to be resolved before the merge?
> >>> - I didn't find documentation for this feature. It's a key feature;
> >>> should documentation be added to HDFSRouterFederation.md?
> >>>
> >>> On Wed, Jan 22, 2025 at 10:29 AM jian zhang <zjkeeprom...@gmail.com> wrote:
> >>>
> >>> Hi, all, the development of the asynchronous router functionality has
> >>> been completed. The development branch is HDFS-17531, and it is ready
> >>> to be merged into the trunk branch.
> >>>
> >>> JIRA: HDFS-17531 https://issues.apache.org/jira/browse/HDFS-17531
> >>> PR: https://github.com/apache/hadoop/pull/7308
> >>>
> >>> Here is the functionality introduction of the asynchronous router for
> >>> everyone to review:
> >>> I. Overview
> >>>
> >>> The asynchronous router aims to address the performance bottleneck
> >>> issues of the synchronous router in high-concurrency and
> >>> multi-nameservice scenarios. By introducing an asynchronous processing
> >>> mechanism, it optimizes the request handling process, improves the
> >>> system's concurrency and resource utilization, and is particularly
> >>> suitable for federated scenarios where multiple downstream services
> >>> (NS) need to be processed.
> >>>
> >>> II. Problems of the Synchronous Router
> >>>
> >>> - Performance Bottleneck: The throughput of the synchronous router
> >>> is limited by the number of handler threads. Even when the connection
> >>> thread could still forward requests to the downstream namenode, each
> >>> handler must wait for its current request to complete before
> >>> processing the next one, which caps processing capacity.
> >>> - Thread Resource Waste: Increasing the number of handler threads to
> >>> improve performance leads to more thread context switches, which
> >>> reduces system efficiency instead. Meanwhile, a large number of
> >>> handler threads sit in a blocked state, wasting thread resources.
> >>> - Poor Isolation Across Nameservices: If one downstream nameservice
> >>> performs poorly, handlers wait on it for a long time. This delays
> >>> the forwarding of requests to other, healthy nameservices, so clients
> >>> perceive a drop in the overall performance of the downstream
> >>> nameservices.
> >>> - Ineffective Use of Multi-Nameservice Capacity in Federation: In
> >>> high-concurrency scenarios, a large number of requests may back up
> >>> in the router's request queue while the queues of downstream services
> >>> are not fully utilized, so resources are poorly allocated.
> >>>
> >>> III. Design and Improvements of the Asynchronous Router
> >>>
> >>> The asynchronous router solves the above problems by redesigning the
> >>> request handling process and introducing an asynchronous processing
> >>> mechanism. Its core improvements include:
> >>>
> >>> - Handler: Retrieves requests from the request queue for preliminary
> >>> processing. If the request fails this step (for example, the mount
> >>> point does not exist), the handler puts the response directly into
> >>> the response queue; otherwise, it passes the request to the
> >>> asynchronous handler thread pool.
> >>> - Async Handler: Puts the request into the call queue
> >>> (connection.calls) of the connection thread and returns immediately
> >>> without blocking.
> >>> - Async Responder: Processes the responses received by the connection
> >>> thread. If the request needs to be retried (for example, the
> >>> downstream service returns a standby exception), it adds the request
> >>> back to the asynchronous handler thread pool; otherwise, it puts the
> >>> response into the response queue.
> >>> - Responder: Retrieves the response from the response queue and
> >>> returns it to the client.
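
The four stages above can be sketched roughly as follows. This is only a minimal illustration in plain java.util.concurrent, not the code in HDFS-17531: the class AsyncRouterSketch, the "/mnt/" mount check, the STANDBY marker string, and the simulated downstream call are all hypothetical stand-ins.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the handler / async handler / async responder flow.
final class AsyncRouterSketch {
    static final String STANDBY = "STANDBY"; // assumed marker for a standby reply

    final BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>();
    final ExecutorService asyncHandlers = Executors.newFixedThreadPool(2);
    final ExecutorService asyncResponders = Executors.newFixedThreadPool(2);
    final ExecutorService connection = Executors.newSingleThreadExecutor();
    final AtomicInteger downstreamCalls = new AtomicInteger();

    // Simulated connection thread: the very first call answers STANDBY,
    // every later call succeeds.
    CompletableFuture<String> sendDownstream(String path) {
        return CompletableFuture.supplyAsync(
            () -> downstreamCalls.incrementAndGet() == 1 ? STANDBY : "ok:" + path,
            connection);
    }

    // Handler: preliminary check, then hand off without waiting for the reply.
    void handle(String path) {
        if (!path.startsWith("/mnt/")) {
            responseQueue.add("error:no mount point for " + path);
            return;
        }
        asyncHandlers.execute(() -> asyncHandle(path));
    }

    // Async handler: enqueue the call and return immediately; the reply is
    // processed later on the async responder pool.
    void asyncHandle(String path) {
        sendDownstream(path)
            .thenAcceptAsync(reply -> asyncRespond(path, reply), asyncResponders);
    }

    // Async responder: retry on a standby reply, otherwise queue the response.
    void asyncRespond(String path, String reply) {
        if (STANDBY.equals(reply)) {
            asyncHandlers.execute(() -> asyncHandle(path));
        } else {
            responseQueue.add(reply);
        }
    }

    // Responder side: take the next response (null on timeout or interrupt).
    String nextResponse() {
        try {
            return responseQueue.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    void shutdown() {
        asyncHandlers.shutdown();
        asyncResponders.shutdown();
        connection.shutdown();
    }

    public static void main(String[] args) {
        AsyncRouterSketch router = new AsyncRouterSketch();
        router.handle("/unknown/file"); // rejected by the handler itself
        System.out.println(router.nextResponse());
        router.handle("/mnt/ns1/file"); // retried once after the standby reply
        System.out.println(router.nextResponse());
        router.shutdown();
    }
}
```

In this sketch no thread ever blocks on a downstream reply: the handler and async handler return as soon as the work is handed off, and the retry path simply re-submits the request to the async handler pool.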
> >>>
> >>> IV. Advantages of the Asynchronous Router
> >>>
> >>> - High-Concurrency Performance: Through the asynchronous processing
> >>> mechanism, the asynchronous router can handle a large number of
> >>> requests simultaneously, significantly improving the system's
> >>> concurrent processing capacity.
> >>> - High Resource Utilization: It avoids thread blocking and frequent
> >>> context switching, reduces wasted thread resources, and improves the
> >>> overall efficiency of the system.
> >>> - Isolation: Different nameservices are processed by different async
> >>> handler thread pools, which isolates downstream services from one
> >>> another. Even if one service performs poorly, it does not affect the
> >>> processing capacity of the others.
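
The isolation point can be illustrated with one executor per nameservice. This is a hedged sketch only, assuming a simple map from nameservice ID to a fixed-size pool; PerNameserviceExecutors, the pool size, and the IDs "ns-slow" and "ns-fast" are hypothetical and are not taken from HDFS-17651.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch: each nameservice gets its own small thread pool.
final class PerNameserviceExecutors {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerNs;

    PerNameserviceExecutors(int threadsPerNs) {
        this.threadsPerNs = threadsPerNs;
    }

    // Lazily create one pool per nameservice ID.
    ExecutorService poolFor(String nsId) {
        return pools.computeIfAbsent(
            nsId, id -> Executors.newFixedThreadPool(threadsPerNs));
    }

    // Run a call on the pool that belongs to the target nameservice.
    <T> CompletableFuture<T> submit(String nsId, Supplier<T> call) {
        return CompletableFuture.supplyAsync(call, poolFor(nsId));
    }

    void shutdown() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }

    // Helper for the demo: wait on a latch without a checked exception.
    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        PerNameserviceExecutors executors = new PerNameserviceExecutors(1);
        CountDownLatch release = new CountDownLatch(1);

        // "ns-slow" occupies its only thread until the latch is released.
        CompletableFuture<String> slow = executors.submit("ns-slow", () -> {
            awaitQuietly(release);
            return "slow done";
        });
        // "ns-fast" uses a different pool, so it completes regardless.
        CompletableFuture<String> fast =
            executors.submit("ns-fast", () -> "fast done");

        System.out.println(fast.join());
        System.out.println("slow finished early? " + slow.isDone());
        release.countDown();
        System.out.println(slow.join());
        executors.shutdown();
    }
}
```

With a single shared pool of one thread, the fast call would queue behind the stalled one; with separate pools, the stalled nameservice only exhausts its own threads.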
> >>>
> >>> V. Summary
> >>>
> >>> The asynchronous router solves the performance bottleneck of the
> >>> traditional synchronous router in high-concurrency scenarios by
> >>> introducing an asynchronous processing mechanism. It not only improves
> >>> the system's concurrency and resource utilization but also isolates
> >>> downstream services through the queue mechanism, enhancing the
> >>> system's stability and adaptability. In federated scenarios where
> >>> multiple downstream services need to be processed, the asynchronous
> >>> router is a more efficient and reliable solution.
> >>>
> >>> VI. Performance Testing
> >>>
> >>> https://docs.google.com/document/d/1meHOCvhm3XRHlIMwvKFidfUSjveTJrb8yAMasrM_HrY/edit?tab=t.0#heading=h.du0zlo2k5sb1
> >>>
> >>> VII. JIRA & PRs
> >>>
> >>> For more information, please refer to the JIRA:
> >>> RBF: Asynchronous router RPC:
> >>> https://issues.apache.org/jira/browse/HDFS-17531
> >>> PRs:
> >>> HDFS-17543. [ARR] AsyncUtil makes asynchronous code more concise and
> >>> easier.
> >>> HADOOP-19235. IPC client uses CompletableFuture to support
> >>> asynchronous operations.
> >>> HDFS-17544. [ARR] The router client rpc protocol PB supports
> >>> asynchrony.
> >>> HDFS-17545. [ARR] router async rpc client.
> >>> HDFS-17594. [ARR] RouterCacheAdmin supports asynchronous rpc.
> >>> HDFS-17597. [ARR] RouterSnapshot supports asynchronous rpc.
> >>> HDFS-17595. [ARR] ErasureCoding supports asynchronous rpc.
> >>> HDFS-17601. [ARR] RouterRpcServer supports asynchronous rpc.
> >>> HDFS-17596. [ARR] RouterStoragePolicy supports asynchronous rpc.
> >>> HDFS-17656. [ARR] RouterNamenodeProtocol and RouterUserProtocol
> >>> supports asynchronous rpc.
> >>> HDFS-17659. [ARR]Router Quota supports asynchronous rpc.
> >>> HDFS-17672. [ARR] Move asynchronous related classes to the async
> >>> package.
> >>> HADOOP-19361. RPC DeferredMetrics bugfix.
> >>> HDFS-17640.[ARR] RouterClientProtocol supports asynchronous rpc.
> >>> HDFS-17650. [ARR] The router server-side rpc protocol PB supports
> >>> asynchrony.
> >>> HDFS-17651.[ARR] Async handler executor isolation.
