Re: [Discuss] RBF: Asynchronous router RPC.

Yuanbo Liu Thu, 23 May 2024 23:15:30 -0700

good job!

On Fri, May 24, 2024 at 1:57 AM zhangjian <1361320...@qq.com> wrote:


> Hello everyone, I have now tested the performance of the async and sync
> routers with a downstream ns:
> 1. The async router performs better than the sync router on throughput,
> CPU, and threads, and its memory usage is within an acceptable range
> compared to the sync router.
> 2. The async router can put more pressure on the downstream ns to make
> better use of its performance, and can almost fill its call queue.
>
> The test result PDF is too large to send by email; please see:
> https://issues.apache.org/jira/browse/HDFS-17531
>
> > On May 23, 2024, at 17:03, Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >
> > Great. Thanks for your addendum information.
> >
> > cc @Ayush Saxena <ayush...@gmail.com> @inigo...@apache.org
> > <inigo...@apache.org> Any more feedback for this proposal?
> >
> > IMO the asynchronous router RPC feature is a helpful improvement. In my
> > internal practice, it will significantly improve the throughput of
> > request forwarding, and it is very valuable to push it forward.
> > Thanks again and good luck!
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > On Wed, May 22, 2024 at 9:59 AM zhangjian <1361320...@qq.com> wrote:
> >
> >> Hi, Sangjin Lee, thank you for your attention. I will use my free time
> >> to do a performance comparison soon.
> >>
> >>> On May 22, 2024, at 03:42, Sangjin Lee <sj...@apache.org> wrote:
> >>>
> >>> Thanks for the great proposal, Zhangjian. On point #3, I suspect it should
> >>> be fairly straightforward to create a small isolated synthetic test to
> >>> prove (or disprove) the benefits of this approach. By driving a controlled
> >>> amount of requests per second, you could see latency, memory, CPU, etc.
> >>> Ideally, it should show meaningful improvements without much degradation
> >>> in other metrics. Would you be able to spend some time doing that?
> >>>
> >>> Thanks,
> >>> Sangjin
> >>>
> >>> On Tue, May 21, 2024 at 5:13 AM zhangjian <1361320...@qq.com.invalid>
> >>> wrote:
> >>>
> >>>> Hi, xiaoqiao he, thank you for your reply.
> >>>>
> >>>> 1. Currently, the server and client protocols within the router can be
> >>>> implemented by extending the existing protocols and adding asynchronous
> >>>> functionality, so the existing synchronous protocols are not affected.
> >>>> RouterClientNamenodeProtocolServerSideTranslatorPB
> >>>> RouterClientProtocolTranslatorPB
> >>>> RouterGetUserMappingsProtocolServerSideTranslatorPB
> >>>> RouterGetUserMappingsProtocolTranslatorPB
> >>>> RouterNamenodeProtocolServerSideTranslatorPB
> >>>> RouterNamenodeProtocolTranslatorPB
> >>>> RouterRefreshUserMappingsProtocolServerSideTranslatorPB
> >>>> RouterRefreshUserMappingsProtocolTranslatorPB
> >>>>
> >>>> The following issues implemented asynchronous callbacks for Rpc.server,
> >>>> but I have not found any other modules that use the related functions:
> >>>> Server: HADOOP-11552, HADOOP-17046
> >>>> The asynchronous Rpc.client implementation directly uses this issue:
> >>>> Client: HADOOP-13226
> >>>> Therefore, I believe the asynchronous router's modifications to the RPC
> >>>> protocol, RPC server, and client are safe.
> >>>>
> >>>> 2. When forwarding a request to multiple downstream ns, the synchronous
> >>>> router handler adds the requests for those ns to the thread pool
> >>>> (RouterRpcClient.executorService), and then waits for responses from all
> >>>> downstream ns before returning. Since threads in the thread pool also
> >>>> process rpc requests synchronously, like a handler, the number of
> >>>> threads in the thread pool directly affects the performance of
> >>>> invokeConcurrent, which in turn affects the performance of the handler.
> >>>>
> >>>> In the asynchronous router implementation, the handler calls
> >>>> invokeConcurrent only to convert one request into multiple requests and
> >>>> add them to the async handler thread pool; the handler can then process
> >>>> the next request in the call queue. When a connection thread of a
> >>>> downstream ns receives a response, it hands it over to an async response
> >>>> thread for processing. The async response thread checks whether all
> >>>> responses from the downstream ns have been received. If so, it continues
> >>>> processing the response; otherwise, it moves on to the next response.
> >>>> The asynchronous router uses CompletableFuture.allOf() to implement
> >>>> asynchronous invokeConcurrent, so the handler, async handler, async
> >>>> response, and connection threads still never need to wait synchronously.
> >>>>
> >>>> In addition, synchronous routers have drawbacks not only in multi-ns
> >>>> environments but also with a single downstream ns: it is often difficult
> >>>> to decide how many handlers to set for the router. Setting too many
> >>>> wastes thread resources, and setting too few cannot put enough pressure
> >>>> on the downstream ns. Asynchronous routers can push requests to
> >>>> downstream ns without having to consider how to set handlers.
> >>>> Asynchronous routers can also better connect to more downstream storage
> >>>> services that support the HDFS protocol, with better scalability.
> >>>>
> >>>> 3. Since I have not yet deployed asynchronous routers to our own cluster,
> >>>> there is no performance comparison yet. In theory, I expect asynchronous
> >>>> routers to occupy more memory than synchronous routers, but not much
> >>>> more, especially since we can control the maximum number of requests
> >>>> entering the router, and CompletableFuture is stable and widely used. In
> >>>> other respects, it should be far superior to synchronous routers,
> >>>> especially in scenarios with more downstream ns. If anyone is
> >>>> interested, you are welcome to help with a performance comparison.
> >>>>
> >>>>> On May 21, 2024, at 11:39, Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >>>>>
> >>>>> Thanks for this great proposal!
> >>>>>
> >>>>> Some questions after reviewing the design doc (sorry, I didn't review
> >>>>> the PR carefully because it is too large):
> >>>>> 1. This solution involves an RPC framework update. Will it affect other
> >>>>> modules, and how do we keep other modules unaffected by these changes?
> >>>>> 2. Some RPC requests should be forwarded concurrently to all downstream
> >>>>> NS. Does this solution cover that case?
> >>>>> 3. Since there is an initial implementation, did you collect any
> >>>>> benchmarks against the current synchronous model of DFSRouter?
> >>>>> Thanks again.
> >>>>>
> >>>>> Best Regards,
> >>>>> - He Xiaoqiao
> >>>>>
> >>>>> On Tue, May 21, 2024 at 11:21 AM zhangjian <1361320...@qq.com.invalid>
> >>>>> wrote:
> >>>>>
> >>>>>> Thank you for your positive attitude towards this feature. You can
> >>>>>> debug the UTs provided in the PR to better understand the current
> >>>>>> asynchronous calling function.
> >>>>>>
> >>>>>>> On May 21, 2024, at 02:04, Simbarashe Dzinamarira <simbadz...@apache.org> wrote:
> >>>>>>>
> >>>>>>> Excited to see this feature as well. I'll spend more time understanding
> >>>>>>> the proposal and implementation.
> >>>>>>>
> >>>>>>> On Mon, May 20, 2024 at 7:55 AM zhangjian <1361320...@qq.com.invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi, Yuanbo Liu, thank you for your interest in this feature. I think
> >>>>>>>> the difficulty of an asynchronous router lies not only in implementing
> >>>>>>>> the asynchronous functions, but also in keeping the code readable and
> >>>>>>>> reusable, so as to facilitate development by the community. At first
> >>>>>>>> I also planned to use the virtual threads you mentioned. Virtual
> >>>>>>>> threads can achieve asynchrony elegantly at the code level, but the
> >>>>>>>> biggest problem is that it is not easy to upgrade the JDK version,
> >>>>>>>> whether in the community or in an actual production environment.
> >>>>>>>> Therefore, I later used CompletableFuture, which is supported by
> >>>>>>>> JDK 8, to achieve asynchrony. The router is stateless, and the router
> >>>>>>>> rpc process is very clear. So even though CompletableFuture itself is
> >>>>>>>> not as readable as virtual threads, if we design it well, we can make
> >>>>>>>> the asynchronous process look very clear.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On May 20, 2024, at 10:56, Yuanbo Liu <liuyuanb...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Nice to see this feature brought up. I tried to implement this feature
> >>>>>>>>> in our internal clusters, and I know that it's a very complicated
> >>>>>>>>> feature. CC'ing hdfs-dev to bring in more discussion.
> >>>>>>>>> By the way, I'm not sure whether the virtual threads of newer JDK
> >>>>>>>>> versions would help in this case.
> >>>>>>>>>
> >>>>>>>>> On Mon, May 20, 2024 at 10:10 AM zhangjian <1361320...@qq.com.invalid>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello everyone, the RPC of the HDFS router currently has some
> >>>>>>>>>> shortcomings:
> >>>>>>>>>>
> >>>>>>>>>> Currently the router's handler thread is synchronous. When the
> >>>>>>>>>> *handler* thread adds a call to connection.calls, it needs to wait
> >>>>>>>>>> until the *connection* notifies it that the call is complete. Only
> >>>>>>>>>> after the response is put into the response queue can a new call be
> >>>>>>>>>> taken from the call queue and processed. Therefore, the concurrency
> >>>>>>>>>> of the router is limited by the number of handlers. A simple example:
> >>>>>>>>>> if the number of handlers is 1 and the maximum number of calls in the
> >>>>>>>>>> connection thread is 10, then even though the connection thread can
> >>>>>>>>>> send 10 requests to the downstream ns, the router can only process
> >>>>>>>>>> one request after another because there is only 1 handler.
> >>>>>>>>>>
> >>>>>>>>>> Since the performance of router rpc is mainly limited by the number
> >>>>>>>>>> of handlers, the most effective way to improve rpc performance
> >>>>>>>>>> currently is to increase the number of handlers. However, letting the
> >>>>>>>>>> router create a large number of handler threads also increases the
> >>>>>>>>>> number of thread switches and cannot make full use of the machine.
> >>>>>>>>>>
> >>>>>>>>>> There are usually multiple ns downstream of the router. If a handler
> >>>>>>>>>> forwards a request to an ns with poor performance, the handler will
> >>>>>>>>>> wait for a long time. With fewer handlers available, the router's
> >>>>>>>>>> ability to handle requests for ns with normal performance is reduced.
> >>>>>>>>>> From the perspective of the client, the performance of the router's
> >>>>>>>>>> downstream ns appears to have deteriorated at this time. We often
> >>>>>>>>>> find that the call queue of the downstream ns is not high, but the
> >>>>>>>>>> call queue of the router is very high.
> >>>>>>>>>>
> >>>>>>>>>> Therefore, although the main function of the router is to federate
> >>>>>>>>>> and handle requests for multiple NSs, the current synchronous RPC
> >>>>>>>>>> performance cannot satisfy the scenario where there are many NSs
> >>>>>>>>>> downstream of the router. Even if the concurrency of the router can
> >>>>>>>>>> be improved by increasing the number of handlers, it is still
> >>>>>>>>>> relatively slow. More threads increase the CPU context switching
> >>>>>>>>>> time, and in fact many of the handler threads are in a blocked state,
> >>>>>>>>>> which is undoubtedly a waste of thread resources. When a request
> >>>>>>>>>> enters the router, there is no guarantee that a handler will be
> >>>>>>>>>> available at that moment.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Therefore, I propose asynchronous router rpc. Please see the issue
> >>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-17531 for the complete
> >>>>>>>>>> solution.
> >>>>>>>>>>
> >>>>>>>>>> You can also view this PR: https://github.com/apache/hadoop/pull/6838.
> >>>>>>>>>> It is just a demo, but it completes the core asynchronous RPC
> >>>>>>>>>> function. If you think asynchronous routing is feasible, we can
> >>>>>>>>>> consider splitting this PR later for easier review.
> >>>>>>>>>>
> >>>>>>>>>> The PDF is attached and can also be viewed through the issue.
> >>>>>>>>>>
> >>>>>>>>>> Everyone is welcome to join the discussion!
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >>>>>>>> For additional commands, e-mail:
> common-dev-h...@hadoop.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>>>>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
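
For readers of this thread: the non-blocking fan-out described in point 2 of
zhangjian's May 21 reply (one request converted into one request per
downstream ns, merged with CompletableFuture.allOf()) might look roughly like
the sketch below. This is not the code from the PR. The class
AsyncFanOutSketch, the method fanOut, and the `call` function that stands in
for an asynchronous RPC to one downstream ns are all hypothetical names chosen
for illustration, and the sketch only assumes JDK 8.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class AsyncFanOutSketch {

  /**
   * Sends one request to every namespace and returns immediately.
   * The returned future completes once every downstream ns has answered.
   */
  static CompletableFuture<Map<String, String>> fanOut(
      List<String> namespaces,
      Function<String, CompletableFuture<String>> call) {
    // Start all downstream calls; the calling thread never waits here.
    Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
    for (String ns : namespaces) {
      pending.put(ns, call.apply(ns));
    }
    // allOf() completes only when every per-ns future has completed.
    CompletableFuture<Void> all = CompletableFuture.allOf(
        pending.values().toArray(new CompletableFuture<?>[0]));
    // The merge step runs on the thread that completes the last future
    // (or on the caller if everything is already complete).
    return all.thenApply(ignored -> {
      Map<String, String> results = new LinkedHashMap<>();
      // Every future is done at this point, so join() does not block.
      pending.forEach((ns, future) -> results.put(ns, future.join()));
      return results;
    });
  }

  public static void main(String[] args) {
    // A small pool standing in for the downstream connection threads.
    ExecutorService connections = Executors.newFixedThreadPool(2);
    try {
      CompletableFuture<Map<String, String>> merged = fanOut(
          Arrays.asList("ns0", "ns1", "ns2"),
          ns -> CompletableFuture.supplyAsync(
              () -> "response from " + ns, connections));
      // A real handler would move on to the next call in the queue here;
      // join() is used only so that the demo can print the merged result.
      System.out.println(merged.join());
    } finally {
      connections.shutdown();
    }
  }
}
```

In this sketch the caller of fanOut plays the role of the handler: it gets a
future back right away, and the merged result is produced by whichever thread
delivers the last downstream response. If any downstream call fails, the
merged future completes exceptionally and the merge step is skipped.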
