Good job!

On Fri, May 24, 2024 at 1:57 AM zhangjian <1361320...@qq.com> wrote:
> Hello everyone,
>
> I have now tested the performance of the async and sync routers with a
> downstream ns:
> 1. The throughput, CPU, and thread performance of the async router are
>    better than those of the sync router, and its memory performance is
>    within an acceptable range compared to the sync router.
> 2. The async router can apply pressure downstream to better utilize the
>    performance of the downstream ns, and can almost fill the call queue
>    of the downstream ns.
>
> The test result PDF is too large to send via email; please see:
> https://issues.apache.org/jira/browse/HDFS-17531
>
> > On May 23, 2024, at 17:03, Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >
> > Great. Thanks for the additional information.
> >
> > cc @Ayush Saxena <ayush...@gmail.com> @inigo...@apache.org
> > <inigo...@apache.org> Any more feedback on this proposal?
> >
> > IMO the asynchronous router RPC feature is a helpful improvement. In
> > my internal practice, it will significantly improve the throughput of
> > forwarded requests, and it is very valuable to push it forward.
> > Thanks again and good luck!
> >
> > Best Regards,
> > - He Xiaoqiao
> >
> > On Wed, May 22, 2024 at 9:59 AM zhangjian <1361320...@qq.com> wrote:
> >
> >> Hi, Sangjin Lee, thank you for your attention. I will use my free
> >> time to do a performance comparison soon.
> >>
> >>> On May 22, 2024, at 03:42, Sangjin Lee <sj...@apache.org> wrote:
> >>>
> >>> Thanks for the great proposal, Zhangjian. On point #3, I suspect it
> >>> should be fairly straightforward to create a small isolated
> >>> synthetic test to prove (or disprove) the benefits of this approach.
> >>> By driving a controlled amount of requests per second, you could see
> >>> latency, memory, CPU, etc. Ideally, it should show meaningful
> >>> improvements without much degradation in other metrics. Would you be
> >>> able to spend some time doing that?
> >>>
> >>> Thanks,
> >>> Sangjin
> >>>
> >>> On Tue, May 21, 2024 at 5:13 AM zhangjian <1361320...@qq.com.invalid>
> >>> wrote:
> >>>
> >>>> Hi, Xiaoqiao He, thank you for your reply.
> >>>>
> >>>> 1. Currently, the server and client protocols within the router can
> >>>> be implemented by extending the existing protocols and adding
> >>>> asynchronous functionality, so the existing synchronous protocols
> >>>> are not affected:
> >>>> RouterClientNamenodeProtocolServerSideTranslatorPB
> >>>> RouterClientProtocolTranslatorPB
> >>>> RouterGetUserMappingsProtocolServerSideTranslatorPB
> >>>> RouterGetUserMappingsProtocolTranslatorPB
> >>>> RouterNamenodeProtocolServerSideTranslatorPB
> >>>> RouterNamenodeProtocolTranslatorPB
> >>>> RouterRefreshUserMappingsProtocolServerSideTranslatorPB
> >>>> RouterRefreshUserMappingsProtocolTranslatorPB
> >>>>
> >>>> The following issues implemented asynchronous callbacks for
> >>>> Rpc.server, but I have not found any other modules that use the
> >>>> related functions:
> >>>> Server: HADOOP-11552, HADOOP-17046
> >>>> The asynchronous Rpc.client implementation uses this issue directly:
> >>>> Client: HADOOP-13226
> >>>> Therefore, I believe that the asynchronous router's modifications to
> >>>> the RPC protocol, RPC server, and client are safe.
> >>>>
> >>>> 2. When forwarding a request to multiple downstream ns, the
> >>>> synchronous router handler adds the requests for the downstream ns
> >>>> to the thread pool (RouterRpcClient.executorService), and then waits
> >>>> for responses from all downstream ns before returning. Since the
> >>>> threads in the thread pool also process RPC requests synchronously,
> >>>> similar to a handler, the number of threads in the thread pool
> >>>> directly affects the performance of invokeConcurrent, which in turn
> >>>> affects the performance of the handler.
> >>>> In the asynchronous router implementation, the handler calls
> >>>> invokeConcurrent only to convert a request into multiple requests
> >>>> and add them to the async handler thread pool; the handler can then
> >>>> process the next request in the call queue. When a connection thread
> >>>> of a downstream ns receives a response, it hands the response over
> >>>> to the async response thread for processing. The async response
> >>>> thread determines whether it has received all responses from the
> >>>> downstream ns. If it has, it continues to process the response.
> >>>> Otherwise, the async response thread processes the next response.
> >>>> The asynchronous router uses CompletableFuture.allOf() to implement
> >>>> asynchronous invokeConcurrent, so the handler, async handler, async
> >>>> response, and connection threads still never need to wait
> >>>> synchronously.
> >>>> In addition, synchronous routers have drawbacks not only in multi-ns
> >>>> environments but also with a single downstream ns: it is often
> >>>> difficult to decide how many handlers to set for the router. Setting
> >>>> too many wastes thread resources, and setting too few cannot put
> >>>> enough pressure on the downstream ns. Asynchronous routers can push
> >>>> requests to the downstream ns without considering how to set
> >>>> handlers. Asynchronous routers can also better connect to more
> >>>> downstream storage services that support the HDFS protocol, with
> >>>> better scalability.
> >>>>
> >>>> 3. Since I have not yet deployed asynchronous routers to our own
> >>>> cluster, there is no performance comparison. However, theoretically,
> >>>> I believe that asynchronous routers will occupy more memory than
> >>>> synchronous routers.
> >>>> However, I do not believe that they will occupy much more,
> >>>> especially since we can control the maximum number of requests
> >>>> entering the router, and CompletableFuture is stable and widely
> >>>> used. In other aspects, they should be far superior to synchronous
> >>>> routers, especially in scenarios with more downstream ns. If anyone
> >>>> is interested, you are welcome to help with a performance
> >>>> comparison.
> >>>>
> >>>>> On May 21, 2024, at 11:39, Xiaoqiao He <hexiaoq...@apache.org> wrote:
> >>>>>
> >>>>> Thanks for this great proposal!
> >>>>>
> >>>>> Some questions after reviewing the design doc (sorry, I didn't
> >>>>> review the PR carefully because it is too large):
> >>>>> 1. This solution involves an RPC framework update. Will it affect
> >>>>> other modules, and how do we keep other modules isolated from
> >>>>> these changes?
> >>>>> 2. Some RPC requests should be forwarded concurrently to all
> >>>>> downstream NS. Does this solution cover that case?
> >>>>> 3. Since there is an initial implementation, did you collect any
> >>>>> benchmarks against the current synchronous model of DFSRouter?
> >>>>> Thanks again.
> >>>>>
> >>>>> Best Regards,
> >>>>> - He Xiaoqiao
> >>>>>
> >>>>> On Tue, May 21, 2024 at 11:21 AM zhangjian <1361320...@qq.com.invalid>
> >>>>> wrote:
> >>>>>
> >>>>>> Thank you for your positive attitude towards this feature. You can
> >>>>>> debug the UTs provided in the PR to better understand how the
> >>>>>> current asynchronous calls work.
> >>>>>>
> >>>>>>> On May 21, 2024, at 02:04, Simbarashe Dzinamarira
> >>>>>>> <simbadz...@apache.org> wrote:
> >>>>>>>
> >>>>>>> Excited to see this feature as well. I'll spend more time
> >>>>>>> understanding the proposal and implementation.
> >>>>>>>
> >>>>>>> On Mon, May 20, 2024 at 7:55 AM zhangjian <1361320...@qq.com.invalid>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi, Yuanbo Liu, thank you for your interest in this feature. I
> >>>>>>>> think the difficulty of an asynchronous router lies not only in
> >>>>>>>> implementing the asynchronous functions, but also in keeping the
> >>>>>>>> code readable and reusable, so that the community can keep
> >>>>>>>> developing it. At the beginning I also planned to use the
> >>>>>>>> virtual threads you mentioned. Virtual threads can make the code
> >>>>>>>> asynchronous elegantly, but the biggest problem is that it is
> >>>>>>>> not easy to upgrade the JDK version, whether in the community or
> >>>>>>>> in an actual production environment. Therefore, I later used
> >>>>>>>> CompletableFuture, which is already supported by JDK 8, to make
> >>>>>>>> the router asynchronous. The router is stateless, and the router
> >>>>>>>> RPC process is very clear. Therefore, even if CompletableFuture
> >>>>>>>> itself is not as readable as virtual threads, if we design it
> >>>>>>>> well, we can make the asynchronous process look very clear.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On May 20, 2024, at 10:56, Yuanbo Liu <liuyuanb...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Nice to see this feature brought up. I tried to implement this
> >>>>>>>>> feature in our internal clusters, and know that it's a very
> >>>>>>>>> complicated feature. CC hdfs-dev to bring more discussion.
> >>>>>>>>> By the way, I'm not sure whether the virtual threads of newer
> >>>>>>>>> JDKs will help in this case.
> >>>>>>>>>
> >>>>>>>>> On Mon, May 20, 2024 at 10:10 AM zhangjian
> >>>>>>>>> <1361320...@qq.com.invalid> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hello everyone, currently there are some shortcomings in the
> >>>>>>>>>> RPC of the HDFS router:
> >>>>>>>>>>
> >>>>>>>>>> Currently the router's handler thread is synchronous. When the
> >>>>>>>>>> *handler* thread adds a call to connection.calls, it has to
> >>>>>>>>>> wait until the *connection* thread notifies it that the call
> >>>>>>>>>> is complete. Only after the response is put into the response
> >>>>>>>>>> queue can a new call be taken from the call queue and
> >>>>>>>>>> processed. Therefore, the concurrency of the router is limited
> >>>>>>>>>> by the number of handlers. A simple example: if the number of
> >>>>>>>>>> handlers is 1 and the maximum number of calls in the
> >>>>>>>>>> connection thread is 10, then even though the connection
> >>>>>>>>>> thread can send 10 requests to the downstream ns, the router
> >>>>>>>>>> can only process one request after another because there is
> >>>>>>>>>> only 1 handler.
> >>>>>>>>>>
> >>>>>>>>>> Since the performance of router RPC is mainly limited by the
> >>>>>>>>>> number of handlers, the most effective way to improve RPC
> >>>>>>>>>> performance currently is to increase the number of handlers.
> >>>>>>>>>> However, letting the router create a large number of handler
> >>>>>>>>>> threads also increases the number of thread switches and
> >>>>>>>>>> cannot make full use of the machine.
> >>>>>>>>>>
> >>>>>>>>>> There are usually multiple ns downstream of the router. If the
> >>>>>>>>>> handler forwards a request to an ns with poor performance, the
> >>>>>>>>>> handler has to wait for a long time.
> >>>>>>>>>> Due to the reduction of available handlers, the router's
> >>>>>>>>>> ability to handle requests for ns with normal performance is
> >>>>>>>>>> reduced. From the perspective of the client, the performance
> >>>>>>>>>> of the downstream ns of the router appears to have
> >>>>>>>>>> deteriorated at this time. We often find that the call queue
> >>>>>>>>>> of the downstream ns is not high, but the call queue of the
> >>>>>>>>>> router is very high.
> >>>>>>>>>>
> >>>>>>>>>> Therefore, although the main function of the router is to
> >>>>>>>>>> federate and handle requests from multiple NSs, the current
> >>>>>>>>>> synchronous RPC performance cannot satisfy the scenario where
> >>>>>>>>>> there are many NSs downstream of the router. Even if the
> >>>>>>>>>> concurrency of the router can be improved by increasing the
> >>>>>>>>>> number of handlers, it is still relatively slow. More threads
> >>>>>>>>>> increase the CPU context switching time, and in fact many of
> >>>>>>>>>> the handler threads are in a blocked state, which is
> >>>>>>>>>> undoubtedly a waste of thread resources. When a request enters
> >>>>>>>>>> the router, there is no guarantee that a handler will be
> >>>>>>>>>> available at that time.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Therefore, I propose asynchronous router RPC. Please see the
> >>>>>>>>>> issue https://issues.apache.org/jira/browse/HDFS-17531 for the
> >>>>>>>>>> complete solution.
> >>>>>>>>>>
> >>>>>>>>>> You can also view this PR:
> >>>>>>>>>> https://github.com/apache/hadoop/pull/6838,
> >>>>>>>>>> which is just a demo, but it completes the core asynchronous
> >>>>>>>>>> RPC function. If you think asynchronous routing is feasible,
> >>>>>>>>>> we can consider splitting this PR for easier review in the
> >>>>>>>>>> future.
> >>>>>>>>>>
> >>>>>>>>>> The PDF is attached and can also be viewed through the issue.
> >>>>>>>>>>
> >>>>>>>>>> Everyone is welcome to join the discussion!
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >>>>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>>>>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
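
For readers following the thread, here is a minimal sketch of the fan-out
pattern discussed above: one request is sent to several downstream ns and
the results are merged with CompletableFuture.allOf(), without any thread
waiting for the slowest ns. This is not the code from the PR. The class
name AsyncFanOutSketch, the method invokeConcurrentAsync, the namespace
names, and the reply strings are all hypothetical and exist only for
illustration. It uses only JDK 8 APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical illustration only; not the implementation in the PR. */
class AsyncFanOutSketch {

  /**
   * Starts one asynchronous call per namespace and returns immediately.
   * The returned future completes once every downstream call has completed.
   */
  static <R> CompletableFuture<List<R>> invokeConcurrentAsync(
      List<String> namespaces,
      Function<String, CompletableFuture<R>> asyncCall) {
    // Each call returns a future right away, so the caller is never blocked.
    List<CompletableFuture<R>> calls = namespaces.stream()
        .map(asyncCall)
        .collect(Collectors.toList());
    // allOf() completes when all calls are done; the merge step runs as a
    // callback on whichever thread completes the last call.
    return CompletableFuture
        .allOf(calls.toArray(new CompletableFuture[0]))
        .thenApply(ignored -> calls.stream()
            .map(CompletableFuture::join) // already complete, does not block
            .collect(Collectors.toList()));
  }

  public static void main(String[] args) {
    // Stand-in for the threads that talk to the downstream ns.
    ExecutorService downstream = Executors.newFixedThreadPool(2);
    try {
      CompletableFuture<List<String>> all = invokeConcurrentAsync(
          Arrays.asList("ns0", "ns1", "ns2"),
          ns -> CompletableFuture.supplyAsync(() -> "reply-from-" + ns, downstream));
      // Only this demo blocks, so that it can print the merged result.
      System.out.println(all.join());
    } finally {
      downstream.shutdown();
    }
  }
}
```

The merged list keeps the order of the namespace list, not the order in
which the replies arrive. In a real router the merge step would presumably
also need to handle per-ns failures and timeouts, which this sketch leaves
out.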