Hi Jinzhong,

Thanks for your reply.

I raised this point because, according to the official RocksDB documentation https://rocksdb.org/blog/2022/10/07/asynchronous-io-in-rocksdb.html, turning on async_io and using MultiGet can improve the performance of point lookups by 100%. Moreover, Flink SQL jobs in particular often access state through mini-batch, so I believe this feature would also greatly benefit the synchronous access mode and is worth doing.

I think it is fine to support batch access for the asynchronous model first. My point is: should we consider whether the design can be easily extended if we support the synchronous model in the future?
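To make the extension question a bit more concrete, here is a rough sketch of what grouping point lookups into batched reads could look like on a synchronous path. It is illustrative only and written in Python for brevity: `InMemoryStore`, `multi_get`, and `batched_lookup` are hypothetical names, not Flink or RocksDB APIs, and the `round_trips` counter is just a stand-in for the cost of one call to the state store.

```python
from typing import Dict, Iterable, List, Optional


class InMemoryStore:
    """Hypothetical stand-in for a state store; counts calls as round trips."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key: str) -> Optional[str]:
        # One round trip per single-key lookup.
        self.round_trips += 1
        return self._data.get(key)

    def multi_get(self, keys: List[str]) -> List[Optional[str]]:
        # One round trip for the whole batch, results in key order.
        self.round_trips += 1
        return [self._data.get(k) for k in keys]


def batched_lookup(
    store: InMemoryStore, keys: Iterable[str], batch_size: int
) -> Dict[str, Optional[str]]:
    """Group keys into batches of at most batch_size and issue one multi_get per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    result: Dict[str, Optional[str]] = {}
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == batch_size:
            result.update(zip(batch, store.multi_get(batch)))
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        result.update(zip(batch, store.multi_get(batch)))
    return result
```

With four keys and a batch size of two, this sketch issues two calls to the store instead of four. If the grouping logic is kept independent of how the batch is executed, the same component could in principle serve both the asynchronous and a future synchronous model.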
Jinzhong Li <lijinzhong2...@gmail.com> wrote on Tue, Mar 19, 2024 at 20:59:

> Hi Yue,
>
> Thanks for your feedback!
>
> > 1. Does Grouping Remote State Access only support asynchronous interfaces?
> > --If it is: IIUC, MultiGet can also greatly improve performance for
> > synchronous access modes. Do we need to support it?
>
> Yes. If we want to support MultiGet on existing synchronous access mode, we
> have to introduce a grouping component akin to the AEC described in
> FLIP-425[1].
> I think such a change would introduce additional complexity to the current
> synchronous model, and the extent of performance gains remains uncertain.
> Therefore, I recommend only asynchronous interfaces support "Grouping
> Remote State Access", which is designed to efficiently minimize latency in
> accessing remote state storage.
>
> > 2. Can a simple example be added to FLIP on how to use Batch to access
> > states and obtain the results of states on the API?
>
> Sure. I have added a code example in the FLIP[2]. Note that the multiget in
> this FLIP is an internal interface, not a user-facing interface.
>
> > 3. I also agree with XiaoRui's viewpoint. Is there a corresponding Config
> > to control the state access batch strategy?
>
> Yes, we would offer some configurable options that allow users to adjust
> the behavior of batching and grouping state access (e.g. batching size,
> etc.).
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-425%3A+Asynchronous+Execution+Model
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-426%3A+Grouping+Remote+State+Access#FLIP426:GroupingRemoteStateAccess-CodeExampleonHowtoAccessStateUsingBatch
>
> Best,
> Jinzhong Li
>
>
> On Tue, Mar 19, 2024 at 5:52 PM yue ma <mayuefi...@gmail.com> wrote:
>
> > Hi Jinzhong,
> >
> > Thanks for the FLIP. I have the following questions:
> >
> > 1. Does Grouping Remote State Access only support asynchronous interfaces?
> > --If it is: IIUC, MultiGet can also greatly improve performance for
> > synchronous access modes. Do we need to support it?
> > --If not, how can we distinguish between using Grouping State Access in
> > asynchronous and synchronous modes?
> > 2. Can a simple example be added to FLIP on how to use Batch to access
> > states and obtain the results of states on the API?
> > 3. I also agree with XiaoRui's viewpoint. Is there a corresponding Config
> > to control the state access batch strategy?
> >
> > --
> > Best,
> > Yue
>

--
Best,
Yue