Hi Jinzhong, +1 for the FLIP. I have the following comments:

- Do we have a fallback mechanism for filesystems that do not support
  multiget?

- Also, in the case of multiget what is the granularity of error handling
  or retry semantics (e.g., one subrequest fails in multiget). Do we fully
  rely on RocksDB's semantics in this case?

- From my understanding AEC and especially its grouping component performs
  computationally expensive operations. Basically, operators offload async
  requests to AEC, and it performs serialization/ordering/optimizations on
  these requests before submitting them. How do we plan to separate/allocate
  the resource utilization of this component? Because AEC receives
  (potentially many) requests from operators, it can easily be a bottleneck
  to the whole pipeline.

Regards,
Jeyhun

On Thu, Mar 7, 2024 at 9:53 AM Jinzhong Li <lijinzhong2...@gmail.com> wrote:

> Hi devs,
>
> I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated
> State Storage and Management[1], which is a joint work of Yuan Mei, Zakelly
> Lan, Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang:
>
> - FLIP-426: Grouping Remote State Access
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-426%3A+Grouping+Remote+State+Access>
> [2]
>
> This FLIP enables retrieval of remote state data in batches to avoid
> unnecessary round-trip costs for remote access.
>
> Please make sure you have read the FLIP-423[1] to know the whole story, and
> we'll discuss the details of FLIP-424[2] under this mail. For the
> discussion of overall architecture or topics related with multiple
> sub-FLIPs, please post in the previous mail[3].
>
> Looking forward to hearing from you!
>
> [1] https://cwiki.apache.org/confluence/x/R4p3EQ
>
> [2] https://cwiki.apache.org/confluence/x/TYp3EQ
>
> [3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0
>
> Best,
>
> Jinzhong Li