Re: [DISCUSS] FLIP-426: Grouping Remote State Access

夏瑞 Thu, 14 Mar 2024 21:08:49 -0700

Hi Jinzhong,

Batching state access is a reasonable way to reduce the amount of I/O compared 
to per-record state access. But I have some questions:


- In my opinion, we need to reduce the number of times RocksDB SST files are 
fetched from remote to local storage. The FLIP seems to batch the RocksDB 
put/get requests, and I am not sure that this will reduce the number of SST 
fetches.

- How will the I/O used by state disaggregation be monitored? The latency and 
volume of I/O on the DFS are important for performance diagnosis. Moreover, the 
volume of I/O also affects the stability of the DFS. For example, the 
throughput of the HDFS NameNode is hard to scale, and it suffers under an I/O 
flood.

- How about making the batching strategy customizable? Intuitively, an 
extremely large batch may need a lot of memory to hold the returned results and 
cause an OOM. On the other hand, if we batch keys randomly, the state storage 
may have to search all SSTs to find the results.
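
To make the last point concrete, here is a minimal sketch of one possible 
batching strategy. It is not taken from the FLIP: the function name 
bounded_sorted_batches and the parameter max_batch_size are hypothetical. It 
caps the batch size to bound the memory held by one batch of results, and it 
sorts the keys so that each batch covers a narrow key range.

```python
from typing import Iterable, Iterator, List


def bounded_sorted_batches(keys: Iterable[str], max_batch_size: int) -> Iterator[List[str]]:
    """Yield batches of at most max_batch_size keys, in sorted key order.

    Hypothetical helper for illustration only. Sorting keeps each batch within
    a narrow key range, so a store organized as sorted files (such as SSTs) is
    likely to touch fewer files per batch. The size cap bounds the memory
    needed to hold one batch of results.
    """
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    ordered = sorted(set(keys))  # de-duplicate, then order by key
    for start in range(0, len(ordered), max_batch_size):
        yield ordered[start:start + max_batch_size]


batches = list(bounded_sorted_batches(["k7", "k2", "k9", "k2", "k4", "k1"], max_batch_size=2))
print(batches)  # [['k1', 'k2'], ['k4', 'k7'], ['k9']]
```

Whether sorting actually reduces the number of SSTs touched depends on how the 
keys are distributed across files, so this is only one candidate for a 
pluggable strategy.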

Best wishes,
Rui Xia.

________________________________
From: Hangxiang Yu <master...@gmail.com>
Sent: March 15, 2024 4:04
To: xiarui0...@hotmail.com <xiarui0...@hotmail.com>
Subject: Fwd: [DISCUSS] FLIP-426: Grouping Remote State Access



---------- Forwarded message ---------
From: Jinzhong Li <lijinzhong2...@gmail.com>
Date: Thu, Mar 7, 2024 at 4:52 PM
Subject: [DISCUSS] FLIP-426: Grouping Remote State Access
To: <dev@flink.apache.org>
Cc: <yuanmei.w...@gmail.com>, <zakelly....@gmail.com>, 
<master...@gmail.com>, <fredia...@gmail.com>, <fengw...@apache.org>



Hi devs,


I'd like to start a discussion on a sub-FLIP of FLIP-423: Disaggregated State 
Storage and Management[1], which is joint work by Yuan Mei, Zakelly Lan, 
Jinzhong Li, Hangxiang Yu, Yanfei Lei and Feng Wang:

- FLIP-426: Grouping Remote State Access [2]
  <https://cwiki.apache.org/confluence/display/FLINK/FLIP-426%3A+Grouping+Remote+State+Access>

This FLIP enables retrieval of remote state data in batches to avoid 
unnecessary round-trip costs for remote access.
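
As a rough illustration of the round-trip saving, here is a minimal sketch 
that is not part of the FLIP. FakeRemoteStore and its get and multi_get 
methods are hypothetical stand-ins for remote state storage. It only counts 
requests and does not model latency or payload size.

```python
from typing import Dict, Iterable, List, Optional


class FakeRemoteStore:
    """Hypothetical in-memory stand-in for remote state storage that counts round trips."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key: str) -> Optional[str]:
        self.round_trips += 1  # one request per key
        return self._data.get(key)

    def multi_get(self, keys: Iterable[str]) -> List[Optional[str]]:
        self.round_trips += 1  # one request for the whole group of keys
        return [self._data.get(k) for k in keys]


data = {f"user-{i}": f"state-{i}" for i in range(100)}
keys = list(data)

# Per-record access: one round trip for every key.
per_record = FakeRemoteStore(data)
per_record_values = [per_record.get(k) for k in keys]

# Grouped access: the same keys are fetched in a single round trip.
grouped = FakeRemoteStore(data)
grouped_values = grouped.multi_get(keys)

print(per_record.round_trips, grouped.round_trips)  # 100 1
```

Both access patterns return the same values. Only the number of requests sent 
to the store differs.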

Please make sure you have read FLIP-423[1] to get the whole story. We'll 
discuss the details of FLIP-426[2] in this thread. For discussion of the 
overall architecture or of topics related to multiple sub-FLIPs, please post in 
the previous thread[3].

Looking forward to hearing from you!

[1] https://cwiki.apache.org/confluence/x/R4p3EQ

[2] https://cwiki.apache.org/confluence/x/TYp3EQ

[3] https://lists.apache.org/thread/ct8smn6g9y0b8730z7rp9zfpnwmj8vf0

Best,

Jinzhong Li




--
Best,
Hangxiang.
