Hi everyone,

I would like to open a discussion on introducing remote compaction for 
disaggregated state[1].

Flink state backends for large-scale state rely on LSM-trees, and file 
compaction runs locally in TaskManager background threads. Because compaction 
shares the TaskManager's resources with record processing, it causes 
contention that shows up as latency spikes and unstable resource usage.

Flink 2.0 introduces disaggregated state management through the ForSt 
StateBackend[2], employing a shared DFS as primary storage. This allows ForSt 
to implement compaction-as-a-service (Remote Compaction) through dedicated 
compaction workers.

This approach cleanly separates the responsibilities of compute and storage 
nodes, and thereby further complements Flink's disaggregated architecture. 
Introducing a compaction service also aligns with the resource-pooling concept 
prevalent in the cloud-native era, and can significantly improve the resource 
efficiency and elasticity of stateful Flink jobs.

Looking forward to your comments and feedback.

Best regards,
Han Yin

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-430%3A+Remote+Compaction+For+Disaggregated+State
[2] https://cwiki.apache.org/confluence/x/R4p3EQ
