[DISCUSS] FLIP-423 ～FLIP-428: Introduce Disaggregated State Storage and Management in Flink 2.0

Yuan Mei Wed, 28 Feb 2024 22:17:11 -0800

Hi Devs,

This is a joint work of Yuan Mei, Zakelly Lan, Jinzhong Li, Hangxiang Yu,
Yanfei Lei and Feng Wang. We'd like to start a discussion about introducing
Disaggregated State Storage and Management in Flink 2.0.


The past decade has witnessed a dramatic shift in Flink's deployment mode,
workload patterns, and hardware improvements. We've moved from the
map-reduce era where workers are computation-storage tightly coupled nodes
to a cloud-native world where containerized deployments on Kubernetes
become standard. To enable Flink's Cloud-Native future, we introduce
Disaggregated State Storage and Management that uses DFS as primary storage
in Flink 2.0, as promised in the Flink 2.0 Roadmap.

Design Details can be found in FLIP-423[1].

This new architecture is aimed to solve the following challenges brought in
the cloud-native era for Flink.
1. Local Disk Constraints in containerization
2. Spiky Resource Usage caused by compaction in the current state model
3. Fast Rescaling for jobs with large states (hundreds of Terabytes)
4. Light and Fast Checkpoint in a native way

More specifically, we want to reach a consensus on the following issues in
this discussion:

1. Overall design
2. Proposed Changes
3. Design details to achieve Milestone1

In M1, we aim to achieve an end-to-end baseline version using DFS as
primary storage and complete core functionalities, including:

- Asynchronous State APIs (FLIP-424)[2]: Introduce new APIs for
asynchronous state access.
- Asynchronous Execution Model (FLIP-425)[3]: Implement a non-blocking
execution model leveraging the asynchronous APIs introduced in FLIP-424.
- Grouping Remote State Access (FLIP-426)[4]: Enable retrieval of remote
state data in batches to avoid unnecessary round-trip costs for remote
access
- Disaggregated State Store (FLIP-427)[5]: Introduce the initial version of
the ForSt disaggregated state store.
- Fault Tolerance/Rescale Integration (FLIP-428)[6]: Integrate
checkpointing mechanisms with the disaggregated state store for fault
tolerance and fast rescaling.

We will vote on each FLIP in separate threads to make sure each FLIP
reaches a consensus. But we want to keep the discussion within a focused
thread (this thread) for easier tracking of contexts to avoid duplicated
questions/discussions and also to think of the problem/solution in a full
picture.

Looking forward to your feedback

Best,
Yuan, Zakelly, Jinzhong, Hangxiang, Yanfei and Feng

[1] https://cwiki.apache.org/confluence/x/R4p3EQ
[2] https://cwiki.apache.org/confluence/x/SYp3EQ
[3] https://cwiki.apache.org/confluence/x/S4p3EQ
[4] https://cwiki.apache.org/confluence/x/TYp3EQ
[5] https://cwiki.apache.org/confluence/x/T4p3EQ
[6] https://cwiki.apache.org/confluence/x/UYp3EQ

[DISCUSS] FLIP-423 ～FLIP-428: Introduce Disaggregated State Storage and Management in Flink 2.0

Reply via email to