[ 
https://issues.apache.org/jira/browse/FLINK-37069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan updated FLINK-37069:
--------------------------------
    Description: 
Instructions:

First of all, please briefly read the related documents (still under review; the 
links will be replaced with formal ones once merged):
 * Disaggregated State Management: 
[https://github.com/apache/flink/pull/26107/files#diff-bfa19e04bb5c3487c3e9bf514d61c0fa8bb973950fb0ad0e3d4a6898a99b83e3]
 * State V2: 
[https://github.com/apache/flink/pull/26107/files#diff-5d1147987fecbda329132403c1d92384575be220092995c4be491e12b8c50cc9]
 * ForSt State Backend: 
[https://github.com/apache/flink/pull/26107/files#diff-b7c52c06f6ed4d5af6f230d11ba23ea051bf4a08c589d98392143f080c468a87]

The SQL part is verified in FLINK-37068; here we mainly focus on 
DataStream jobs and APIs.

1. Make sure you are verifying this on the release-2.0 branch, as we have fixed 
several bugs since the rc0 package.
2. Choose one example in `flink-examples-streaming`. Most of the jobs have been 
rewritten using the new API. Here we take `StateMachineExample` as an example.
3. Compile and run `StateMachineExample` in a proper environment (I suggest a 
standalone session cluster or YARN), making sure you pass the following 
command-line params (see also the end-to-end sketch after this list):
{code:bash}
./flink run xxxxxxxxx \
  --backend forst \
  --checkpoint-dir s3://your/cp/dir \
  --incremental-checkpoints true
{code}
Or set the equivalent options via `config.yaml`:
{code:yaml}
state.backend.type: forst
execution.checkpointing.incremental: true
execution.checkpointing.dir: s3://your-bucket/flink-checkpoints
{code}
4. Check that the job is running smoothly and that periodic checkpoints are 
taken successfully.
5. Stop the job and restart it from the latest checkpoint (a possible command 
sequence for Steps 4~5 is sketched after this list).
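
For reference, here is one possible end-to-end command sequence for Steps 1~3. It 
is only a sketch: the jar location assumes the usual binary-distribution layout, 
and the S3 bucket is a placeholder to replace with your own.
{code:bash}
# Step 1: check out and build the release-2.0 branch.
git clone -b release-2.0 https://github.com/apache/flink.git
cd flink
mvn clean package -DskipTests

# Step 3: from the root of a Flink distribution (e.g. build-target/ of a source
# build, or an unpacked binary release), start a standalone session cluster and
# submit the bundled StateMachineExample with the ForSt backend and incremental
# checkpointing enabled.
./bin/start-cluster.sh
./bin/flink run examples/streaming/StateMachineExample.jar \
  --backend forst \
  --checkpoint-dir s3://your-bucket/flink-checkpoints \
  --incremental-checkpoints true
{code}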

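Similarly, a possible command sequence for Steps 4~5, assuming the default REST 
port 8081, retained (externalized) checkpoints, and the standard 
`<checkpoint-dir>/<job-id>/chk-<n>` layout; the job ID and checkpoint number are 
placeholders:
{code:bash}
# Step 4: confirm the job is RUNNING and that periodic checkpoints are completing.
./bin/flink list                                      # find the <job-id> of the running job
curl http://localhost:8081/jobs/<job-id>/checkpoints  # checkpoint statistics via the REST API

# Step 5: stop the job and restart it from the latest retained checkpoint.
# Retention on cancellation is assumed, e.g. via
# execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
./bin/flink cancel <job-id>
./bin/flink run -s s3://your-bucket/flink-checkpoints/<job-id>/chk-<n> \
  examples/streaming/StateMachineExample.jar \
  --backend forst \
  --checkpoint-dir s3://your-bucket/flink-checkpoints \
  --incremental-checkpoints true
{code}
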
It would be great if you could also write your own job using the State V2 API 
and follow Steps 3~5 above. It is important to check whether there are any bugs 
in the new State APIs.

  was:
Instructions:

First of all, please briefly read the related documents (still under review; the 
links will be replaced with formal ones once merged):
* Disaggregated State Management: 
https://github.com/apache/flink/pull/26107/files#diff-bfa19e04bb5c3487c3e9bf514d61c0fa8bb973950fb0ad0e3d4a6898a99b83e3
* State V2: 
https://github.com/apache/flink/pull/26107/files#diff-5d1147987fecbda329132403c1d92384575be220092995c4be491e12b8c50cc9
* ForSt State Backend: 
https://github.com/apache/flink/pull/26107/files#diff-b7c52c06f6ed4d5af6f230d11ba23ea051bf4a08c589d98392143f080c468a87

The SQL part is verified in FLINK-37068; here we mainly focus on 
DataStream jobs and APIs.

1. Make sure you are verifying this on the master branch, as we have fixed 
several bugs since the rc0 package.
2. Choose one example in `flink-examples-streaming`. Most of the jobs have been 
rewritten using the new API. Here we take `StateMachineExample` as an example.
3. Compile and run `StateMachineExample` in a proper environment (I suggest a 
standalone session cluster or YARN), making sure you pass the following 
command-line params:
{code:bash}
./flink run xxxxxxxxx \
  --backend forst \
  --checkpoint-dir s3://your/cp/dir \
  --incremental-checkpoints true
{code}
Or set the equivalent options via `config.yaml`:
{code:yaml}
state.backend.type: forst
execution.checkpointing.incremental: true
execution.checkpointing.dir: s3://your-bucket/flink-checkpoints
{code}

4. Check that the job is running smoothly and that periodic checkpoints are 
taken successfully.
5. Stop the job and restart from the latest checkpoint.

It would be great if you could also write your own job using the State V2 API 
and follow Steps 3~5 above. It is important to check whether there are any bugs 
in the new State APIs.



> Cross-team verification for "Disaggregated State Management"
> ------------------------------------------------------------
>
>                 Key: FLINK-37069
>                 URL: https://issues.apache.org/jira/browse/FLINK-37069
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Xintong Song
>            Assignee: Zakelly Lan
>            Priority: Blocker
>             Fix For: 2.0.0
>
>
> Instructions:
> First of all, please briefly read the related documents (still under review; 
> the links will be replaced with formal ones once merged):
>  * Disaggregated State Management: 
> [https://github.com/apache/flink/pull/26107/files#diff-bfa19e04bb5c3487c3e9bf514d61c0fa8bb973950fb0ad0e3d4a6898a99b83e3]
>  * State V2: 
> [https://github.com/apache/flink/pull/26107/files#diff-5d1147987fecbda329132403c1d92384575be220092995c4be491e12b8c50cc9]
>  * ForSt State Backend: 
> [https://github.com/apache/flink/pull/26107/files#diff-b7c52c06f6ed4d5af6f230d11ba23ea051bf4a08c589d98392143f080c468a87]
> The SQL part is verified in FLINK-37068; here we mainly focus on 
> DataStream jobs and APIs.
> 1. Make sure you are verifying this on the release-2.0 branch, as we have 
> fixed several bugs since the rc0 package.
> 2. Choose one example in `flink-examples-streaming`. Most of the jobs have 
> been rewritten using the new API. Here we take `StateMachineExample` as an 
> example.
> 3. Compile and run `StateMachineExample` in a proper environment (I suggest a 
> standalone session cluster or YARN), making sure you pass the following 
> command-line params:
> {code:bash}
> ./flink run xxxxxxxxx \
>   --backend forst \
>   --checkpoint-dir s3://your/cp/dir \
>   --incremental-checkpoints true
> {code}
> Or set the equivalent options via `config.yaml`:
> {code:yaml}
> state.backend.type: forst
> execution.checkpointing.incremental: true
> execution.checkpointing.dir: s3://your-bucket/flink-checkpoints
> {code}
> 4. Check that the job is running smoothly and that periodic checkpoints are 
> taken successfully.
> 5. Stop the job and restart from the latest checkpoint.
> It would be great if you could also write your own job using the State V2 API 
> and follow Steps 3~5 above. It is important to check whether there are any 
> bugs in the new State APIs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
