[ https://issues.apache.org/jira/browse/FLINK-35853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Lee updated FLINK-35853:
------------------------------
    Description: 
We have a job with a small and static state size (states are updated in place rather than added). The job is configured to use RocksDB + full checkpointing (incremental checkpointing disabled) because the diff between checkpoints is larger than the full checkpoint size.
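
For reference, a minimal sketch of how the RocksDB backend is set up with incremental checkpointing disabled (class name and checkpoint interval here are illustrative, not taken from our job):

{code:java}
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbFullCheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // RocksDB state backend with incremental checkpointing explicitly disabled,
        // so every checkpoint is a full snapshot of the (small, static) state.
        env.setStateBackend(new EmbeddedRocksDBStateBackend(false));

        // Illustrative checkpoint interval; the attached reproducers configure their own.
        env.enableCheckpointing(10_000L);

        // Job graph and env.execute() omitted; see the attached reproducers.
    }
}
{code}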

After migrating to 1.18, we observed a significant and steady increase in checkpoint size with RocksDB + full checkpointing. The increase was not observed with the hashmap state backend.

I managed to reproduce the issue with the following code:

[^StaticStateSizeGenerator115.java]
[^StaticStateSizeGenerator118.java]
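
The attachments are not inlined above; the state access pattern they exercise is essentially "update in place" rather than "append", roughly along the lines of the sketch below (class, state and type names are illustrative, not the exact attachment code):

{code:java}
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Each key keeps overwriting a single ValueState entry, so the logical state
// size stays constant regardless of how many records are processed.
public class UpdateInPlaceFunction extends KeyedProcessFunction<String, Long, Long> {

    private transient ValueState<Long> lastValue;

    @Override
    public void open(Configuration parameters) {
        lastValue = getRuntimeContext().getState(
                new ValueStateDescriptor<>("last-value", Long.class));
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
        lastValue.update(value); // overwrite, never accumulate
        out.collect(value);
    }
}
{code}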

Results:

* On Flink 1.15 with RocksDB + full checkpointing, the checkpoint size is constant at 250KiB.
* On Flink 1.18 with RocksDB + full checkpointing, the checkpoint size grew to 38MiB before dropping (presumably due to compaction?).
* On Flink 1.18 with the hashmap state backend, the checkpoint size is constant at 219KiB.

Notes:

One observation is that the issue is more pronounced at higher parallelism; the reproduction code uses a parallelism of 8. The production application in which we first saw the regression reached checkpoint sizes in the GiB range, whereas we expected, and observed on 1.15, at most a couple of MiB.


> Regression in checkpoint size when performing full checkpointing in RocksDB
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-35853
>                 URL: https://issues.apache.org/jira/browse/FLINK-35853
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 1.18.1
>         Environment: amazon-linux-2023
>            Reporter: Keith Lee
>            Priority: Major
>         Attachments: StaticStateSizeGenerator115.java, 
> StaticStateSizeGenerator118.java
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
