[ https://issues.apache.org/jira/browse/FLINK-27504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yun Tang closed FLINK-27504. ---------------------------- Resolution: Information Provided > State compaction not happening with sliding window and incremental RocksDB > backend > ---------------------------------------------------------------------------------- > > Key: FLINK-27504 > URL: https://issues.apache.org/jira/browse/FLINK-27504 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends > Affects Versions: 1.14.4 > Environment: Local Flink cluster on Arch Linux. > Reporter: Alexis Sarda-Espinosa > Priority: Major > Attachments: duration_trend_52ca77c.png, duration_trend_67c76bb.png, > duration_trend_c5dd5d2.png, image-2022-05-06-10-34-35-007.png, > size_growth_52ca77c.png, size_growth_67c76bb.png, size_growth_c5dd5d2.png > > > Hello, > I'm trying to estimate an upper bound for RocksDB's state size in my > application. For that purpose, I have created a small job with faster timings > whose code you can find on GitHub: > [https://github.com/asardaes/flink-rocksdb-ttl-test]. You can see some of the > results there, but I summarize here as well: > * Approximately 20 events per second, 10 unique keys for partitioning are > pre-specified. > * Sliding window of 11 seconds with a 1-second slide. > * Allowed lateness of 11 seconds. > * State TTL configured to 1 minute and compaction after 1000 entries. > * Both window-specific and window-global state used. > * Checkpoints every 2 seconds. > * Parallelism of 4 in stateful tasks. > The goal is to let the job run and analyze state compaction behavior with > RocksDB. I should note that global state is cleaned manually inside the > functions, TTL for those is in case some keys are no longer seen in the > actual production environment. > I have been running the job on a local cluster (outside IDE), the > configuration YAML is also available in the repository. After running for > approximately 1.6 days, state size is currently 2.3 GiB (see attachments). I > understand state can retain expired data for a while, but since TTL is 1 > minute, this seems excessive to me. -- This message was sent by Atlassian Jira (v8.20.7#820007)