Hi Stefan,

Thank you for the confirmation.

Doing a one-time cleanup with a full snapshot and upgrading to Flink 1.8
could work. However, in our case the state is quite large (TBs). Taking a
savepoint takes over an hour, during which we either have to pause the job
or let it keep processing events. The JavaDoc of `cleanupFullSnapshot` [1]
says "Cleanup expired state in full snapshot on checkpoint." My
understanding is that the only way to take a full snapshot with the RocksDB
backend is to take a savepoint. Is there another way to take a full
checkpoint?

I noticed that Flink 1.8 also added an incremental cleanup strategy [2],
which iterates through a few keys on each state access. If I combine this
with the new compaction filter cleanup strategy, will it eventually remove
all expired state without taking a full snapshot for the upgrade?

[1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/StateTtlConfig.Builder.html#cleanupFullSnapshot--
[2] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/StateTtlConfig.Builder.html#cleanupIncrementally-int-boolean-

Thanks,
Ning

On Wed, Mar 13, 2019 at 11:22 AM Stefan Richter <s.rich...@ververica.com> wrote:
>
> Hi,
>
> If you are worried about old state, you can combine the compaction filter
> based TTL with other cleanup strategies (see docs). For example, setting
> `cleanupFullSnapshot` when you take a savepoint it will be cleared of any
> expired state and you can then use it to bring it into Flink 1.8.
>
> Best,
> Stefan
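
P.S. For concreteness, below is a minimal sketch of the combined
configuration I have in mind. The 7-day TTL, the cleanupIncrementally
parameters, and the state name are placeholders, and I am assuming the 1.8
builder exposes a no-arg cleanupInRocksdbCompactFilter():

    import org.apache.flink.api.common.state.StateTtlConfig;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.time.Time;

    // 7 days is only a placeholder for our actual retention period.
    StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.days(7))
        // Incremental cleanup: check up to 10 entries per state access,
        // without an extra cleanup pass for every processed record
        // (both parameters are guesses on my part).
        .cleanupIncrementally(10, false)
        // RocksDB compaction filter cleanup added in 1.8; my reading of
        // the 1.8 API is that this builder method takes no arguments.
        .cleanupInRocksdbCompactFilter()
        .build();

    // Attach the TTL config to a state descriptor (name/type are examples).
    ValueStateDescriptor<Long> lastSeen =
        new ValueStateDescriptor<>("lastSeen", Long.class);
    lastSeen.enableTimeToLive(ttlConfig);

If I read the 1.8 docs correctly, the compaction filter also has to be
enabled globally via state.backend.rocksdb.ttl.compaction.filter.enabled
in flink-conf.yaml, since it is off by default; please correct me if that
is not the case.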
Doing a one time cleanup with full snapshot and upgrading to Flink 1.8 could work. However, in our case, the state is quite large (TBs). Taking a savepoint takes over an hour, during which we have to pause the job or it may process more events. The JavaDoc of `cleanupFullSnapshot` [1] says "Cleanup expired state in full snapshot on checkpoint.". My understanding is that the only way to take a full snapshot with RocksDB backend is to take a savepoint. Is there another way to take a full checkpoint? I noticed that Flink 1.8 also added an incremental cleanup strategy [2] by iterating through several keys at a time for each state access. If I combine this with the new compaction filter cleanup strategy, will it eventually remove all expired state without taking a full snapshot for upgrade? [1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/StateTtlConfig.Builder.html#cleanupFullSnapshot-- [2] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/state/StateTtlConfig.Builder.html#cleanupIncrementally-int-boolean- Thanks, Ning On Wed, Mar 13, 2019 at 11:22 AM Stefan Richter <s.rich...@ververica.com> wrote: > > Hi, > > If you are worried about old state, you can combine the compaction filter > based TTL with other cleanup strategies (see docs). For example, setting > `cleanupFullSnapshot` when you take a savepoint it will be cleared of any > expired state and you can then use it to bring it into Flink 1.8. > > Best, > Stefan