I see... The way we run our setup is in a Kubernetes cluster, with one cluster running one job. The total parallelism of the whole cluster equals the number of TaskManagers, where each TaskManager has 1 CPU core accounting for 1 slot. If we add state TTL, do you have any recommendation on how much I should bump the CPU per TaskManager? For example, 2 cores per TaskManager with 1 slot per TaskManager, where the second core would be used for TTL state cleanup? Or is that overkill?
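For concreteness, this is roughly how I'd expect us to enable the TTL (a minimal sketch against the Flink 1.10 API; the state name and the cleanup parameters are just illustrative, not our real config). The two cleanup* calls are the knobs I'd expect to drive the extra CPU:

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.days(7))
        // default: the TTL timestamp is refreshed on create and write
        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
        // never hand expired-but-not-yet-cleaned values back to user code
        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
        // heap backend: piggyback cleanup of up to 10 entries on each state access
        .cleanupIncrementally(10, false)
        // RocksDB backend: drop expired entries in the compaction filter,
        // re-querying the current timestamp every 1000 processed entries
        .cleanupInRocksdbCompactFilter(1000)
        .build();

// illustrative descriptor; our real state is keyed differently
ValueStateDescriptor<Long> lastSeen = new ValueStateDescriptor<>("lastSeen", Long.class);
lastSeen.enableTimeToLive(ttlConfig);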
On Thu, Mar 19, 2020 at 12:56 PM Andrey Zagrebin <azagre...@apache.org> wrote:

> Hi Matt,
>
> Generally speaking, using state with TTL in Flink should not differ a lot
> from just using Flink with state [1].
> You have to provision your system so that it can keep a state size worth
> 7 days of data.
>
> The existing Flink state backends provide background cleanup to
> automatically remove the expired state eventually,
> so that your application does not need to do any explicit access of the
> expired state to clean it.
> The background cleanup is active by default since Flink 1.10 [2].
>
> Enabling TTL for state, of course, comes at a price because you need to
> store a timestamp and spend CPU cycles on the background cleanup.
> This affects storage size and potentially processing latency per record.
> You can read about details and caveats in the docs: for heap state [3] and
> RocksDB [4].
>
> Best,
> Andrey
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state/state.html#state-time-to-live-ttl
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state/state.html#cleanup-of-expired-state
> [3] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state/state.html#incremental-cleanup
> [4] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state/state.html#cleanup-during-rocksdb-compaction
>
> On Thu, Mar 19, 2020 at 6:48 PM Matt Magsombol <raffy4...@gmail.com> wrote:
>
>> Suppose I'm using state stored in-memory that has a TTL of 7 days max.
>> Should I run into any issues with state this long, other than potential OOM?
>>
>> Let's suppose I extend this such that we add RocksDB... any concerns with
>> this with respect to maintenance?
>>
>> Most of the examples that I've been seeing seem to pair state with
>> time windows, but I'll only read from this state every 15 seconds (or some
>> small time window). After each time window, I *won't* be cleaning up the
>> data within the state because I'll need to re-look it up from this state on
>> future time windows. I'll effectively rely on TTL based on key expiration
>> time, and I was wondering what potential issues I should watch out for.
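P.S. To make the access pattern from my original question above concrete, here is a minimal sketch (the class, key type, and state name are hypothetical): the keyed state is re-read on every 15-second window firing and never explicitly cleared, so removing idle keys is left entirely to the TTL on the descriptor.

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class SevenDayCount extends ProcessWindowFunction<String, String, String, TimeWindow> {

    // per-key running count, expired by TTL rather than by explicit clear()
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        StateTtlConfig ttl = StateTtlConfig.newBuilder(Time.days(7)).build();
        ValueStateDescriptor<Long> desc = new ValueStateDescriptor<>("count", Long.class);
        desc.enableTimeToLive(ttl);
        count = getRuntimeContext().getState(desc);
    }

    @Override
    public void process(String key, Context ctx, Iterable<String> events,
                        Collector<String> out) throws Exception {
        Long current = count.value();   // null if never written or already expired
        long updated = (current == null) ? 0L : current;
        for (String ignored : events) {
            updated++;
        }
        count.update(updated);          // the write refreshes the TTL timestamp (OnCreateAndWrite)
        out.collect(key + "=" + updated);
    }
}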