[ https://issues.apache.org/jira/browse/FLINK-9637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568175#comment-16568175 ]
ASF GitHub Bot commented on FLINK-9637:
---------------------------------------

azagrebin commented on a change in pull request #6379: [FLINK-9637] Add public user documentation for state TTL feature
URL: https://github.com/apache/flink/pull/6379#discussion_r207536863

##########
File path: docs/dev/stream/state/state.md
##########
@@ -266,6 +266,92 @@ a `ValueState`. Once the count reaches 2 it will emit the average and clear the
 we start over from `0`. Note that this would keep a different state value for each different input key if we had tuples with different values in the first field.
+### State time-to-live (TTL)
+
+A time-to-live (TTL) can be assigned to the keyed state value.
+In this case it will expire after the configured TTL
+and its stored value will be cleaned up based on the best effort.
+Depending on configuration, the expired state can become unavailable for read access
+even if it is not cleaned up yet. In this case it behaves as if it does not exist any more.
+
+The collection types of state support TTL on entry level:
+separate list elements and map entries expire independently.
+
+The behaviour of state with TTL firstly should be configured by building `StateTtlConfiguration`:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StateTtlConfiguration ttlConfig = StateTtlConfiguration
+    .newBuilder(Time.seconds(1))
+    .setTtlUpdateType(StateTtlConfiguration.TtlUpdateType.OnCreateAndWrite)
+    .setStateVisibility(StateTtlConfiguration.TtlStateVisibility.NeverReturnExpired)
+    .build();
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val ttlConfig = StateTtlConfiguration
+    .newBuilder(Time.seconds(1))
+    .setTtlUpdateType(StateTtlConfiguration.TtlUpdateType.OnCreateAndWrite)
+    .setStateVisibility(StateTtlConfiguration.TtlStateVisibility.NeverReturnExpired)
+    .build()
+{% endhighlight %}
+</div>
+</div>
+
+It has several options to consider.
+The first parameter of `newBuilder` method is mandatory, it is a value of time-to-live itself.
+
+The update type configures when the time-to-live of state value is prolonged (default `OnCreateAndWrite`):
+
+ - `StateTtlConfiguration.TtlUpdateType.OnCreateAndWrite` - only on creation and write access,
+ - `StateTtlConfiguration.TtlUpdateType.OnReadAndWrite` - also on read access.
+
+The state visibility configures whether the expired value is returned on read access
+if it is not cleaned up yet (default `NeverReturnExpired`):
+
+ - `StateTtlConfiguration.TtlStateVisibility.NeverReturnExpired` - expired value is never returned,
+ - `StateTtlConfiguration.TtlStateVisibility.ReturnExpiredIfNotCleanedUp` - returned if still available.
+
+The TTL can be enabled in descriptor for any type of state:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+StateTtlConfiguration ttlConfig = StateTtlConfiguration.newBuilder(Time.seconds(1)).build();
+ValueStateDescriptor<String> stateDescriptor = new ValueStateDescriptor<>("text state", String.class);
+stateDescriptor.enableTimeToLive(ttlConfig);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val ttlConfig = StateTtlConfiguration.newBuilder(Time.seconds(1)).build()
+val stateDescriptor = new ValueStateDescriptor[String]("text state", classOf[String])
+stateDescriptor.enableTimeToLive(ttlConfig)
+{% endhighlight %}
+</div>
+</div>
+
+**Notes:**
+
+- The state backends store the timestamp of last modification along with the user value,
+which means that enabling this feature increases consumption of state storage.
+
+- As of current implementation the state storage is cleaned up of expired value
+only on its explicit read access per key, e.g. calling `ValueState.value()`.
+This might change in future releases, e.g. additional strategies might be added in background to speed up cleanup.

Review comment:
I think we still want to improve this, because FLINK-9938 is limited: at the moment it cleans up only the snapshot data, not the local data used by the running pipeline.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add public user documentation for TTL feature
> ---------------------------------------------
>
> Key: FLINK-9637
> URL: https://issues.apache.org/jira/browse/FLINK-9637
> Project: Flink
> Issue Type: Sub-task
> Components: State Backends, Checkpointing
> Affects Versions: 1.6.0
> Reporter: Andrey Zagrebin
> Assignee: Andrey Zagrebin
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.6.0
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
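
For readers following the quoted documentation, the interplay of update type, state visibility, and cleanup on explicit read can be sketched with a toy, single-key model in plain Java. This is not Flink code and not how the state backends are implemented: the class `TtlValueSketch`, its enums, the `isStored` helper, and the rule that a value expires once `now - lastTimestamp >= ttlMillis` are all assumptions made for illustration. Time is passed in explicitly so the behaviour is easy to trace.

```java
/** Toy, single-key model of the TTL semantics in the quoted docs. Hypothetical; not Flink's implementation. */
class TtlValueSketch<T> {

    /** Mirrors the two update types named in the docs (names here are illustrative). */
    enum UpdateType { ON_CREATE_AND_WRITE, ON_READ_AND_WRITE }

    /** Mirrors the two visibility options named in the docs (names here are illustrative). */
    enum Visibility { NEVER_RETURN_EXPIRED, RETURN_EXPIRED_IF_NOT_CLEANED_UP }

    private final long ttlMillis;
    private final UpdateType updateType;
    private final Visibility visibility;

    private T value;
    private long lastTimestamp; // stored next to the user value, hence the extra storage cost
    private boolean stored;

    TtlValueSketch(long ttlMillis, UpdateType updateType, Visibility visibility) {
        this.ttlMillis = ttlMillis;
        this.updateType = updateType;
        this.visibility = visibility;
    }

    /** Creation and write access always restart the TTL. */
    void update(T newValue, long now) {
        value = newValue;
        lastTimestamp = now;
        stored = true;
    }

    /** Read access: expired data is cleaned up here, on explicit read, and nowhere else. */
    T value(long now) {
        if (!stored) {
            return null;
        }
        boolean expired = now - lastTimestamp >= ttlMillis; // assumed expiry rule
        if (expired) {
            T stale = value;
            value = null;
            stored = false; // cleanup triggered by this read
            return visibility == Visibility.RETURN_EXPIRED_IF_NOT_CLEANED_UP ? stale : null;
        }
        if (updateType == UpdateType.ON_READ_AND_WRITE) {
            lastTimestamp = now; // a read also prolongs the TTL
        }
        return value;
    }

    /** True while the value still occupies storage, expired or not. */
    boolean isStored() {
        return stored;
    }

    public static void main(String[] args) {
        TtlValueSketch<String> state = new TtlValueSketch<>(
                1000L, UpdateType.ON_CREATE_AND_WRITE, Visibility.NEVER_RETURN_EXPIRED);
        state.update("hello", 0L);
        System.out.println(state.value(500L));  // read within the TTL
        System.out.println(state.isStored());   // still occupies storage
        System.out.println(state.value(1500L)); // read after the TTL has passed
        System.out.println(state.isStored());   // removed by the read above
    }
}
```

In this model an expired value that is never read stays in storage indefinitely, which is the limitation the last documentation note and the review comment are discussing.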