Yangze Guo created FLINK-11172:
----------------------------------
Summary: Remove the max retention time in StreamQueryConfig
Key: FLINK-11172
URL: https://issues.apache.org/jira/browse/FLINK-11172
Project: Flink
Issue Type: Improvement
Components: Table API & SQL
Affects Versions: 1.8.0
Reporter: Yangze Guo
Assignee: Yangze Guo
[Stream Query
Config|https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/query_configuration.html]
is an important and useful feature for trading off accuracy against
resource consumption when a query is executed on unbounded streaming data. This
feature was first proposed in
[FLINK-6491|https://issues.apache.org/jira/browse/FLINK-6491].
Initially, *QueryConfig* took two parameters, minIdleStateRetentionTime
and maxIdleStateRetentionTime, so that fewer timers have to be registered when
there is some freedom in when to discard state. However, this approach can cause
newer data to expire earlier than older data, and thus cause greater accuracy
loss in some cases. For example, suppose we have an unbounded keyed stream. We
process key *_a_* at _*t0*_ and key _*b*_ at _*t1*_, with *_t0 < t1_*. *_a_*
will expire at _*t0+maxIdleStateRetentionTime*_ while _*b*_ will expire at
*_t1+maxIdleStateRetentionTime_*. Now another record with key *_a_* arrives at
_*t2 (t1 < t2)*_, but _*t2+minIdleStateRetentionTime*_ <
_*t0+maxIdleStateRetentionTime*_. The state of key *_a_* will still expire
at _*t0+maxIdleStateRetentionTime*_, which is earlier than the state of key _*b*_.
According to the guideline of
[LRU|https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)],
the elements that have been most heavily used in the past few instructions are
most likely to be used heavily in the next few instructions too. The state with
key _*a*_ should therefore live longer than the state with key _*b*_. The
current approach goes against this idea.
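The scenario above can be made concrete with a minimal Python sketch. This is not Flink code: the function name `register_cleanup`, the dictionary used as keyed state, and the numeric values are all made up for illustration, and the update rule is a simplified model of the min/max behavior described above.

```python
def register_cleanup(cleanup_times, key, now, min_retention, max_retention):
    """Model of the min/max rule: only re-register when the existing
    cleanup time would violate the minimum retention time."""
    current = cleanup_times.get(key)
    if current is None or now + min_retention > current:
        # A new timer is registered at now + max_retention.
        cleanup_times[key] = now + max_retention
    return cleanup_times[key]


MIN_RETENTION = 10  # hypothetical minIdleStateRetentionTime
MAX_RETENTION = 30  # hypothetical maxIdleStateRetentionTime

cleanup_times = {}
register_cleanup(cleanup_times, "a", 0, MIN_RETENTION, MAX_RETENTION)  # t0 = 0
register_cleanup(cleanup_times, "b", 5, MIN_RETENTION, MAX_RETENTION)  # t1 = 5
# t2 = 8: t2 + min (18) < t0 + max (30), so the cleanup time of "a" is unchanged.
register_cleanup(cleanup_times, "a", 8, MIN_RETENTION, MAX_RETENTION)

print(cleanup_times)  # {'a': 30, 'b': 35}
```

Although key `a` was accessed more recently than key `b`, its state is cleaned up first (30 < 35).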
I think we now have a good chance to remove the maxIdleStateRetentionTime
argument in *StreamQueryConfig*. Below are my reasons.
* [FLINK-9423|https://issues.apache.org/jira/browse/FLINK-9423] implemented
efficient deletes for the heap-based timer service. We can leverage the delete
operation to mitigate excessive timer registration.
* The current approach can cause newer data to expire earlier than older data,
and thus cause greater accuracy loss in some cases. Users need to fine-tune the
two parameters to avoid this scenario. Directly following the idea of LRU looks
like a better solution.
So I plan to remove maxIdleStateRetentionTime and make the expiration time
depend only on _*minIdleStateRetentionTime*_.
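As a rough illustration of the proposed behavior, here is a minimal Python sketch (again not Flink code). The names `touch`, `expiry_by_key`, and `pending_timers` are hypothetical, and the set of `(time, key)` pairs merely stands in for a timer service that supports deletes. It assumes every access deletes the key's old timer and registers a new one at the access time plus the retention time.

```python
def touch(expiry_by_key, pending_timers, key, now, retention):
    """Model of the proposed rule: every access replaces the key's timer
    with one at now + retention, so expiry follows the last access."""
    old = expiry_by_key.get(key)
    if old is not None:
        # Delete the stale timer instead of leaving it to fire later.
        pending_timers.discard((old, key))
    new = now + retention
    pending_timers.add((new, key))
    expiry_by_key[key] = new
    return new


RETENTION = 10  # hypothetical minIdleStateRetentionTime

expiry_by_key = {}
pending_timers = set()
touch(expiry_by_key, pending_timers, "a", 0, RETENTION)  # t0 = 0
touch(expiry_by_key, pending_timers, "b", 5, RETENTION)  # t1 = 5
touch(expiry_by_key, pending_timers, "a", 8, RETENTION)  # t2 = 8

print(expiry_by_key)           # {'a': 18, 'b': 15}
print(sorted(pending_timers))  # [(15, 'b'), (18, 'a')]
```

In this model the more recently accessed key `a` outlives key `b`, and each key has at most one pending timer.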
cc to [~sunjincheng121], [~fhueske]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)