[ https://issues.apache.org/jira/browse/FLINK-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot reassigned FLINK-9673:
-------------------------------------

    Assignee:     (was: vinoyang)

> Improve State efficiency of bounded OVER window operators
> ---------------------------------------------------------
>
>                 Key: FLINK-9673
>                 URL: https://issues.apache.org/jira/browse/FLINK-9673
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Legacy Planner
>            Reporter: Fabian Hueske
>            Priority: Major
>              Labels: auto-unassigned
>
> Currently, the implementations of bounded OVER window aggregations store the
> complete input rows for the bounded interval. For example, for the query:
> {code:java}
> SELECT user_id, count(action) OVER (PARTITION BY user_id ORDER BY rowtime
>   RANGE INTERVAL '14' DAY PRECEDING) action_count, rowtime
> FROM
>   (SELECT rowtime, user_id, action, val1, val2, val3, val4 FROM user)
> {code}
> The complete records with schema {{(rowtime, user_id, action, val1, val2, val3,
> val4)}} are kept in state for 14 days so that they can be retracted from the
> accumulators once they fall out of the window.
> However, it would be sufficient to store only those fields that are required
> for the aggregations, i.e., {{action}} in the example above. All other fields
> could be set to {{null}}, which would significantly reduce the amount of data
> that needs to be stored in state.
> This improvement can be applied to all four combinations of bounded
> [rowtime|proctime] [range|rows] OVER windows.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
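
The idea in the issue description can be illustrated with a minimal, standalone Java sketch. It is not Flink code and not the planner's implementation: the class StateProjectionSketch, the method projectForState, and the Object[] row representation are hypothetical names chosen for illustration. The sketch also assumes that the operator tracks each row's timestamp separately (for example as the key of its state), so that only the aggregated field, action, has to survive inside the stored row.

```java
import java.util.Arrays;

public class StateProjectionSketch {

    /**
     * Returns a copy of the row in which every field that is not listed in
     * requiredFields is replaced by null. The arity is unchanged, so field
     * positions stay valid for code that later reads the stored row.
     */
    public static Object[] projectForState(Object[] row, int[] requiredFields) {
        Object[] projected = new Object[row.length]; // every slot starts as null
        for (int idx : requiredFields) {
            projected[idx] = row[idx];
        }
        return projected;
    }

    public static void main(String[] args) {
        // Schema from the example: (rowtime, user_id, action, val1, val2, val3, val4)
        Object[] input = {1530000000000L, "u42", "click", 1, 2, 3, 4};

        // Assumption: count(action) only reads field index 2.
        int[] required = {2};

        Object[] stored = projectForState(input, required);
        System.out.println(Arrays.toString(stored));
        // prints: [null, null, click, null, null, null, null]
    }
}
```

Keeping the row arity and nulling unused fields, rather than building a narrower row, means the retraction logic can keep addressing fields by their original positions; how much state this saves in practice depends on the serializer's encoding of null fields.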