[ https://issues.apache.org/jira/browse/FLINK-11517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann updated FLINK-11517: ---------------------------------- Component/s: Local Runtime > Inefficient window state access when using RocksDB state backend > ---------------------------------------------------------------- > > Key: FLINK-11517 > URL: https://issues.apache.org/jira/browse/FLINK-11517 > Project: Flink > Issue Type: Bug > Components: Local Runtime > Reporter: Elias Levy > Priority: Major > > When using an aggregate function on a window with a process function and the > RocksDB state backend, state access is inefficient. > The WindowOperator calls windowState.add to merge the new element using the > aggregate function. The add method of RocksDBAggregatingState will read the > state, deserialize the state, call the aggregate function, deserialize the > state, and write it out. > If the trigger decides the window must be fired, as the the windowState.add > does not return the state, the WindowOperator must call windowState.get to > get it and pass it to the window process function, resulting in another read > and deserialization. > Finally, while the state is not passed in to the trigger, in some cases the > trigger may have a need to access the state. That is our case. As the state > is not passed to the trigger, we must read and deserialize the state one more > from within the trigger. > Thus, state must be read and deserialized three times to process a single > element. If the state is large, this can be quite costly. > > Ideally windowState.add would return the state, so that the WindowOperator > can pass it to the process function without having to read it again. > Additionally, the state would be made available to the trigger to enable more > use cases without having to go through the state descriptor again. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)