[ https://issues.apache.org/jira/browse/FLINK-30217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644794#comment-17644794 ]
xljtswf commented on FLINK-30217:
---------------------------------

For keyed state, I found that this pattern almost always occurs in org.apache.flink.streaming.runtime.operators.windowing.MergingWindowSet#persist. Every time an element arrives with a timestamp later than every element already in the session window, the mapping changes.

> Use ListState#update() to replace clear + add mode.
> ---------------------------------------------------
>
>                 Key: FLINK-30217
>                 URL: https://issues.apache.org/jira/browse/FLINK-30217
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>            Reporter: xljtswf
>            Priority: Major
>
> When using ListState, I found that we often need to clear the current state
> and then add new values. This is especially common in
> CheckpointedFunction#snapshotState, and it is slower than just using
> ListState#update().
> Suppose we want to update the ListState to contain value1, value2, value3.
> With the current implementation, we first call ListState#clear(). This updates
> the state 1 time.
> Then we add value1, value2, value3 to the state.
> If we use heap state, we need to search the stateTable 3 times and add 3
> values to the list.
> This happens in memory and is not too bad.
> If we use RocksDB, we call backend.db.merge() 3 times.
> In total, we update the state 4 times.
> The more values there are to add, the more times we update the state.
> If we use ListState#update() instead, we only need to update the state 1
> time. I think this can save a lot of time.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
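
The write counts in the issue description can be illustrated with a small, self-contained sketch. CountingListState below is a hypothetical stand-in, not Flink's ListState. It assumes that each clear(), add(), and update() call costs exactly one backend write, which is how the description counts them for RocksDB. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ListStateWriteCount {

    /** Hypothetical stand-in for a list state that counts backend writes. Not Flink's API. */
    static class CountingListState<T> {
        private final List<T> values = new ArrayList<>();
        private int writes = 0;

        // Assumption: each of these three operations costs one backend write.
        void clear() { values.clear(); writes++; }
        void add(T value) { values.add(value); writes++; }
        void update(List<T> newValues) {
            values.clear();
            values.addAll(newValues);
            writes++;
        }

        List<T> get() { return new ArrayList<>(values); }
        int writes() { return writes; }
    }

    /** clear + add mode: one write for clear(), plus one per added value. */
    public static int writesWithClearAndAdd(List<String> newValues) {
        CountingListState<String> state = new CountingListState<>();
        state.clear();
        for (String v : newValues) {
            state.add(v);
        }
        return state.writes();
    }

    /** update mode: a single write regardless of how many values are stored. */
    public static int writesWithUpdate(List<String> newValues) {
        CountingListState<String> state = new CountingListState<>();
        state.update(newValues);
        return state.writes();
    }

    public static void main(String[] args) {
        List<String> values = List.of("value1", "value2", "value3");
        System.out.println("clear + add: " + writesWithClearAndAdd(values) + " writes");
        System.out.println("update:      " + writesWithUpdate(values) + " write");
    }
}
```

With three values, the clear + add path reports 4 writes and the update path reports 1, matching the counts in the description. The gap grows linearly with the number of values.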