[ https://issues.apache.org/jira/browse/FLINK-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096721#comment-17096721 ]
Yu Li commented on FLINK-12692: ------------------------------- The preview version has been uploaded onto flink-packages as promised, please check it out and let us know if any feedback. Thanks: https://flink-packages.org/packages/spillable-state-backend-for-flink > Support disk spilling in HeapKeyedStateBackend > ---------------------------------------------- > > Key: FLINK-12692 > URL: https://issues.apache.org/jira/browse/FLINK-12692 > Project: Flink > Issue Type: New Feature > Components: Runtime / State Backends > Reporter: Yu Li > Assignee: Yu Li > Priority: Major > Labels: pull-request-available > Fix For: 1.11.0 > > Time Spent: 10m > Remaining Estimate: 0h > > {{HeapKeyedStateBackend}} is one of the two {{KeyedStateBackends}} in Flink, > since state lives as Java objects on the heap and the de/serialization only > happens during state snapshot and restore, it outperforms > {{RocksDBKeyedStateBackend}} when all data could reside in memory. > However, along with the advantage, {{HeapKeyedStateBackend}} also has its > shortcomings, and the most painful one is the difficulty to estimate the > maximum heap size (Xmx) to set, and we will suffer from GC impact once the > heap memory is not enough to hold all state data. There’re several > (inevitable) causes for such scenario, including (but not limited to): > * Memory overhead of Java object representation (tens of times of the > serialized data size). > * Data flood caused by burst traffic. > * Data accumulation caused by source malfunction. > To resolve this problem, we propose a solution to support spilling state data > to disk before heap memory is exhausted. We will monitor the heap usage and > choose the coldest data to spill, and reload them when heap memory is > regained after data removing or TTL expiration, automatically. > More details please refer to the design doc and mailing list discussion. -- This message was sent by Atlassian Jira (v8.3.4#803005)