[ https://issues.apache.org/jira/browse/FLINK-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360123#comment-16360123 ]
ASF GitHub Bot commented on FLINK-8297:
---------------------------------------

Github user je-ik commented on the issue:

    https://github.com/apache/flink/pull/5185

    @aljoscha I (partly) reworked this PR as you suggested. There are still some unresolved questions, though:

     1) I'm not 100% sure how to cleanly support migration between list state savepoints. Would you have any pointers on how I should address this?
     2) I haven't tested the new version on an actual Flink job yet; it just passes the tests.

    I think some more modifications will be needed, so I will test this on real data once there is agreement on the actual implementation.

    Thanks in advance for any comments!


> RocksDBListState stores whole list in single byte[]
> ---------------------------------------------------
>
>                 Key: FLINK-8297
>                 URL: https://issues.apache.org/jira/browse/FLINK-8297
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Jan Lukavský
>            Priority: Major
>
> RocksDBListState currently keeps the whole list of data in a single RocksDB
> key-value pair, which implies that the list must fit into memory.
> Larger lists are not supported and end with an OOME or another error. The
> RocksDBListState could be modified so that individual items in the list are
> stored under separate keys in RocksDB and can then be iterated over. A simple
> implementation could reuse the existing RocksDBMapState, with the list index
> as the key and a single RocksDBValueState keeping track of how many items have
> already been added to the list. Because this implementation might be less
> efficient in some cases, it would be good to make it opt-in via a construct
> like
> {{new RocksDBStateBackend().enableLargeListsPerKey()}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
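
For illustration only, the layout proposed in the issue description (one entry per list element, keyed by its index, plus a separate counter) might be sketched as follows. This is not Flink or RocksDB code: the class PerElementListSketch, its methods, and the key format are hypothetical, and a java.util.TreeMap stands in for the sorted RocksDB key space. The sketch also assumes that state keys do not contain "/".

```java
import java.util.TreeMap;

/** Hypothetical sketch: a list stored as one entry per element plus a size counter. */
final class PerElementListSketch {

    // Stand-in for RocksDB: a sorted in-memory key-value map (an assumption, not Flink code).
    private final TreeMap<String, String> store = new TreeMap<>();

    // Key under which the element count for a given state key is kept.
    private static String sizeKey(String key) {
        return key + "/size";
    }

    // Key of a single element; the index is zero-padded so that
    // lexicographic key order matches numeric index order.
    private static String itemKey(String key, long index) {
        return String.format("%s/item/%019d", key, index);
    }

    /** Number of elements appended so far for this key (0 if none). */
    long size(String key) {
        String s = store.get(sizeKey(key));
        return s == null ? 0L : Long.parseLong(s);
    }

    /** Appends one element: writes it under the next index, then bumps the counter. */
    void add(String key, String value) {
        long n = size(key);
        store.put(itemKey(key, n), value);
        store.put(sizeKey(key), Long.toString(n + 1));
    }

    /** Lazily iterates the elements in insertion order without building a full list. */
    Iterable<String> elements(String key) {
        long n = size(key);
        return store.subMap(itemKey(key, 0), true, itemKey(key, n), false).values();
    }

    /** Removes all elements and the counter for this key. */
    void clear(String key) {
        long n = size(key);
        store.subMap(itemKey(key, 0), true, itemKey(key, n), false).clear();
        store.remove(sizeKey(key));
    }
}
```

Appending touches only two small entries, and reading walks a key range instead of deserializing one large byte[]. The price is one extra counter read and write per append, which is in line with the issue's remark that this layout might be less efficient in some cases and is better offered as an opt-in.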