[ https://issues.apache.org/jira/browse/FLINK-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16364082#comment-16364082 ]
ASF GitHub Bot commented on FLINK-8297: --------------------------------------- Github user je-ik commented on the issue: https://github.com/apache/flink/pull/5185 @aljoscha I updated the title. I'm a little concerned about the serialization in savepoint. If the serialization is *exactly* the same, doesn't that actually mean that again, the whole List will be stored in single byte[], which will OOME for cases which the user wanted to solve by activating the "large list" implementation? Or am I missing something? > RocksDBListState stores whole list in single byte[] > --------------------------------------------------- > > Key: FLINK-8297 > URL: https://issues.apache.org/jira/browse/FLINK-8297 > Project: Flink > Issue Type: Improvement > Components: Core > Affects Versions: 1.4.0, 1.3.2 > Reporter: Jan Lukavský > Priority: Major > > RocksDBListState currently keeps whole list of data in single RocksDB > key-value pair, which implies that the list actually must fit into memory. > Larger lists are not supported and end up with OOME or other error. The > RocksDBListState could be modified so that individual items in list are > stored in separate keys in RocksDB and can then be iterated over. A simple > implementation could reuse existing RocksDBMapState, with key as index to the > list and a single RocksDBValueState keeping track of how many items has > already been added to the list. Because this implementation might be less > efficient in come cases, it would be good to make it opt-in by a construct > like > {{new RocksDBStateBackend().enableLargeListsPerKey()}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)