[ https://issues.apache.org/jira/browse/FLINK-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370218#comment-16370218 ]
ASF GitHub Bot commented on FLINK-8639:
---------------------------------------

Github user sihuazhou commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5465#discussion_r169374194

    --- Diff: flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java ---
    @@ -400,7 +410,7 @@ public UV setValue(UV value) {
         /** An auxiliary utility to scan all entries under the given key. */
         private abstract class RocksDBMapIterator<T> implements Iterator<T> {
    -        static final int CACHE_SIZE_BASE = 1;
    +        static final int CACHE_SIZE_BASE = 32;
    --- End diff --

    Indeed, I don't think memory is really a concern (at least when CACHE_SIZE_LIMIT=128). I would almost prefer to change the CACHE_SIZE to a fixed value (like 128). What do you think?

> Fix always need to seek multiple times when iterator RocksDBMapState
> --------------------------------------------------------------------
>
>                 Key: FLINK-8639
>                 URL: https://issues.apache.org/jira/browse/FLINK-8639
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.4.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Critical
>             Fix For: 1.5.0
>
>
> Currently, almost every time we want to iterate over a RocksDBMapState we need to
> seek at least 2 times (seek is a poorly performing operation in RocksDB because
> it can't use the bloom filter). This is because `RocksDBMapIterator` uses
> `cacheEntries` to cache the entries read after each seek, and the initial size of
> `cacheEntries` is 1.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
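
For readers following the seek-count argument in the issue description, here is a rough sketch of why an initial cache size of 1 forces at least two seeks. It is not the actual `RocksDBMapIterator` code and does not use the RocksDB API: a `TreeMap` stands in for the sorted store, the class and method names (`SeekCountSketch`, `countSeeks`) are hypothetical, and the doubling of the batch size up to `CACHE_SIZE_LIMIT` is an assumption made for illustration.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical simulation; a TreeMap stands in for RocksDB. */
final class SeekCountSketch {

    // Taken from the review comment above (CACHE_SIZE_LIMIT=128).
    static final int CACHE_SIZE_LIMIT = 128;

    /**
     * Counts how many simulated seeks are needed to read every entry when
     * the first batch holds cacheSizeBase entries and each later batch
     * doubles in size (assumed growth policy), capped at CACHE_SIZE_LIMIT.
     */
    static int countSeeks(TreeMap<Integer, String> store, int cacheSizeBase) {
        int seeks = 0;
        int batchSize = cacheSizeBase;
        Integer lastKey = null;
        boolean exhausted = false;

        while (!exhausted) {
            // One simulated seek: reposition just after the last key read.
            seeks++;
            NavigableMap<Integer, String> tail =
                    (lastKey == null) ? store : store.tailMap(lastKey, false);

            int loaded = 0;
            for (Integer key : tail.keySet()) {
                if (loaded == batchSize) {
                    break;
                }
                lastKey = key;
                loaded++;
            }

            // A batch that comes back short means the end was reached.
            // A full batch gives no such signal, so another seek is needed.
            exhausted = loaded < batchSize;
            batchSize = Math.min(batchSize * 2, CACHE_SIZE_LIMIT);
        }
        return seeks;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> store = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            store.put(i, "value-" + i);
        }
        System.out.println("base=1  seeks: " + countSeeks(store, 1));
        System.out.println("base=32 seeks: " + countSeeks(store, 32));
    }
}
```

In this simplified model, a first batch of size 1 is always full whenever the map is non-empty, so a second seek is needed just to discover whether more entries exist. A first batch of 32 covers small maps in a single seek.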