[ https://issues.apache.org/jira/browse/FLINK-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488360#comment-16488360 ]
ASF GitHub Bot commented on FLINK-8790: --------------------------------------- Github user sihuazhou commented on the issue: https://github.com/apache/flink/pull/5582 Unfortunately, after confirming with RocksDB, the `deleteRange()` is still an experimental feature, it may have impact on read performance currently(event thought we could use the ReadOption to reduce the impaction). In practice, I tested the impact of read performance of `deleteRange()` in our case (only delete 2 ranges at most), I didn't find any impact in fact. And the TiKV has already used it to delete entire shards. But, to be on the safe side, I think the current PR should be frozen, but I think the implementation base on `deleteRange()` in this PR should be a better implementation(especially when user scaling up the job, in that case we only need to clip the RocksDB without iterating any records, a super fast way) if the `deleteRange()` is no longer a feature of experimental. Anyways, even although we can't use the `deleteRange()` currently, but we can still improve the performance of the incremental checkpoint in somehow. We can improve it the by the follow way: if one of the state handle's key-group is a sub-range of the target key-group range. we can open it directly to prevent the overhead of iterating it. @StefanRRichter What do you think? If you don't object this, I will update the PR follow the above approach. > Improve performance for recovery from incremental checkpoint > ------------------------------------------------------------ > > Key: FLINK-8790 > URL: https://issues.apache.org/jira/browse/FLINK-8790 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing > Affects Versions: 1.5.0 > Reporter: Sihua Zhou > Assignee: Sihua Zhou > Priority: Major > Fix For: 1.6.0 > > > When there are multi state handle to be restored, we can improve the > performance as follow: > 1. Choose the best state handle to init the target db > 2. Use the other state handles to create temp db, and clip the db according > to the target key group range (via rocksdb.deleteRange()), this can help use > get rid of the `key group check` in > `data insertion loop` and also help us get rid of traversing the useless > record. -- This message was sent by Atlassian JIRA (v7.6.3#76005)