[ https://issues.apache.org/jira/browse/FLINK-37109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17912769#comment-17912769 ]
Gabor Somogyi commented on FLINK-37109:
---------------------------------------

[~lamgary] Thanks for investing so much time in this issue. We're facing this problem as well and intend to help with it. Sadly, I'm under pressure with other tasks in the upcoming weeks, but I intend to come back to it. If you can prepare a PR with an in-depth analysis (flame graphs, an explanation, test apps, etc.), that would help.

> Improve state processor API performance when reading keyed rocksdb state by
> allowing duplicates
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-37109
>                 URL: https://issues.apache.org/jira/browse/FLINK-37109
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / State Processor
>            Reporter: Gary Lam
>            Priority: Minor
>
> Could we allow for duplicates via a flag when reading keyed rocksdb state, to
> improve performance?
> From the [mailing list
> discussion,|https://www.mail-archive.com/user@flink.apache.org/msg43863.html]
> when the state processor api reads from state, it does multiple reads/writes
> to avoid duplicates:
>
> {code:java}
> The trick we perform is to delete keys from rocksDB after each read, so we
> can do full table scans on all column families but never see any
> duplicates.{code}
>
> In my application, which has a keyed state of size ~200GB, I have found it
> takes >4 hours to iterate the entire state. Doing a CPU profile, 70% of the
> time is spent on the `remove()` rocksdb call.
> If I comment out [this
> line|https://github.com/apache/flink/blob/26436ac27ae9e4705910b0502abb5bdd33ec686b/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/input/KeyedStateInputFormat.java#L229]
> `keysAndNamespaces.remove();`, I can read the entire state in <15 minutes,
> and my particular application (trying to detect outliers in the state) is
> robust to duplicates.
> Thus if we allow this to be a user configurable flag (to skip deduplication)
> it would give a performance boost to users who don't care about
> deduplication.
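
To make the trade-off in the description concrete, here is a minimal, self-contained sketch of the delete-after-read trick and of what a flag to skip it would change. It is a toy in-memory model, not Flink or RocksDB code: the class `DedupScanSketch`, the method `scanKeys`, the parameter `skipDeduplication`, and the column family names are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: each "column family" is a sorted in-memory map.
// None of these names come from Flink or RocksDB.
class DedupScanSketch {

    /** Builds a small store in which key "k2" lives in two column families. */
    static Map<String, TreeMap<String, String>> sampleStore() {
        Map<String, TreeMap<String, String>> store = new TreeMap<>();
        store.put("cf-a", new TreeMap<>(Map.of("k1", "a1", "k2", "a2")));
        store.put("cf-b", new TreeMap<>(Map.of("k2", "b2", "k3", "b3")));
        return store;
    }

    /**
     * Full scan over every column family, returning the keys in the order they are read.
     * With skipDeduplication == false, each key is deleted from the store right after
     * it is read, so a later column family can no longer yield it again.
     * With skipDeduplication == true (the hypothetical flag), nothing is deleted and a
     * key is returned once per column family that contains it.
     */
    static List<String> scanKeys(Map<String, TreeMap<String, String>> store,
                                 boolean skipDeduplication) {
        List<TreeMap<String, String>> families = new ArrayList<>(store.values());
        List<String> emitted = new ArrayList<>();
        for (int i = 0; i < families.size(); i++) {
            Iterator<String> it = families.get(i).keySet().iterator();
            while (it.hasNext()) {
                String key = it.next();
                emitted.add(key);
                if (!skipDeduplication) {
                    // The extra write per read: drop the key from the family being
                    // scanned and from every family that has not been scanned yet.
                    it.remove();
                    for (int j = i + 1; j < families.size(); j++) {
                        families.get(j).remove(key);
                    }
                }
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(scanKeys(sampleStore(), false)); // prints [k1, k2, k3]
        System.out.println(scanKeys(sampleStore(), true));  // prints [k1, k2, k2, k3]
    }
}
```

In this model the deduplicating scan pays one or more deletes per key read, which is the kind of cost the profile above attributes to `remove()`. The flagged scan does no writes, and the caller has to tolerate seeing `k2` twice.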