Gary Lam created FLINK-37109:
--------------------------------

             Summary: Increase state processor API performance when reading keyed rocksdb state
                 Key: FLINK-37109
                 URL: https://issues.apache.org/jira/browse/FLINK-37109
             Project: Flink
          Issue Type: Improvement
          Components: API / State Processor
            Reporter: Gary Lam


Could we allow for duplicates via a flag when reading keyed rocksdb state, to 
improve performance?

From the [mailing list 
discussion|https://www.mail-archive.com/user@flink.apache.org/msg43863.html], 
when the state processor API reads from state, it does multiple reads/writes 
to avoid duplicates: 

 
{code:java}
The trick we perform is to delete keys from rocksDB after each read, so we can 
do full table scans on all column families but never see any duplicates.{code}
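
As a rough illustration of the trick quoted above, here is a toy model that uses in-memory maps in place of RocksDB column families. The class and method names are hypothetical and this is not Flink's actual code; it only assumes that the same key can show up in more than one column family, which is why deleting after each read keeps later scans free of duplicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of "delete after read" deduplication (hypothetical, not Flink code). */
public class DeleteAfterReadSketch {

    /**
     * Fully scans every column family and returns each key exactly once.
     * Once a key has been emitted it is deleted from all column families,
     * so later scans can no longer see it.
     */
    public static List<String> scanDistinctKeys(List<Map<String, String>> columnFamilies) {
        List<String> emitted = new ArrayList<>();
        for (Map<String, String> current : columnFamilies) {
            // Copy the key set so the maps can be mutated while we iterate.
            for (String key : new ArrayList<>(current.keySet())) {
                emitted.add(key);
                // The extra write per read: remove the key everywhere.
                for (Map<String, String> cf : columnFamilies) {
                    cf.remove(key);
                }
            }
        }
        return emitted;
    }
}
```

Note that the scan is destructive and pays one delete per key read, which matches the kind of cost the profile below points at.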
 

In my application, which has ~200GB of keyed state, I have found it takes 
>4 hours to iterate over the entire state. A CPU profile shows that 70% of the 
time is spent in the rocksdb `remove()` call. 

If I comment out [this 
line|https://github.com/apache/flink/blob/26436ac27ae9e4705910b0502abb5bdd33ec686b/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/input/KeyedStateInputFormat.java#L229]
 `keysAndNamespaces.remove();`, I can read the entire state in <15 minutes, and 
my particular application (trying to detect outliers in the state) is robust to 
duplicates.

Thus, if we add a user-configurable flag to skip deduplication, it would give 
a performance boost to users who don't care about duplicates. 
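
A minimal sketch of what such a flag could look like, written against a plain `java.util.Iterator` rather than the real input format. The flag name `skipDeduplication` and the class `OptionalDedupReader` are hypothetical; only the `keysAndNamespaces.remove()` call is taken from the linked source line.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical reader showing an opt-out for the per-record remove(). */
public class OptionalDedupReader {

    /**
     * Drains the iterator into {@code out}. Unless {@code skipDeduplication}
     * is true, remove() is called after every read, as happens today.
     *
     * @return the number of remove() calls that were issued
     */
    public static <T> int readAll(
            Iterator<T> keysAndNamespaces, boolean skipDeduplication, List<T> out) {
        int removeCalls = 0;
        while (keysAndNamespaces.hasNext()) {
            out.add(keysAndNamespaces.next());
            if (!skipDeduplication) {
                // Current behaviour: delete after read so the key is never seen again.
                keysAndNamespaces.remove();
                removeCalls++;
            }
        }
        return removeCalls;
    }
}
```

Defaulting the flag to `false` would keep the existing behaviour, so only users who opt in would ever see duplicates.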

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
