[jira] [Updated] (FLINK-37109) Improve state processor API performance when reading keyed rocksdb state by allowing duplicates

Gary Lam (Jira) Sun, 09 Feb 2025 12:20:12 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-37109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Gary Lam updated FLINK-37109:
-----------------------------
    Attachment:     (was: Flame graph prior to change remove using 74pct of 
cpu.png)

> Improve state processor API performance when reading keyed rocksdb state by 
> allowing duplicates
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-37109
>                 URL: https://issues.apache.org/jira/browse/FLINK-37109
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / State Processor
>            Reporter: Gary Lam
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: Flame graph prior to change remove using 74pct of 
> cpu.png, Flame graph prior to change using 74pct cpu.png
>
>
> Could we add a flag that allows duplicates when reading keyed RocksDB state, 
> to improve performance?
> According to the [mailing list 
> discussion|https://www.mail-archive.com/user@flink.apache.org/msg43863.html], 
> when the State Processor API reads keyed state, it performs extra reads and 
> writes to avoid emitting duplicates: 
>  
> {quote}
> The trick we perform is to delete keys from rocksDB after each read, so we 
> can do full table scans on all column families but never see any 
> duplicates.{quote}
>  
> In my application, which has ~200GB of keyed state, iterating over the entire 
> state takes >4 hours. A CPU profile shows that 70% of the time is spent in 
> the RocksDB `remove()` call. 
> If I comment out [this 
> line|https://github.com/apache/flink/blob/26436ac27ae9e4705910b0502abb5bdd33ec686b/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/input/KeyedStateInputFormat.java#L229]
>  (`keysAndNamespaces.remove();`), I can read the entire state in <15 minutes, 
> and my particular application (detecting outliers in the state) is 
> robust to duplicates.
> Making this a user-configurable flag that skips deduplication would 
> therefore give a performance boost to users who do not need deduplicated 
> reads. 
>  
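To make the trade-off described above concrete, here is a minimal toy model in plain Java. It is not Flink's or RocksDB's code: the in-memory `TreeSet`s stand in for column families, and the class name `KeyScanSketch`, the method names, and the `allowDuplicates` parameter are all hypothetical, chosen only for illustration. It assumes that a key may appear in more than one column family, which is why a full scan without deletes can emit it more than once.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model only: sorted in-memory sets stand in for RocksDB column families.
final class KeyScanSketch {

    // Sample data: key "b" has a value in both (hypothetical) states.
    static Map<String, TreeSet<String>> sampleColumnFamilies() {
        Map<String, TreeSet<String>> cfs = new LinkedHashMap<>();
        cfs.put("count-state", new TreeSet<>(List.of("a", "b")));
        cfs.put("sum-state", new TreeSet<>(List.of("b", "c")));
        return cfs;
    }

    // Full scan over every column family, emitting each key that is found.
    // With allowDuplicates == false, every emitted key is deleted from all
    // column families, so a later scan cannot see it again (one extra write
    // per column family per key). With allowDuplicates == true, the deletes
    // are skipped and a key shared by several column families is emitted
    // once per column family.
    static List<String> scanKeys(
            Map<String, TreeSet<String>> columnFamilies, boolean allowDuplicates) {
        List<String> emitted = new ArrayList<>();
        for (TreeSet<String> cf : columnFamilies.values()) {
            // Iterate over a snapshot so that deleting during the scan is safe.
            for (String key : new ArrayList<>(cf)) {
                emitted.add(key);
                if (!allowDuplicates) {
                    for (TreeSet<String> other : columnFamilies.values()) {
                        other.remove(key);
                    }
                }
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(scanKeys(sampleColumnFamilies(), false)); // [a, b, c]
        System.out.println(scanKeys(sampleColumnFamilies(), true));  // [a, b, b, c]
    }
}
```

In this sketch the deduplicating path pays for a delete on every key it reads, which mirrors the cost attributed to `remove()` in the report; the duplicate-tolerant path does only reads, at the price of emitting `b` twice.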



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
