[jira] [Commented] (KAFKA-12748) Explore new RocksDB options to consider enabling by default

Guozhang Wang (Jira) Tue, 04 May 2021 21:43:04 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339399#comment-17339399
 ]


Guozhang Wang commented on KAFKA-12748:
---------------------------------------

Thanks for making a pass on the new options! This is a good list.

> Explore new RocksDB options to consider enabling by default
> -----------------------------------------------------------
>
>                 Key: KAFKA-12748
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12748
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> With the rocksdb version bump comes a lot of new options, some of which look 
> interesting enough to explore for usage in Streams. We should try setting 
> these as default options and run the benchmarks to look for any performance 
> benefit (or decrease). See javadocs for all Options 
> [here|https://javadoc.io/doc/org.rocksdb/rocksdbjni/latest/org/rocksdb/Options.html]
> Options.setAvoidUnnecessaryBlockingIO: 
>     - As the name suggests, avoids blocking/long-latency tasks by scheduling 
> a background job to do them
> Options.setSkipCheckingSstFileSizesOnDbOpen:
>     - Speeds up startup time if there are many sst files, which could mean 
> less overhead from things like rebalancing, where tasks are migrated between 
> clients or threads. Not sure how many sst files count as "many"; may be less 
> useful now that we've disabled bulk loading
> Options.setBestEffortsRecovery:
>     - Interesting feature to allow recovering missing files without the use 
> of the WAL. Could be useful if the on-disk state is corrupted (e.g. a user 
> deletes a file), since it might avoid rebuilding state from scratch. Though 
> I'd want to dig in further to understand what exactly it does and does not 
> do. Not a performance improvement, but we should run the benchmarks to make 
> sure it doesn't make the performance worse.
> Options.setWriteDbidToManifest:
>     - Should be set to true if/when we ever need to rely on the DB id, e.g. 
> for backups. Also not a performance improvement, but we should still 
> benchmark this.
> Options.optimizeForSmallDb:
>     - This one is definitely not something we should set by default, as 
> "small" here means under 1GB. But it's probably worth at least calling out in 
> the docs for those users who know their data set size (per store) is under a 
> GB
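
For anyone who wants to try the options listed above before any default changes, a minimal sketch of a custom config setter follows. It assumes a Streams build whose bundled rocksdbjni already exposes these setters. The class name ExploratoryRocksDBConfig is hypothetical, and the values shown are what one might benchmark, not recommended defaults.

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Hypothetical class name; illustrates the options discussed in this ticket.
public class ExploratoryRocksDBConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Move potentially long-latency work to a background job.
        options.setAvoidUnnecessaryBlockingIO(true);

        // Skip the sst file size check on open to shorten startup.
        options.setSkipCheckingSstFileSizesOnDbOpen(true);

        // Attempt recovery of missing files without relying on the WAL.
        options.setBestEffortsRecovery(true);

        // Record the DB id in the manifest, e.g. for backups.
        options.setWriteDbidToManifest(true);

        // Only for stores known to stay under 1GB; left off on purpose.
        // options.optimizeForSmallDb();
    }

    @Override
    public void close(final String storeName, final Options options) {
        // No RocksObjects were allocated in setConfig, so nothing to release.
    }
}
```

The class would be registered through the existing rocksdb.config.setter config (StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG), which lets each option be toggled individually while running the benchmarks.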



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
