[ https://issues.apache.org/jira/browse/KAFKA-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339399#comment-17339399 ]
Guozhang Wang commented on KAFKA-12748:
---------------------------------------
Thanks for making a pass on the new options! This is a good list.
> Explore new RocksDB options to consider enabling by default
> -----------------------------------------------------------
>
> Key: KAFKA-12748
> URL: https://issues.apache.org/jira/browse/KAFKA-12748
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: A. Sophie Blee-Goldman
> Priority: Major
>
> With the RocksDB version bump comes a lot of new options, some of which look
> interesting enough to explore for usage in Streams. We should try setting
> these as default options and run the benchmarks to look for any performance
> benefit (or regression). See the javadocs for all Options
> [here|https://javadoc.io/doc/org.rocksdb/rocksdbjni/latest/org/rocksdb/Options.html]
> Options.setAvoidUnnecessaryBlockingIO:
> - As the name suggests, avoids blocking/long-latency tasks by scheduling a
> background job to do them
> Options.setSkipCheckingSstFileSizesOnDbOpen:
> - Speeds up startup time if there are many sst files, which could mean less
> overhead from things like rebalancing, where tasks are migrated between
> clients or threads. Not sure how many sst files count as "many"; this may be
> less useful now that we've disabled bulk loading
> Options.setBestEffortsRecovery:
> - Interesting feature to allow recovering missing files without the use
> of the WAL. Could be useful if the on-disk state is corrupted (e.g. a user
> deletes a file) without needing to rebuild state from scratch. Though I'd
> want to dig in further to understand what exactly it does and does not do.
> Not a performance improvement but we should run the benchmarks to make sure
> it doesn't make the performance worse.
> Options.setWriteDbidToManifest:
> - Should be set to true if/when we ever need to rely on the DB id, e.g. for
> backups. Also not a performance improvement but we should still benchmark
> this.
> Options.optimizeForSmallDb:
> - This one is definitely not something we should set by default, as
> "small" here means under 1GB. But it's probably worth at least calling out in
> the docs for those users who know their data set size (per store) is under a
> GB
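
For anyone who wants to try the options listed above before any of them become
defaults, one possible approach is a custom RocksDBConfigSetter. The following
is only a sketch: the class name ExperimentalRocksDBConfig is hypothetical, the
choice of which options to turn on is an assumption made for illustration, and
it presumes a Streams release that bundles a rocksdbjni version exposing these
setters. None of this reflects a decision made on this ticket.

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Hypothetical class name, used only for illustration; not part of Kafka Streams.
public class ExperimentalRocksDBConfig implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Move blocking/long-latency work to a background job.
        options.setAvoidUnnecessaryBlockingIO(true);

        // Skip the sst file size check on open to speed up startup.
        options.setSkipCheckingSstFileSizesOnDbOpen(true);

        // Left disabled here (an assumption of this sketch): the description
        // says these need further investigation or are not suitable defaults.
        // options.setBestEffortsRecovery(true);
        // options.setWriteDbidToManifest(true);
        // options.optimizeForSmallDb(); // only for stores known to be under 1GB
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing to release: this sketch creates no RocksDB objects of its own.
    }
}
```

The class would then be registered through the rocksdb.config.setter config
(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG) before running the
benchmarks, so that results with and without each option can be compared.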