[ https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389297#comment-16389297 ]
ASF GitHub Bot commented on FLINK-8845:
---------------------------------------

GitHub user sihuazhou opened a pull request:

    https://github.com/apache/flink/pull/5650

    [FLINK-8845][state] Introduce RocksDBWriteBatchWrapper to improve performance for recovery in RocksDB backend

    ## What is the purpose of the change

    This PR addresses [FLINK-8845](https://issues.apache.org/jira/browse/FLINK-8845), which attempts to use `WriteBatch` to improve the performance for loading data into RocksDB. It's inspired by [RocksDB FAQ](https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ).

    ## Brief change log

    - *Introduce `RocksDBWriteBatchWrapper` to load data into RocksDB in bulk*

    ## Verifying this change

    - Introduce `RocksDBWriteBatchWrapperTest.java` to guard `RocksDBWriteBatchWrapper`.

    ## Does this pull request potentially affect one of the following parts:

    - Dependencies (does it add or upgrade a dependency): (no)
    - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
    - The serializers: (no)
    - The runtime per-record code paths (performance sensitive): (no)
    - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
    - The S3 file system connector: (no)

    ## Documentation

    none

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sihuazhou/flink rocksdb_write_batch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5650

----
commit e710287495d2a1a12a99b812c9691e12c6c57459
Author: sihuazhou <summerleafs@...>
Date:   2018-03-07T05:58:45Z

    Introduce RocksDBWriteBatchWrapper to speed up write performance.
----

> Use WriteBatch to improve performance for recovery in RocksDB backend
> ---------------------------------------------------------------------
>
>                 Key: FLINK-8845
>                 URL: https://issues.apache.org/jira/browse/FLINK-8845
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
>
> Based on {{WriteBatch}}, we could get a 30% ~ 50% performance lift when loading
> data into RocksDB.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
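
For readers unfamiliar with the bulk-loading idea behind the change, here is a minimal sketch of a write-batching wrapper. It is not the code from the pull request and it does not use the RocksDB API: `BatchingWriter`, `BulkSink`, `CountingSink`, and the capacity of 500 are all hypothetical names and values chosen for illustration. The assumption is that, in a real RocksDB-backed version, the flush step would apply the buffered entries as one `WriteBatch` via `RocksDB.write(WriteOptions, WriteBatch)` instead of issuing one `put` per record.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch of a write-batching wrapper; every name in this file is hypothetical. */
final class BatchingWriterSketch {

    /** One buffered key/value write. */
    static final class Entry {
        final byte[] key;
        final byte[] value;

        Entry(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Stand-in for the store: one call models one bulk write. */
    interface BulkSink {
        void writeAll(List<Entry> batch);
    }

    /** Buffers puts and hands them to the sink once `capacity` entries are pending. */
    static final class BatchingWriter implements AutoCloseable {
        private final BulkSink sink;
        private final int capacity;
        private final List<Entry> pending;

        BatchingWriter(BulkSink sink, int capacity) {
            if (capacity < 1) {
                throw new IllegalArgumentException("capacity must be >= 1");
            }
            this.sink = sink;
            this.capacity = capacity;
            this.pending = new ArrayList<>(capacity);
        }

        /** Buffers one write; flushes automatically when the buffer is full. */
        void put(byte[] key, byte[] value) {
            pending.add(new Entry(key, value));
            if (pending.size() >= capacity) {
                flush();
            }
        }

        /** Sends all pending entries to the sink in a single call; no-op if empty. */
        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            // Pass a copy so the sink never sees the buffer being cleared.
            sink.writeAll(new ArrayList<>(pending));
            pending.clear();
        }

        /** Flushes whatever is left so no buffered write is lost. */
        @Override
        public void close() {
            flush();
        }
    }

    /** Test sink that only counts bulk writes and entries. */
    static final class CountingSink implements BulkSink {
        int bulkWrites = 0;
        int entries = 0;

        @Override
        public void writeAll(List<Entry> batch) {
            bulkWrites++;
            entries += batch.size();
        }
    }

    public static void main(String[] args) {
        CountingSink sink = new CountingSink();
        try (BatchingWriter writer = new BatchingWriter(sink, 500)) {
            for (int i = 0; i < 1200; i++) {
                byte[] key = ("key-" + i).getBytes(StandardCharsets.UTF_8);
                byte[] value = ("value-" + i).getBytes(StandardCharsets.UTF_8);
                writer.put(key, value);
            }
        }
        // prints "1200 entries in 3 bulk writes"
        System.out.println(sink.entries + " entries in " + sink.bulkWrites + " bulk writes");
    }
}
```

Closing the writer flushes the remainder, which is why the last 200 entries in the example still reach the sink. The sketch says nothing about the 30% ~ 50% figure quoted above; that number comes from the issue description and depends on the real storage engine.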