[ https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390632#comment-16390632 ]
ASF GitHub Bot commented on FLINK-8845:
---------------------------------------

Github user sihuazhou commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5650#discussion_r173049537

    --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBWriteBatchWrapper.java ---
    @@ -0,0 +1,86 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.flink.contrib.streaming.state;
    +
    +import org.apache.flink.util.Preconditions;
    +import org.rocksdb.ColumnFamilyHandle;
    +import org.rocksdb.RocksDB;
    +import org.rocksdb.RocksDBException;
    +import org.rocksdb.WriteBatch;
    +import org.rocksdb.WriteOptions;
    +
    +import javax.annotation.Nonnull;
    +
    +/**
    + * A wrapper class to wrap WriteBatch.
    + */
    +public class RocksDBWriteBatchWrapper implements AutoCloseable {
    +
    +	private final static int MIN_CAPACITY = 100;
    +	private final static int MAX_CAPACITY = 10000;
    +
    +	private final RocksDB db;
    +
    +	private final WriteBatch batch;
    +
    +	private final WriteOptions options;
    +
    +	private final int capacity;
    +
    +	private int currentSize;
    +
    +	public RocksDBWriteBatchWrapper(@Nonnull RocksDB rocksDB,
    +									@Nonnull WriteOptions options,
    +									int capacity) {
    +
    +		Preconditions.checkArgument(capacity >= MIN_CAPACITY && capacity <= MAX_CAPACITY,
    +			"capacity should at least greater than 100");
    --- End diff --

    About the capacity range: I didn't find a specific value recommended by RocksDB, but the [FAQ](https://github.com/facebook/rocksdb/wiki/RocksDB-FAQ) says:
    ```
    Q: What's the fastest way to load data into RocksDB?
    ...
    2. batch hundreds of keys into one write batch
    ...
    ```
    I found that they use the word `hundreds`.


> Use WriteBatch to improve performance for recovery in RocksDB backend
> ---------------------------------------------------------------------
>
>                 Key: FLINK-8845
>                 URL: https://issues.apache.org/jira/browse/FLINK-8845
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Sihua Zhou
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0
>
>
> Based on {{WriteBatch}} we could get a 30% ~ 50% performance lift when loading
> data into RocksDB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
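
For illustration only, and not code from the pull request: the capacity-bounded batching idea discussed in the comment could be sketched roughly as below. `BoundedBatchWriter` and its `sink` callback are hypothetical names that stand in for a RocksDB write batch and the write call, so no RocksDB classes are involved. Only the 100 and 10000 bounds come from the quoted diff; the flush-on-full behaviour is an assumption about how such a wrapper would typically work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical stand-in for a capacity-bounded write batch. Keys are
 * buffered in memory and handed to a sink once `capacity` is reached.
 */
final class BoundedBatchWriter implements AutoCloseable {

    // Bounds copied from the diff under review.
    static final int MIN_CAPACITY = 100;
    static final int MAX_CAPACITY = 10000;

    private final int capacity;
    // Assumption: the sink plays the role of "write this batch to the store".
    private final Consumer<List<String>> sink;
    private List<String> pending = new ArrayList<>();

    BoundedBatchWriter(int capacity, Consumer<List<String>> sink) {
        if (capacity < MIN_CAPACITY || capacity > MAX_CAPACITY) {
            throw new IllegalArgumentException(
                "capacity must be between " + MIN_CAPACITY + " and " + MAX_CAPACITY);
        }
        this.capacity = capacity;
        this.sink = sink;
    }

    /** Buffers one key and flushes automatically when the batch is full. */
    void put(String key) {
        pending.add(key);
        if (pending.size() >= capacity) {
            flush();
        }
    }

    /** Hands the buffered keys to the sink; does nothing if the buffer is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> full = pending;
        pending = new ArrayList<>();
        sink.accept(full);
    }

    /** Flushes whatever is left so that no buffered key is lost. */
    @Override
    public void close() {
        flush();
    }
}
```

With a capacity of 100, writing 250 keys and then closing the writer would hand the sink three batches of 100, 100 and 50 keys, which is in the "hundreds of keys" range the FAQ mentions.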