Guozhang Wang created KAFKA-13239:
-------------------------------------

             Summary: Use RocksDB.ingestExternalFile for restoration
                 Key: KAFKA-13239
                 URL: https://issues.apache.org/jira/browse/KAFKA-13239
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Guozhang Wang

Now that we are on a newer version of RocksDB, we can consider using the new

{code}
ingestExternalFile(final ColumnFamilyHandle columnFamilyHandle, final List<String> filePathList, final IngestExternalFileOptions ingestExternalFileOptions)
{code}

for restoring changelogs into state stores. More specifically:

1) Use a larger default batch size for restore consumer polling, so that each poll returns as many records as possible.

2) For each batch of records returned from a restore consumer poll call, first write them as a single SST file using the {{SstFileWriter}}. The existing {{DBOptions}} could be used to construct the {{EnvOptions}} and {{Options}} for the writer. Do not ingest the written file into the db within each iteration.

3) At the end of the restoration, call {{RocksDB.ingestExternalFile}} with the paths of all the written files as the parameter. The {{IngestExternalFileOptions}} would be specifically configured to allow key range overlapping with the mem-table.

4) One specific note: after the call in 3), RocksDB may execute heavy compaction in the background. If normal processing starts immediately, before the compaction cools down, its attempts to {{put}} new records into the store may see high write stalls. To work around this we could consider calling {{RocksDB.compactRange()}}, which blocks until the compaction is completed.
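
One detail that step 2) leaves implicit: as far as I understand, {{SstFileWriter}} expects keys to be added in ascending comparator order, while a restore batch arrives in offset order and may contain the same key more than once. Below is a minimal, JDK-only sketch of the pre-processing this would imply before handing records to the writer. The names {{RestoreBatchSorter}} and {{ChangelogRecord}} are hypothetical and exist only for illustration, and the sketch assumes the store uses RocksDB's default bytewise comparator.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Kafka Streams or RocksDB.
final class RestoreBatchSorter {

    // One changelog record; a null value stands for a tombstone (delete).
    static final class ChangelogRecord {
        final byte[] key;
        final byte[] value;

        ChangelogRecord(final byte[] key, final byte[] value) {
            this.key = key;
            this.value = value;
        }
    }

    // Unsigned lexicographic order, shorter prefix first (requires Java 9+).
    // Assumption: this matches the ordering of RocksDB's default bytewise comparator.
    static final Comparator<byte[]> BYTEWISE = Arrays::compareUnsigned;

    // Collapses a batch given in offset order to the latest record per key and
    // returns the survivors sorted by key, ready to be written out in order.
    static List<ChangelogRecord> sortAndDedupe(final List<ChangelogRecord> batchInOffsetOrder) {
        final TreeMap<byte[], byte[]> latest = new TreeMap<>(BYTEWISE);
        for (final ChangelogRecord record : batchInOffsetOrder) {
            // Later offsets overwrite earlier ones; null values (tombstones) are kept.
            latest.put(record.key, record.value);
        }
        final List<ChangelogRecord> sorted = new ArrayList<>(latest.size());
        for (final Map.Entry<byte[], byte[]> entry : latest.entrySet()) {
            sorted.add(new ChangelogRecord(entry.getKey(), entry.getValue()));
        }
        return sorted;
    }

    private static byte[] utf8(final String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(final String[] args) {
        final List<ChangelogRecord> batch = new ArrayList<>();
        batch.add(new ChangelogRecord(utf8("b"), utf8("v1")));
        batch.add(new ChangelogRecord(utf8("a"), utf8("v2")));
        batch.add(new ChangelogRecord(utf8("b"), utf8("v3"))); // supersedes the first "b"
        batch.add(new ChangelogRecord(utf8("c"), null));       // tombstone

        for (final ChangelogRecord record : sortAndDedupe(batch)) {
            final String value = record.value == null
                ? "<tombstone>"
                : new String(record.value, StandardCharsets.UTF_8);
            System.out.println(new String(record.key, StandardCharsets.UTF_8) + " -> " + value);
        }
    }
}
```

With the records in this order, each one could then be passed to the writer as a put, or as a delete for a tombstone. Whether tombstones should be written at all during a from-scratch restore, and how overlapping keys across separately written files are resolved at ingestion time, are open questions this sketch does not try to answer.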