Hi,

I am having issues with the global store taking a very long time to restore during startup of a Kafka Streams 2.0.1 application. The global store is backed by a RocksDB persistent store and is added to the Streams topology in the following manner: https://pastebin.com/raw/VJutDyYe
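Roughly, the registration follows the standard addGlobalStore pattern. The sketch below uses placeholder store/topic names, serdes, and a trivial update processor, so please treat it only as an outline; the actual code is in the pastebin above.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.processor.AbstractProcessor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.StoreBuilder;
    import org.apache.kafka.streams.state.Stores;

    public class GlobalStoreTopology {

        public static StreamsBuilder build() {
            // RocksDB-backed (persistent) key-value store; logging is disabled because
            // the source topic itself serves as the changelog for a global store.
            final StoreBuilder<KeyValueStore<String, String>> storeBuilder =
                Stores.keyValueStoreBuilder(
                        Stores.persistentKeyValueStore("global-store"),
                        Serdes.String(),
                        Serdes.String())
                    .withLoggingDisabled();

            final StreamsBuilder builder = new StreamsBuilder();
            builder.addGlobalStore(
                storeBuilder,
                "global-store-topic",                       // 18 partitions, ~15M records each
                Consumed.with(Serdes.String(), Serdes.String()),
                () -> new AbstractProcessor<String, String>() {
                    private KeyValueStore<String, String> store;

                    @SuppressWarnings("unchecked")
                    @Override
                    public void init(final ProcessorContext context) {
                        super.init(context);
                        store = (KeyValueStore<String, String>) context.getStateStore("global-store");
                    }

                    @Override
                    public void process(final String key, final String value) {
                        store.put(key, value);              // keep the local copy in sync with the topic
                    }
                });

            return builder;
        }
    }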
The global store topic has approximately 15 million records per partition and 18 partitions. The following global consumer settings are specified (see the P.S. at the end for roughly how they are applied):

poll.timeout.ms = 10
max.poll.records = 2000
max.partition.fetch.bytes = 1048576
fetch.max.bytes = 52428800
receive.buffer.bytes = 65536

I have tried tweaking the settings above on the consumer side, such as increasing poll.timeout.ms to 2000, max.poll.records to 10000, and max.partition.fetch.bytes to 52428800, but I keep hitting a ceiling of approximately 100,000 restored records per second. At 15 million records per partition, that is roughly 150 seconds to restore a single partition, and with 18 partitions it takes roughly 45 minutes to fully restore the global store. Switching the brokers' log directories from HDDs to SSDs made restoration roughly 25% faster overall, but this still feels slow. It appears I am hitting the IOPS limits of the disks while remaining well below their throughput limits, on both the broker and the Streams application side.

How can I minimize the restoration time of the global store? Are there settings that would increase throughput for the same number of IOPS? Ideally, restoration of each partition could be done in parallel, but I recognize there is only a single global store thread. We bring up a new instance of the Kafka Streams application on a potentially daily basis, so the restoration time is becoming more and more of a hassle.

Thanks,
Taylor
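P.S. In case it helps, here is roughly how the overrides above are applied. The application id and bootstrap servers are placeholders, and I am paraphrasing from memory (I believe the global.consumer. prefix from KIP-276 targets only the global consumer in 2.0), so treat this as a sketch rather than the exact code:

    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    public class GlobalConsumerOverrides {

        public static Properties props() {
            final Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder

            // Overrides aimed at the global consumer only, via the "global.consumer." prefix.
            props.put(StreamsConfig.globalConsumerPrefix(ConsumerConfig.MAX_POLL_RECORDS_CONFIG), 2000);
            props.put(StreamsConfig.globalConsumerPrefix(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG), 1048576);
            props.put(StreamsConfig.globalConsumerPrefix(ConsumerConfig.FETCH_MAX_BYTES_CONFIG), 52428800);
            props.put(StreamsConfig.globalConsumerPrefix(ConsumerConfig.RECEIVE_BUFFER_CONFIG), 65536);

            return props;
        }
    }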