[ https://issues.apache.org/jira/browse/KAFKA-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006460#comment-17006460 ]
Guozhang Wang commented on KAFKA-7041:
--------------------------------------

Today we do not trigger restoreCallback#onStart/Restore/End when updating standby tasks, only when restoring active tasks. Looking at this piece of the code, I feel it may be better to first look into larger (sorted) batching of records. Today we apply the restore logic to each set of records that a single consumer.poll() returns. If there are many partitions to fetch, each partition may return only a small number of records, but we still apply them to the store immediately, which may lead to smaller L0 files (we use the batching restorer for RocksDB, which calls writeBatch). A possible optimization is to wait and not apply the records after each poll.

cc [~cadonna]

> Using RocksDB bulk loading for StandbyTasks
> -------------------------------------------
>
>                 Key: KAFKA-7041
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7041
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Nikki Thean
>            Priority: Major
>
> In KAFKA-5363 we introduced RocksDB bulk loading to speed up store recovery.
> We could do the same optimization for StandbyTasks to make them more
> efficient and to reduce the likelihood that StandbyTasks lag behind.
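
To make the batching idea from the comment concrete, here is a minimal sketch in Python. It is not Kafka Streams code: the class name BufferingRestorer, the on_poll/flush methods, the min_batch_size threshold, and the (key, value) record shape are all hypothetical, chosen only to illustrate buffering records across several polls and applying them as one larger, key-sorted batch.

```python
from typing import Callable, List, Tuple

# Hypothetical record shape: (key, value) as raw bytes.
Record = Tuple[bytes, bytes]


class BufferingRestorer:
    """Buffers polled records and applies them in larger, key-sorted batches."""

    def __init__(self, apply_batch: Callable[[List[Record]], None], min_batch_size: int):
        # apply_batch stands in for whatever writes a batch to the store.
        self._apply_batch = apply_batch
        self._min_batch_size = min_batch_size
        self._buffer: List[Record] = []

    def on_poll(self, records: List[Record]) -> None:
        # Do not apply immediately; only flush once enough records have accumulated.
        self._buffer.extend(records)
        if len(self._buffer) >= self._min_batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        # sorted() is stable, so records with the same key keep their arrival
        # order and the latest value for a key is still applied last.
        batch = sorted(self._buffer, key=lambda record: record[0])
        self._buffer = []
        self._apply_batch(batch)


# Usage: three small polls end up in the store as one sorted batch.
batches: List[List[Record]] = []
restorer = BufferingRestorer(batches.append, min_batch_size=4)
restorer.on_poll([(b"k3", b"v1")])
restorer.on_poll([(b"k1", b"v2"), (b"k3", b"v3")])
restorer.on_poll([(b"k2", b"v4")])
restorer.flush()  # nothing left to flush; the threshold already triggered it
```

A real implementation would presumably also need a time- or size-based bound so that a slow changelog does not leave records buffered indefinitely; that is left out of the sketch.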