[ https://issues.apache.org/jira/browse/KAFKA-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001259#comment-17001259 ]
Sophie Blee-Goldman commented on KAFKA-9062:
--------------------------------------------

Just to clarify, the "simple workaround" wouldn't involve turning off batching, just autocompaction.

I get the sense the suggested bulk loading configuration is aimed at the specific case where you would like to dump a large amount of data into rocks and then start querying it. This differs slightly from what Streams actually needs to do, which is restore all that data and then start _writing_ to (and also reading from) it. It occurs to me that the bulk loading mode is not targeted at our specific use case, since queries would not be stalled by excessive L0 files/compaction, only writes.

That said, as mentioned above, I think we should solve this holistically: stalled writes can still happen for other reasons. I just want to point out that the bulk loading mode may not be what we think it is.

> Handle stalled writes to RocksDB
> --------------------------------
>
>                 Key: KAFKA-9062
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9062
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>
> RocksDB may stall writes at times when background compactions or flushes are having trouble keeping up. This means we can effectively end up blocking indefinitely during a StateStore#put call within Streams, and may get kicked from the group if the throttling does not ease up within the max poll interval.
> Example: when restoring large amounts of state from scratch, we use the strategy recommended by RocksDB of turning off automatic compactions and dumping everything into L0. We do batch somewhat, but we do not sort these small batches before loading them into the db, so we end up with a large number of unsorted L0 files.
> When restoration is complete and we toggle the db back to normal (non-bulk-loading) settings, a background compaction is triggered to merge all of these files into the next level. This background compaction can take a long time to merge unsorted keys, especially when the amount of data is quite large.
> Any new writes while the number of L0 files exceeds the max will be stalled until the compaction can finish, so processing after restoring from scratch can block beyond the polling interval.
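
For readers less familiar with the RocksDB side, here is a minimal sketch of roughly what the "bulk loading" configuration described in the issue amounts to in the RocksDB Java API. This is illustrative only, not the actual Streams restore code path; the class name and the trigger value are made up for the example.

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;

// Illustrative sketch only -- not the Streams code. It shows roughly what a
// "bulk loading" configuration looks like: background compactions are turned
// off so restored data can be dumped straight into L0, which is exactly the
// state that later triggers the big merge compaction once normal settings
// are restored.
public class BulkLoadOptionsSketch {
    public static void main(final String[] args) {
        RocksDB.loadLibrary();

        try (final Options options = new Options()) {
            // Stop RocksDB from compacting in the background while data is loaded.
            options.setDisableAutoCompactions(true);
            // Push the L0 compaction trigger high enough that nothing kicks in
            // during the load (value is arbitrary for illustration).
            options.setLevel0FileNumCompactionTrigger(1 << 30);

            // RocksDB also ships a convenience preset that applies similar tweaks.
            options.prepareForBulkLoad();
        }
    }
}
```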
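Separately, one possible user-side mitigation (not part of any resolution of this ticket) would be to raise the L0 stall thresholds through Streams' RocksDBConfigSetter, so that post-restore writes are less likely to hit the hard write stop before the merge compaction finishes. The class name and the threshold values below are hypothetical; this is a sketch of the idea, not a recommendation.

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Hypothetical mitigation: raise the L0 write-stall thresholds so that
// StateStore#put is less likely to block long enough to exceed
// max.poll.interval.ms while the post-restore compaction runs. This trades
// higher read amplification for fewer stalled writes; values are illustrative.
public class RelaxedStallConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName, final Options options,
                          final Map<String, Object> configs) {
        // RocksDB's stock defaults are roughly 20 (slowdown) and 36 (stop) L0 files.
        options.setLevel0SlowdownWritesTrigger(40);
        options.setLevel0StopWritesTrigger(64);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing allocated in setConfig, so nothing to release here.
    }
}
```

Such a class would be registered via the existing `rocksdb.config.setter` config (StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG).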