[ https://issues.apache.org/jira/browse/KAFKA-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999467#comment-16999467 ]

Sophie Blee-Goldman commented on KAFKA-9062:
--------------------------------------------

[~jpzk] I agree this is a bug in Streams (that's what this ticket is for :) ), but just to clarify: the PUT is stalled by the manual compaction issued at the end of restoration, which takes too long to compact the excessive L0 files, not by an auto-compaction triggered by the PUT itself. The reason disabling auto-compaction prevents this is that RocksDB simply does not stall writes due to excessive L0 files when auto-compaction is off (which makes sense, as otherwise you could get stuck forever).

As a very simple workaround, we could consider making the "bulk loading" mode optional/configurable (possibly through an augmented RocksDBConfigSetter); see the sketches at the end of this message. Users hitting this issue could then simply keep auto-compaction enabled during restoration to keep the L0 file count under control so that new writes won't stall. This would in theory slow down restoration, but I suspect we are not gaining as much from the bulk loading mode as we think, since the keys are unsorted.

> Handle stalled writes to RocksDB
> --------------------------------
>
>                 Key: KAFKA-9062
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9062
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Sophie Blee-Goldman
>            Priority: Major
>
> RocksDB may stall writes at times when background compactions or flushes are having trouble keeping up. This means we can effectively end up blocking indefinitely during a StateStore#put call within Streams, and may get kicked from the group if the throttling does not ease up within the max poll interval.
>
> Example: when restoring large amounts of state from scratch, we use the strategy recommended by RocksDB of turning off automatic compactions and dumping everything into L0. We do batch somewhat, but we do not sort these small batches before loading them into the db, so we end up with a large number of unsorted L0 files.
>
> When restoration is complete and we toggle the db back to normal (non-bulk-loading) settings, a background compaction is triggered to merge all of these files into the next level. This background compaction can take a long time to merge unsorted keys, especially when the amount of data is quite large.
>
> Any new writes issued while the number of L0 files exceeds the configured maximum will be stalled until that compaction finishes, so processing after restoring from scratch can block beyond the max poll interval.
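For reference, a minimal standalone sketch of the bulk-loading pattern described above, written against the plain RocksDB Java API rather than the actual RocksDBStore code (the path, key count, and buffer size are made up for illustration):

{code:java}
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class BulkLoadSketch {
    public static void main(final String[] args) throws RocksDBException {
        RocksDB.loadLibrary();

        try (final Options options = new Options()
                 .setCreateIfMissing(true)
                 // small memtable so the sketch actually produces many L0 files
                 .setWriteBufferSize(1L << 20)
                 // bulk-loading mode: no automatic compactions, so every
                 // flushed memtable just piles up as another L0 file
                 .setDisableAutoCompactions(true);
             final RocksDB db = RocksDB.open(options, "/tmp/bulk-load-sketch")) {

            // "restoration": unsorted keys land in L0; with auto-compaction
            // off, RocksDB does not stall writes however many L0 files pile up
            for (int i = 0; i < 1_000_000; i++) {
                final byte[] key = ("key-" + (i * 31 % 1_000_000)).getBytes();
                db.put(key, ("value-" + i).getBytes());
            }

            // end of restoration: the manual compaction merges the L0 backlog
            // down a level. With many unsorted L0 files this can run for a
            // long time, and once normal options are restored, any put()
            // issued while the L0 file count still exceeds
            // level0_stop_writes_trigger blocks until the compaction brings
            // the count back down -- the stall this ticket is about.
            db.compactRange();
        }
    }
}
{code}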
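And a sketch of the user-side mitigation, using only the existing RocksDBConfigSetter hook; the class name and trigger values are hypothetical, and the restore-aware "augmented" setter mentioned above does not exist yet:

{code:java}
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Hypothetical example class, not part of Streams.
public class BulkLoadFriendlyConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Raise the L0 thresholds at which RocksDB first slows and then stops
        // writes (illustrative values; the defaults are much lower), so an L0
        // backlog left over from restoration is less likely to stall the
        // first post-restore put() calls.
        options.setLevel0SlowdownWritesTrigger(40);
        options.setLevel0StopWritesTrigger(60);
    }
}
{code}

This would be registered via StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG as usual. Note it only softens the symptom: Streams overrides the L0 triggers while bulk loading, so these values matter once the store is toggled back to its regular options, which is exactly when the stall bites. Keeping auto-compaction enabled during restoration itself is the part that would need the augmented, restore-aware hook.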
Sophie Blee-Goldman commented on KAFKA-9062: -------------------------------------------- [~jpzk] I agree this is a bug in Streams (that's what this ticket is for :) ) but just to clarify, the PUT is stalled due to the manual compaction issued at the end of restoration which is taking too long to compact the excessive L0 files, not due to an autocompaction triggered by the PUT itself. The reason disabling autocompaction prevents this from happening is that rocksdb just doesn't stall writes due to excessive L0 files with autocompaction off (which makes sense as otherwise you could get stuck forever) As a very simple workaround, we could consider making the "bulk loading" mode optional/configurable (possibly through an augmented RocksDBConfigSetter). Users hitting this issue could simply keep autocompaction enabled during restoration to hopefully keep the L0 file count under control so new writes won't stall. This would in theory slow down the restoration, but I suspect we may not be gaining as much from the bulk loading mode as we think since the keys are unsorted > Handle stalled writes to RocksDB > -------------------------------- > > Key: KAFKA-9062 > URL: https://issues.apache.org/jira/browse/KAFKA-9062 > Project: Kafka > Issue Type: Bug > Components: streams > Reporter: Sophie Blee-Goldman > Priority: Major > > RocksDB may stall writes at times when background compactions or flushes are > having trouble keeping up. This means we can effectively end up blocking > indefinitely during a StateStore#put call within Streams, and may get kicked > from the group if the throttling does not ease up within the max poll > interval. > Example: when restoring large amounts of state from scratch, we use the > strategy recommended by RocksDB of turning off automatic compactions and > dumping everything into L0. We do batch somewhat, but do not sort these small > batches before loading into the db, so we end up with a large number of > unsorted L0 files. > When restoration is complete and we toggle the db back to normal (not bulk > loading) settings, a background compaction is triggered to merge all these > into the next level. This background compaction can take a long time to merge > unsorted keys, especially when the amount of data is quite large. > Any new writes while the number of L0 files exceeds the max will be stalled > until the compaction can finish, and processing after restoring from scratch > can block beyond the polling interval -- This message was sent by Atlassian Jira (v8.3.4#803005)