[ https://issues.apache.org/jira/browse/KAFKA-12475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305191#comment-17305191 ]
A. Sophie Blee-Goldman commented on KAFKA-12475:
------------------------------------------------

An alternative is to just start buffering updates in-memory (or in rocksdb; this could be configurable) and avoid dirtying the remote storage in the first place, since we would only flush the data out to it during a commit. I'm guessing at least some users may have disabled the changelog and rely on the fault tolerance of the remote storage instead, in which case the fix proposed here would require them to re-enable logging. This is more of an optimization to save on changelog storage & network costs, but it would also provide a solution for the base problem of EOS with remote stores (a rough sketch of the idea follows below).

> Kafka Streams breaks EOS with remote state stores
> -------------------------------------------------
>
>                 Key: KAFKA-12475
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12475
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>              Labels: needs-kip
>
> Currently in Kafka Streams, exactly-once semantics (EOS) require that the state stores be completely erased and restored from scratch from the changelog in case of an error. This erasure is implemented by closing the state store and then simply wiping out the local state directory. This works fine for the two store implementations provided OOTB, in-memory and rocksdb, but it fails when the application includes a custom StateStore backed by remote storage, such as Redis. In this case Streams will fail to erase any of the data before reinserting data from the changelog, resulting in possible duplicates and breaking the guarantee of EOS.
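Very roughly, the buffering idea sketched as plain Java (this is not the actual Streams implementation; the RemoteClient interface and WriteBufferedRemoteStore class are hypothetical stand-ins for whatever remote-store client a custom StateStore wraps, and the flush() here corresponds to the StateStore#flush() that Streams invokes as part of a commit):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the remote store's client (e.g. a thin Redis wrapper).
interface RemoteClient {
    void put(String key, byte[] value);
    byte[] get(String key);
}

// Writes are held in a local in-memory buffer and only pushed to the remote
// store when the task commits, so an aborted transaction never dirties the
// remote storage and there is nothing to wipe out after an EOS failure.
class WriteBufferedRemoteStore {
    private final RemoteClient remote;
    private final Map<String, byte[]> uncommitted = new HashMap<>();

    WriteBufferedRemoteStore(RemoteClient remote) {
        this.remote = remote;
    }

    void put(String key, byte[] value) {
        // Buffer locally; nothing reaches the remote store yet.
        uncommitted.put(key, value);
    }

    byte[] get(String key) {
        // Read-your-own-writes: check the uncommitted buffer before the remote store.
        byte[] buffered = uncommitted.get(key);
        return buffered != null ? buffered : remote.get(key);
    }

    // On commit (StateStore#flush() in Streams terms): push the buffered writes, then clear.
    void flush() {
        uncommitted.forEach(remote::put);
        uncommitted.clear();
    }

    // On abort: simply drop the buffered writes.
    void discard() {
        uncommitted.clear();
    }
}
{code}

The same buffer could live in a temporary rocksdb instead of a HashMap if memory is a concern, which is the configurable part mentioned above; deletes/tombstones would also need to be tracked in the buffer, which this sketch omits.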