Hi Alex,

You are right! There is no "exactly once magic" backed into the RocksDB store code. The point is local vs remote. When a Kafka Streams client closes dirty under EOS, the state (i.e., the content of the state store) needs to be wiped out and to be re-created from scratch from the changelog topic on the brokers. To wipe out the state the state directory is deleted.

For a remote state store, the wiping out of the state directory would not delete the contents of the remote state store before Kafka Streams re-creates the content from scratch.

Wiping out the state directory, happens outside of the API implemented by a state store.

Best,
Bruno

On 15.03.21 13:59, Alex Craig wrote:
" Another issue with 3rd party state stores could be violation of
exactly-once guarantee provided by kafka streams in the event of a failure
of streams application instance"

I've heard this before but would love to know more about how a custom state
store would be at any greater risk than RocksDB as far as exactly-once
guarantees are concerned.  They all implement the same interface, so as
long as you're correctly implementing get(), put(), delete(), flush(), etc,
you should be fine right?  In other words, I don't think there is any
special "exactly once magic" that is baked into the RocksDB store code.  I
could be wrong though so I'd love to hear people's thoughts, thanks,

Alex C

On Sun, Mar 14, 2021 at 4:58 PM Parthasarathy, Mohan <mpart...@hpe.com>
wrote:

Thanks for the responses. In the worst case, I might have to keep both
rocksdb for local store and keep an external store like Redis.

-mohan


On 3/13/21, 8:53 PM, "Pushkar Deole" <pdeole2...@gmail.com> wrote:

     Another issue with 3rd party state stores could be violation of
     exactly-once guarantee provided by kafka streams in the event of a
failure
     of streams application instance.
     Kafka provides exactly once guarantee for consumer -> process ->
produce
     through transactions and with the use of state store like redis, this
     guarantee is weaker

     On Sat, Mar 13, 2021 at 5:28 AM Guozhang Wang <wangg...@gmail.com>
wrote:

     > Hello Mohan,
     >
     > I think what you had in mind works with Redis, since it is a remote
state
     > store engine, it does not have the co-partitioning requirements as
local
     > state stores.
     >
     > One thing you'd need to tune KS though is that with remote stores,
the
     > processing latency may be larger, and since Kafka Streams process all
     > records of a single partition in order, synchronously, you may need
to tune
     > the poll interval configs etc to make sure KS would stay in the
consumer
     > group and not trigger unnecessary rebalances.
     >
     > Guozhang
     >
     > On Thu, Mar 11, 2021 at 6:41 PM Parthasarathy, Mohan <
mpart...@hpe.com>
     > wrote:
     >
     > > Hi,
     > >
     > > I have a use case where messages come in with some key gets
assigned some
     > > partition and the state gets created. Later, key changes (but still
     > > contains the old key in the message) and gets sent to a different
     > > partition. I want to be able to grab the old state using the old
key
     > before
     > > creating the new state on this instance. Redis as a  state store
makes it
     > > easy to implement this where I can simply do a lookup before
creating the
     > > state. I see an implementation here :
     > >
     >
https://github.com/andreas-schroeder/redisks/tree/master/src/main/java/com/github/andreas_schroeder/redisks
     > >
     > > Has anyone tried this ? Any caveats.
     > >
     > > Thanks
     > > Mohan
     > >
     > >
     >
     > --
     > -- Guozhang
     >




Reply via email to