Hi Alex,
You are right! There is no "exactly once magic" backed into the RocksDB
store code. The point is local vs remote. When a Kafka Streams client
closes dirty under EOS, the state (i.e., the content of the state store)
needs to be wiped out and to be re-created from scratch from the
changelog topic on the brokers. To wipe out the state the state
directory is deleted.
For a remote state store, the wiping out of the state directory would
not delete the contents of the remote state store before Kafka Streams
re-creates the content from scratch.
Wiping out the state directory, happens outside of the API implemented
by a state store.
Best,
Bruno
On 15.03.21 13:59, Alex Craig wrote:
" Another issue with 3rd party state stores could be violation of
exactly-once guarantee provided by kafka streams in the event of a failure
of streams application instance"
I've heard this before but would love to know more about how a custom state
store would be at any greater risk than RocksDB as far as exactly-once
guarantees are concerned. They all implement the same interface, so as
long as you're correctly implementing get(), put(), delete(), flush(), etc,
you should be fine right? In other words, I don't think there is any
special "exactly once magic" that is baked into the RocksDB store code. I
could be wrong though so I'd love to hear people's thoughts, thanks,
Alex C
On Sun, Mar 14, 2021 at 4:58 PM Parthasarathy, Mohan <mpart...@hpe.com>
wrote:
Thanks for the responses. In the worst case, I might have to keep both
rocksdb for local store and keep an external store like Redis.
-mohan
On 3/13/21, 8:53 PM, "Pushkar Deole" <pdeole2...@gmail.com> wrote:
Another issue with 3rd party state stores could be violation of
exactly-once guarantee provided by kafka streams in the event of a
failure
of streams application instance.
Kafka provides exactly once guarantee for consumer -> process ->
produce
through transactions and with the use of state store like redis, this
guarantee is weaker
On Sat, Mar 13, 2021 at 5:28 AM Guozhang Wang <wangg...@gmail.com>
wrote:
> Hello Mohan,
>
> I think what you had in mind works with Redis, since it is a remote
state
> store engine, it does not have the co-partitioning requirements as
local
> state stores.
>
> One thing you'd need to tune KS though is that with remote stores,
the
> processing latency may be larger, and since Kafka Streams process all
> records of a single partition in order, synchronously, you may need
to tune
> the poll interval configs etc to make sure KS would stay in the
consumer
> group and not trigger unnecessary rebalances.
>
> Guozhang
>
> On Thu, Mar 11, 2021 at 6:41 PM Parthasarathy, Mohan <
mpart...@hpe.com>
> wrote:
>
> > Hi,
> >
> > I have a use case where messages come in with some key gets
assigned some
> > partition and the state gets created. Later, key changes (but still
> > contains the old key in the message) and gets sent to a different
> > partition. I want to be able to grab the old state using the old
key
> before
> > creating the new state on this instance. Redis as a state store
makes it
> > easy to implement this where I can simply do a lookup before
creating the
> > state. I see an implementation here :
> >
>
https://github.com/andreas-schroeder/redisks/tree/master/src/main/java/com/github/andreas_schroeder/redisks
> >
> > Has anyone tried this ? Any caveats.
> >
> > Thanks
> > Mohan
> >
> >
>
> --
> -- Guozhang
>