Hi everyone,

I've been reading a lot about new features in Kafka Streams and everything
looks very promising. There is even an article on Kafka and Event Sourcing:
https://www.confluent.io/blog/event-sourcing-cqrs-stream-processing-apache-kafka-whats-connection/

There are a couple of things I'm concerned about, though. Event Sourcing
assumes there is a way to fetch all events for a particular object and
replay them in order to get the latest snapshot of that object.

It seems (and the article says as much) that a StateStore in Kafka Streams
can be used to achieve that.
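
For concreteness, here is roughly the topology I have in mind. This is a
minimal sketch on my part: the "events" topic name, the String-typed
snapshots, and the applyEvent() placeholder are my own illustrations, not
something the article prescribes.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class SnapshotTopology {

    public static StreamsBuilder build() {
        StreamsBuilder builder = new StreamsBuilder();

        // "events" is keyed by object id; String values keep the sketch
        // self-contained where a real app would use domain serdes.
        KTable<String, String> snapshots = builder
            .stream("events", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey()
            .aggregate(
                () -> "",                                    // empty snapshot
                (objectId, event, snapshot) -> applyEvent(snapshot, event),
                Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("snapshot-store")
                    .withKeySerde(Serdes.String())
                    .withValueSerde(Serdes.String()));

        return builder;
    }

    // Placeholder for the real domain logic that replays one event
    // onto the current snapshot.
    private static String applyEvent(String snapshot, String event) {
        return snapshot + "|" + event;
    }
}

As I understand it, "snapshot-store" would be queryable via interactive
queries, and its compacted changelog topic is what gets replayed on
failure.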

My first question is: would this scale well to millions of objects?
I understand that a StateStore is backed by a compacted Kafka topic, so in
the event of a failure Kafka Streams recovers the latest state by reading
all messages from that topic. My suspicion is that for millions of objects
this may take a while (it would need to read the whole changelog partition
to rebuild the store). Is this a correct assumption?
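
If it is, I assume standby replicas are the intended mitigation, i.e.
something like the following (my guess, not something the article covers);
a standby keeps a warm copy of the store on another instance so failover
does not have to replay the whole changelog from scratch:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "event-sourcing-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Keep one warm replica of each local state store.
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);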

My second question is: would it make more sense to use an external DB in
such a case, or is there a "best practice" for implementing Event Sourcing
with Kafka's internal StateStore as the event store?

Thanks,
Anatoly
