Hello, I read in the docs that Kafka Streams stores the computed aggregations in a local embedded key-value store (RocksDB by default); i.e., Kafka Streams provides so-called state stores. I'm wondering about the relationship between each state store and its replicated changelog Kafka topic.
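To make my mental model of that relationship concrete, here is a minimal stdlib-only Java sketch (the class and method names are my own invention, not the Kafka Streams API): every write to the local store is also appended to a changelog, so the changelog is a complete history of store updates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (mine, not the real Kafka Streams API): every put() updates the
// local key-value store (RocksDB in Kafka Streams) and also appends the same
// key/value pair to a changelog (a Kafka topic in reality).
public class ChangelogModel {
    static class StateStore {
        final Map<String, Long> local = new HashMap<>();                   // stands in for RocksDB
        final List<Map.Entry<String, Long>> changelog = new ArrayList<>(); // stands in for the changelog topic

        void put(String key, long value) {
            local.put(key, value);
            changelog.add(Map.entry(key, value));
        }

        // Word-count style update: read current count from the local store,
        // increment it, and write the new value to both store and changelog.
        long count(String word) {
            long next = local.getOrDefault(word, 0L) + 1;
            put(word, next);
            return next;
        }
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        for (String word : new String[]{"hello", "world", "hello"}) {
            store.count(word);
        }
        System.out.println(store.local);     // final counts per word
        System.out.println(store.changelog); // full update history
    }
}
```

Is this roughly the right picture, i.e., the changelog records every individual update while the store keeps only the latest value per key?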
If we use the WordCountDemo example, I would like to trace the sequence of events so I can understand both concepts. Suppose I send the words "hello", "world", "hello" to the input topic. The changelog topic then contains:

"hello" => 1
"world" => 1
"hello" => 2

So far so good.

Q1) What is the state of RocksDB at this moment?

Q2) If I delete all data in the changelog Kafka topic and then send another "hello", I see in the changelog topic:

"hello" => 3

What is happening here? Is countByKey computing the new count from the data still stored in RocksDB before writing to the changelog again?

Q3) Could someone explain the following process to me in depth, step by step? And what would happen if the changelog Kafka topic had been deleted, as in the question above?

"If tasks run on a machine that fails and are restarted on another machine, Kafka Streams guarantees to restore their associated state stores to the content before the failure by replaying the corresponding changelog topics prior to resuming the processing on the newly started tasks."

Q4) How does the fault tolerance policy differ for operations without an associated changelog Kafka topic, such as joins between two KTables, when a machine failure occurs?

Thanks!
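To show what I currently believe happens in Q2 and Q3, here is a stdlib-only Java simulation of my assumptions (again my own toy model, not Kafka Streams code): deleting the changelog does not touch the local store, so the next "hello" continues from the surviving local value 2; and after a "machine failure" the new instance rebuilds the store by replaying whatever is left in the changelog. Please correct this model where it is wrong.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (mine, not the Kafka Streams API) of the two scenarios I ask about:
// Q2: deleting the changelog does not touch the local RocksDB store, so the
//     next "hello" is counted on top of the surviving local value 2 -> 3.
// Q3: a machine failure wipes the local store, and the new instance rebuilds
//     it by replaying the changelog from the beginning before resuming.
public class RestoreModel {
    public static void main(String[] args) {
        Map<String, Long> rocksDb = new HashMap<>();                  // local state store
        List<Map.Entry<String, Long>> changelog = new ArrayList<>();  // changelog topic

        // Process "hello", "world", "hello": update the store, append to the changelog.
        for (String word : new String[]{"hello", "world", "hello"}) {
            long next = rocksDb.getOrDefault(word, 0L) + 1;
            rocksDb.put(word, next);
            changelog.add(Map.entry(word, next));
        }

        // Q2: delete all data in the changelog topic; RocksDB is untouched.
        changelog.clear();
        long next = rocksDb.getOrDefault("hello", 0L) + 1;            // reads 2 from the local store
        rocksDb.put("hello", next);
        changelog.add(Map.entry("hello", next));
        System.out.println("after Q2: hello => " + next);             // hello => 3

        // Q3: the machine fails and the local store is lost; the restarted task
        // restores its state by replaying the (now truncated) changelog.
        Map<String, Long> restored = new HashMap<>();
        for (Map.Entry<String, Long> record : changelog) {
            restored.put(record.getKey(), record.getValue());
        }
        // Only {hello=3} comes back: the earlier history was deleted with the changelog.
        System.out.println("restored store: " + restored);
    }
}
```

If this model is right, then deleting the changelog silently breaks the fault-tolerance guarantee ("world" is lost on restore) even though processing appears to continue normally, which is the part I want confirmed.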