One of the most appealing features of the streams-based architecture is the ability to replay history. This concept was highlighted in a blog post [0] just the other day.
Practically, though, I am stuck on the mechanics of replaying data when that data is also periodically expiring. If your logs expire after some time, how can you replay state? This may not be a problem for certain kinds of analysis, especially windowed analysis.

However, let's say a topic with time-based retention consists of logical application events like "user-create" and "user-update". Once the "user-create" event has been deleted, the subsequent "user-update" events for that user can no longer be replayed meaningfully, because the event they build on is gone.

The streams application transforms "user-create" and "user-update" events into a compacted entity topic "user". This topic can be replayed, but that is different from replaying the actual events that produced the compacted entity.

So how do I make sense of retention and replay?

Thank you,
Dmitry

[0] https://www.confluent.io/blog/messaging-single-source-truth/
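
P.S. To make the problem concrete, here is a minimal sketch in plain Python. It is not Kafka or Kafka Streams code; it only simulates a time-retained event log in memory. The names `fold` and `expire`, the event tuple layout, and the sample data are all hypothetical and chosen just for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (timestamp, event_type, user_id, payload)
Event = Tuple[int, str, str, dict]


def fold(events: List[Event]) -> Tuple[Dict[str, dict], List[Event]]:
    """Replay events in order into per-user state.

    Returns the resulting state plus any updates that could not be
    applied because no "user-create" had been seen for that user.
    """
    users: Dict[str, dict] = {}
    orphans: List[Event] = []
    for event in events:
        _, event_type, user_id, payload = event
        if event_type == "user-create":
            users[user_id] = dict(payload)
        elif event_type == "user-update":
            if user_id in users:
                users[user_id].update(payload)
            else:
                orphans.append(event)
    return users, orphans


def expire(events: List[Event], now: int, retention: int) -> List[Event]:
    """Simulate time-based retention: drop events older than `retention`."""
    return [e for e in events if now - e[0] <= retention]


log: List[Event] = [
    (1, "user-create", "u1", {"name": "Ann"}),
    (5, "user-update", "u1", {"email": "ann@example.com"}),
    (9, "user-update", "u1", {"name": "Anna"}),
]

# Replaying the full log reconstructs the user.
full_state, full_orphans = fold(log)

# After retention removes the create event, the same replay yields
# no user at all, only updates with nothing to apply to.
retained = expire(log, now=10, retention=6)
replayed_state, replayed_orphans = fold(retained)

print(full_state)
print(replayed_state, len(replayed_orphans))
```

In this sketch the replay over the retained events produces no "u1" entity and leaves both updates orphaned, which is exactly the situation I am asking about.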