Hi Kafka Users, I'm thinking through how to convert my application to use Kafka. I use an event-sourcing model, and something I do frequently is reprocess old events when I change a model schema or update my processing code.
In my current setup, I have few enough events that I can load all the event types that feed into a model, sort them, and reprocess them. There are now enough events, though, that loading and sorting them in memory is getting slow and sometimes causes OOM crashes. So one very attractive thing about Kafka is that events within a partition are already ordered: in theory, I just set a consumer's offset back to 0 and things will just work™ (rough sketch below).

But I've read that each event type should have its own topic, which raises the question: how do I reprocess a model that pulls from multiple topics while maintaining the order of events across those topics? Say the User model has two event types, userCreated and userUpdated, each with a timestamp and an entity_id pointing to the user. If I'm reprocessing these, is there a normal pattern for pulling events in order from multiple topics?

One solution I've thought of is for producers to publish events to both event-specific topics and model topics, e.g. userCreated would get published to the "userCreated" topic as well as the "user" topic. Another is that the stream processor for User, when reprocessing, would look at the next event from each topic it's pulling from and always pull the oldest one next. Slightly tricky code, but doable. Sketches of both are below.
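For context, here's roughly what I mean by setting a consumer's offset back to 0, using the plain Java consumer. The broker address, topic name, partition count, and serializers are placeholders, not my real setup:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayFromZero {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manual assignment (no consumer group) so seeking is straightforward.
            TopicPartition tp = new TopicPartition("userCreated", 0); // assuming one partition
            consumer.assign(List.of(tp));
            consumer.seekToBeginning(List.of(tp)); // rewind to offset 0
            consumer.poll(Duration.ofMillis(500))
                    .forEach(r -> System.out.printf("%d %s%n", r.offset(), r.value()));
        }
    }
}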
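And here's the double-publish idea: the producer writes the same payload to the event-specific topic and to the model topic, keyed by entity_id so that one user's events land in the same partition and keep their relative order. The topic names and the JSON payload are made up for illustration:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DoublePublish {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String entityId = "user-42"; // key by entity_id so one user's events share a partition
            String payload = "{\"type\":\"userCreated\",\"entity_id\":\"user-42\",\"timestamp\":1690000000}";
            producer.send(new ProducerRecord<>("userCreated", entityId, payload)); // event-specific topic
            producer.send(new ProducerRecord<>("user", entityId, payload));        // model topic
        }
    }
}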
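The merge idea would look something like this: buffer the head of each topic and always hand the model the oldest buffered event next. This sketch assumes each topic has a single partition (so timestamps are non-decreasing within a topic), and apply() stands in for my real processing code:

import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class UserReprocessor {

    // Stand-in for my real model-update code.
    static void apply(ConsumerRecord<String, String> event) {
        System.out.printf("%d %s %s%n", event.timestamp(), event.topic(), event.value());
    }

    // Top up a buffer with one poll's worth of records if it's empty.
    static void refill(KafkaConsumer<String, String> consumer,
                       Deque<ConsumerRecord<String, String>> buffer) {
        if (buffer.isEmpty()) {
            consumer.poll(Duration.ofMillis(500)).forEach(buffer::add);
        }
    }

    static KafkaConsumer<String, String> consumerFor(String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        TopicPartition tp = new TopicPartition(topic, 0); // assumes a single partition
        consumer.assign(List.of(tp));
        consumer.seekToBeginning(List.of(tp)); // start the replay from offset 0
        return consumer;
    }

    public static void main(String[] args) {
        try (KafkaConsumer<String, String> createdConsumer = consumerFor("userCreated");
             KafkaConsumer<String, String> updatedConsumer = consumerFor("userUpdated")) {

            Deque<ConsumerRecord<String, String>> created = new ArrayDeque<>();
            Deque<ConsumerRecord<String, String>> updated = new ArrayDeque<>();

            while (true) {
                refill(createdConsumer, created);
                refill(updatedConsumer, updated);

                // Naive end-of-stream check: one empty poll from both topics ends the run.
                if (created.isEmpty() && updated.isEmpty()) break;

                if (created.isEmpty()) {
                    apply(updated.poll());
                } else if (updated.isEmpty()) {
                    apply(created.poll());
                } else {
                    // Always hand the model the oldest head event.
                    apply(created.peek().timestamp() <= updated.peek().timestamp()
                            ? created.poll() : updated.poll());
                }
            }
        }
    }
}

The end-of-stream check is naive, and with multiple partitions per topic I'd have to merge per partition instead, but it shows the shape of what I'm imagining. Thoughts?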