I'm building a prototype with Kafka Streams that will consume from the same topic twice: once with no delay, just like any normal consumer, and once with a 60-minute delay, using the new timestamp-per-message field. It will also store state coming from other topics that are being read simultaneously.
The reason I'm consuming twice from the same topic, with one of them delayed, is that our processor needs to know, for any particular record, whether any related records are coming within the next 60 minutes, and change some of the fields accordingly before sending the record down as the final version of that event.

Everything seems to be supported by the normal Streams DSL, except for a way to specify a consumption delay on a particular source topic. I know I could just create a delayed copy of the topic myself and consume from that, but the topic volume in this case prevents us from doing so. I don't want to duplicate the data and load in the cluster. This topic currently handles ~2B records per day, and will grow to potentially 10B records per day.

Any ideas on how I could handle this using Kafka Streams? Or if it's better to use the lower-level Processor API for this, can you point me to a starting point? I've been reading the docs and javadocs for a while now, and I'm not sure where I would add/configure this.

Thanks!

Marcos Juarez
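To make the hold-back behavior concrete, here is a minimal sketch of the buffering logic I have in mind, written standalone rather than against the Streams API: a `TreeMap` stands in for a timestamp-ordered key-value state store, `add` plays the role of `Processor#process`, and `flush` plays the role of a stream-time punctuation. All names here are hypothetical, and this is only an illustration of the idea, not a working topology.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: hold each record for 60 minutes of stream time before
// emitting it, so late-arriving related records can still amend it. In a real
// topology this logic would live inside a Processor backed by a state store,
// with flush() driven by a stream-time punctuation.
public class DelayBuffer {
    private static final long DELAY_MS = 60 * 60 * 1000L;

    // Buffered records, ordered by record timestamp so the oldest flush first.
    private final TreeMap<Long, List<String>> buffer = new TreeMap<>();

    // Called once per incoming record (the Processor#process analogue).
    // A second call for a related key within the window is where the
    // buffered record's fields would be updated.
    public void add(long timestampMs, String record) {
        buffer.computeIfAbsent(timestampMs, t -> new ArrayList<>()).add(record);
    }

    // Called periodically with the current stream time (the punctuation
    // analogue): emits everything that is at least 60 minutes old.
    public List<String> flush(long streamTimeMs) {
        List<String> out = new ArrayList<>();
        Map<Long, List<String>> expired = buffer.headMap(streamTimeMs - DELAY_MS, true);
        expired.values().forEach(out::addAll);
        expired.clear();  // headMap is a view, so this also removes them from the buffer
        return out;
    }
}
```

The appeal of something like this is that it avoids the duplicated topic entirely: the "delayed" stream is just the normal stream held in local state until stream time catches up.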