I am planning to have a producer writing payment messages to a Kafka topic. One attribute of each message would be a process date, which could be in the future, i.e. the payment must not be sent for collection until that date.
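
To make the rule concrete, here is a minimal plain-Python sketch (not Kafka code) of the check I have in mind. The field names `payment_id`, `amount_minor` and `process_date`, and the helper `is_ready`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Payment:
    # All field names here are assumptions made for this illustration.
    payment_id: str
    amount_minor: int      # amount in minor units, e.g. cents
    process_date: date     # earliest date the payment may be sent for collection


def is_ready(payment: Payment, today: date) -> bool:
    """A payment is ready once its process date is today or in the past."""
    return payment.process_date <= today


def ready_payments(payments, today: date):
    """Return the payments that may be sent for collection on `today`."""
    return [p for p in payments if is_ready(p, today)]
```

The open question is how to get Kafka to re-evaluate this condition when the date changes, rather than only when a new message arrives.
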
How can I configure Kafka so that a stream contains only the messages whose process date is today or in the past, i.e. those that are ready to be sent for collection? Here are some ideas, but I am new to Kafka and not sure what is possible or practical:
(i) I can imagine my topic being 'converted' to a KTable that stores the latest version of each payment and effectively acts as a place to persist payments that are not yet ready for collection. Each day I could start a stream reading from the KTable where process_date <= 'today', where 'today' is a constant string derived from the system date. This stream would therefore contain any messages that were written in the past and are only now being picked up, plus any new messages written today with today's date. I would then shut down the stream at the end of the day and start a new one the next day with a new value for 'today'.
(ii) I can imagine a similar stream reading from the same KTable where process_date <= 'today', but where 'today' is a KSQL system variable. In this scenario I would not expect to have to start and stop the stream each day; I would be relying on the stream being updated automatically at midnight with the new current date, so that it starts to process messages that match the new criteria. I am struggling to see how this could work, as I imagine the stream would be driven by an updated row in the KTable rather than by an updated value of 'today'.
Obviously there may be other approaches; I am all ears.
Thank you in advance for any advice given.
Dave O'Connor