I have two datasets that I need to join together in a Flink batch job. One of the datasets needs to be created dynamically by completely 'draining' a Kafka topic over an offset range (start and end) and writing all messages in that range to a file. I know that in Flink streaming I can specify the start offset, but not the end offset. In my case, this preparation of the file from the Kafka topic is really working on a finite, bounded set of data, even though it comes from Kafka.
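For concreteness, here is roughly what I have so far (bootstrap servers, topic name, offsets, and output path are all placeholders). Setting per-partition start offsets works, but I can't find any equivalent way to set an end offset, so the source never becomes bounded:

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;

public class DrainTopicSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "drain-job");               // placeholder

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);

        // Specifying START offsets per partition is supported...
        Map<KafkaTopicPartition, Long> startOffsets = new HashMap<>();
        startOffsets.put(new KafkaTopicPartition("my-topic", 0), 1000L); // placeholder offset
        consumer.setStartFromSpecificOffsets(startOffsets);

        // ...but I see no equivalent for an END offset, so this source
        // runs forever instead of finishing at the end of the range.
        env.addSource(consumer)
           .writeAsText("/tmp/drained-messages"); // placeholder output path

        env.execute("drain-kafka-range");
    }
}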
Is there a way I can do this in Flink (either streaming or batch)? Thanks, Hayden