Hi all,

We're interested in using a FileSource <https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/connector/file/src/FileSource.html> to read a Google Cloud Storage (GCS) archive of messages from a Kafka topic, roughly in order.
Our GCS archive is partitioned into folders by time. However, when we read it using a FileSource, the messages are processed in a random order. We'd like to control the order in which the files are read and take advantage of the clear ordering our GCS archive provides.

What is the best way to achieve this? Would it be possible to write a custom FileEnumerator <https://nightlies.apache.org/flink/flink-docs-release-1.14/api/java/org/apache/flink/connector/file/src/enumerate/FileEnumerator.html> that sorts the directories and returns the splits in order?

Any help would be greatly appreciated!

Thanks,
Kevin
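
P.S. To make the question concrete, here is a minimal sketch of the sorting step we have in mind. It is plain Java and deliberately leaves out the Flink wiring. PartitionOrder and sortChronologically are hypothetical names, and the sketch assumes our time folders are zero-padded (for example .../2022/01/05/09/), so that lexicographic order of the full path matches chronological order. In a custom FileEnumerator, something like this would presumably be applied to the paths of the enumerated splits before returning them; whether the splits are then actually read in that order is exactly what we are unsure about.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Flink: orders archive file paths by their
// time-partition folders.
final class PartitionOrder {

    private PartitionOrder() {}

    // Assumption: folder names are zero-padded (e.g. .../2022/01/05/09/part-0),
    // so sorting the full path as a string also sorts it by time.
    // Returns a new sorted list and leaves the input list untouched.
    static List<String> sortChronologically(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }
}
```

If the folder names were not zero-padded, the comparator would need to parse the time components instead of comparing raw strings.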