Hi all, I’m developing an event-based file source to continuously monitor an S3 bucket. The problem with the existing file source is that continuously listing the bucket is expensive, and the state grows with the number of files.
I was thinking of using SQS and listening for *ObjectCreated* events instead of polling the bucket. I’m currently considering two design alternatives:

1. *Periodic enumerator*: The enumerator is triggered periodically and drains the SQS queue. Each S3 object becomes a split, similar to how Flink’s current file source works.
2. *Single-reader enumerator*: The enumerator simply assigns the SQS queue to a single reader, which continuously consumes it. In this model, there is a single split (the SQS queue itself), similar to how FLIP-27 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface> treats Kafka partitions as splits assigned to readers.

Has anyone worked on a similar approach or explored event-driven file sources before?
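To make option 1 a bit more concrete, here is a rough sketch of a single drain pass, in plain Python rather than Flink code. It assumes each SQS message body is a standard S3 event notification (a `Records` array with `eventName` and `s3.bucket` / `s3.object` fields, object keys URL-encoded). The names `ObjectSplit`, `splits_from_message`, `drain`, `receive_batch` and `delete_message` are placeholders I made up for illustration, not part of any Flink or AWS API.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Tuple
from urllib.parse import unquote_plus


@dataclass(frozen=True)
class ObjectSplit:
    """One S3 object to read; hypothetical stand-in for a file split."""
    bucket: str
    key: str
    size: int


def splits_from_message(body: str) -> List[ObjectSplit]:
    """Turn one SQS message body (an S3 event notification) into splits."""
    event = json.loads(body)
    splits = []
    # Messages without a "Records" array simply yield no splits.
    for record in event.get("Records", []):
        # Only ObjectCreated:* events (Put, Post, Copy, ...) become splits.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        splits.append(ObjectSplit(
            bucket=s3["bucket"]["name"],
            # Keys arrive URL-encoded in the notification, so decode them.
            key=unquote_plus(s3["object"]["key"]),
            size=s3["object"].get("size", 0),
        ))
    return splits


def drain(
    receive_batch: Callable[[], List[Tuple[str, str]]],
    delete_message: Callable[[str], None],
) -> List[ObjectSplit]:
    """Drain the queue once and return the discovered splits.

    receive_batch() is assumed to return (receipt_handle, body) pairs and an
    empty list once the queue has nothing more to give; delete_message(handle)
    acknowledges one message. Both are hypothetical callables wrapping SQS.
    """
    splits = []
    while True:
        batch = receive_batch()
        if not batch:
            return splits
        for handle, body in batch:
            splits.extend(splits_from_message(body))
            # Simplification: acknowledge immediately after parsing.
            delete_message(handle)
```

One thing the sketch glosses over: deleting each message right after parsing is the simplest choice, but in a real source I would probably want to defer the delete until the checkpoint containing those splits has completed. A standard SQS queue can also deliver the same message more than once, so some deduplication of splits may still be needed.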
