Hi Oscar,

I think you'll find your answers in [1]; have a look at Yun's response a couple of emails down. Basically, SourceFunction belongs to the legacy source stack. Ideally you'd implement your source on the FLIP-27 stack [2] instead, where the source itself declares its boundedness, but he also mentions a workaround for SourceFunction.
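To make the boundedness point concrete, here is a minimal sketch, not your Parquet source. It assumes Flink 1.12 or later with flink-streaming-java and flink-clients on the classpath. The class name BoundedSourceSketch and the method runBounded are made up for illustration, and Flink's built-in NumberSequenceSource only stands in for a real FLIP-27 source reading from GCS.

```java
import java.util.List;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.connector.source.Boundedness;
import org.apache.flink.api.connector.source.lib.NumberSequenceSource;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical class name; illustrates a bounded FLIP-27 source in BATCH mode.
public class BoundedSourceSketch {

    /** Runs a bounded source in BATCH mode and returns the emitted records. */
    static List<Long> runBounded(long from, long to) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // BATCH mode is rejected at job translation time if any source
        // in the job reports itself as unbounded.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        // Stand-in for a custom source: a built-in FLIP-27 source whose
        // getBoundedness() returns BOUNDED.
        NumberSequenceSource source = new NumberSequenceSource(from, to);
        if (source.getBoundedness() != Boundedness.BOUNDED) {
            throw new IllegalStateException("expected a bounded source");
        }

        // fromSource(...) is the entry point for FLIP-27 sources;
        // addSource(...) is the one used for the legacy SourceFunction.
        DataStream<Long> stream =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "bounded-sequence");

        // Collect at most (to - from + 1) records, i.e. the whole sequence.
        // Record order is not guaranteed when parallelism is greater than 1.
        return stream.executeAndCollect((int) (to - from + 1));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBounded(1, 5));
    }
}
```

A custom source would implement org.apache.flink.api.connector.source.Source and return Boundedness.BOUNDED from getBoundedness() in the same way; the split enumerator and reader it also needs are left out here.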
Regards
Ingo

[1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Using-Kafka-as-bounded-source-with-DataStream-API-in-batch-mode-Flink-1-12-td40637.html
[2] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/sources/#the-data-source-api

On Thu, Jun 3, 2021 at 7:29 AM 陳樺威 <oscar8492...@gmail.com> wrote:

> Hi,
>
> Currently, we want to use batch execution mode [0] to consume historical
> data and rebuild states for our streaming application.
> The Flink app will be run on demand and shut down after all file
> processing completes.
> We implemented a SourceFunction [1] to consume bounded Parquet files from
> GCS. However, the function will be detected as Batch Mode.
>
> Our question is: how do we implement a SourceFunction as a bounded
> DataStream?
>
> Thanks!
> Oscar
>
> [0]
> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/execution_mode/
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.html