Hi,

Have you tried constructing a Hybrid source from a File source created
with FileSource.forBulkFileFormat [1] and "gs://bucket" scheme [2]
directly?

[1]
https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/file/src/FileSource.html#forBulkFileFormat-org.apache.flink.connector.file.src.reader.BulkFormat-org.apache.flink.core.fs.Path...-
[2]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/gcs/
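
For illustration, a rough sketch of what that wiring could look like (untested; the bucket path, topic, and broker address are placeholders, and it assumes the flink-parquet, flink-connector-kafka, and GCS filesystem plugin dependencies are set up per [2]):

```java
// Sketch only: placeholders for bucket, topic, broker, and Parquet format.
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.table.data.RowData;

// A BulkFormat for your Parquet schema, e.g. built via
// ParquetColumnarRowInputFormat from flink-parquet (not shown here).
FileSource<RowData> fileSource =
    FileSource.forBulkFileFormat(
            parquetFormat,                        // your Parquet BulkFormat
            new Path("gs://my-bucket/events/"))   // placeholder GCS path
        .build();

KafkaSource<RowData> kafkaSource =
    KafkaSource.<RowData>builder()
        .setBootstrapServers("broker:9092")       // placeholder
        .setTopics("events")                      // placeholder
        // ... deserializer, starting offsets, etc.
        .build();

// Reads the GCS Parquet files first, then switches over to Kafka.
HybridSource<RowData> hybridSource =
    HybridSource.builder(fileSource)
        .addSource(kafkaSource)
        .build();
```

That way the files are read directly from the bucket by the FileSource readers, with no separate download step.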

Regards,
Roman

On Thu, Dec 9, 2021 at 1:04 PM Meghajit Mazumdar
<meghajit.mazum...@gojek.com> wrote:
>
> Hello,
>
> We have a requirement as follows:
>
> We want to stream events from two sources: Parquet files stored in a GCS
> bucket, and a Kafka topic.
> With the release of Hybrid Source in Flink 1.14, we were able to construct a
> Hybrid Source which produces events from two sources: a FileSource which
> reads data from a locally saved Parquet file, and a KafkaSource consuming
> events from a remote Kafka broker.
>
> I was wondering whether, instead of using a local Parquet file, it is
> possible to stream the file directly from a GCS bucket and construct a
> FileSource out of it at runtime? The Parquet files are quite big and it's
> a bit expensive to download them.
>
> Does Flink have such functionality? Or has anyone come across such a use
> case previously? Would greatly appreciate some help on this.
>
> Looking forward to hearing from you.
>
> Thanks,
> Megh
