Hi Oscar,

You could use connected streams and put your file into a special Kafka
topic before starting such a job:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/operators/overview/#connect
But this may require more work, and the event ordering in the connected
streams (which is interleaved non-deterministically) is probably not what
you are looking for.

I think HybridSource is the right solution.
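HybridSource requires all constituent sources to produce the same record type, so rather than hacking the KafkaSource, the usual approach is to give each source a deserializer that maps into one shared domain type. A minimal sketch of that mapping idea (the type `BootstrapEvent` and the field layout are made-up assumptions; the Flink wiring is shown only in comments):

```java
// Hypothetical shared domain type that both sources map into.
final class BootstrapEvent {
    final String id;
    final String payload;
    final long timestamp;

    BootstrapEvent(String id, String payload, long timestamp) {
        this.id = id;
        this.payload = payload;
        this.timestamp = timestamp;
    }
}

final class Mappers {
    // CSV side: parse the 3-field line into the shared type.
    static BootstrapEvent fromCsv(String line) {
        String[] f = line.split(",", 3);
        return new BootstrapEvent(f[0], f[1], Long.parseLong(f[2]));
    }

    // Kafka side: in the real job this conversion would live in the
    // KafkaSource's deserializer and take the protobuf message; a plain
    // parameter list stands in for it here.
    static BootstrapEvent fromProto(String id, String payload, long ts) {
        return new BootstrapEvent(id, payload, ts);
    }

    // In the Flink job the two sources would then be chained (sketch):
    //   HybridSource<BootstrapEvent> source =
    //       HybridSource.builder(csvFileSource)   // FileSource<BootstrapEvent>
    //           .addSource(kafkaSource)           // KafkaSource<BootstrapEvent>
    //           .build();

    public static void main(String[] args) {
        BootstrapEvent a = fromCsv("42,hello,1688400000");
        BootstrapEvent b = fromProto("42", "hello", 1688400000L);
        System.out.println(a.id.equals(b.id)
                && a.payload.equals(b.payload)
                && a.timestamp == b.timestamp);
    }
}
```

Both deserializers emit `BootstrapEvent`, so the downstream operators never see the protobuf or CSV representations at all.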

Best regards,
Alexey

On Mon, Jul 3, 2023 at 3:44 PM Oscar Perez via user <user@flink.apache.org>
wrote:

> Hi, we want to bootstrap some data from a CSV file before reading from a
> Kafka topic that has a retention period of 7 days.
>
> We believe the best tool for that would be HybridSource, but the problem
> we are facing is that the two data sources are of a different nature: the
> KafkaSource returns a protobuf event, while the CSV source returns a POJO
> with just 3 fields.
>
> We could hack the KafkaSource implementation and do the mapping from
> protobuf to the CSV POJO in the value deserializer, but that seems rather
> hackish. Is there a more elegant way to unify the datatypes from both
> sources using HybridSource?
>
> thanks
> Oscar
>
