Hi Oscar,

You could use connected streams and load the file into a dedicated Kafka topic before starting such a job: https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/operators/overview/#connect

However, this requires more work, and the event ordering in connected streams (the two inputs are interleaved nondeterministically) is probably not what you are looking for.
I think HybridSource is the right solution.

Best regards,
Alexey

On Mon, Jul 3, 2023 at 3:44 PM Oscar Perez via user <user@flink.apache.org> wrote:
> Hei,
>
> We want to bootstrap some data from a CSV file before reading from a
> Kafka topic that has a retention period of 7 days.
>
> We believe the best tool for that would be the HybridSource, but the
> problem we are facing is that both data sources are of a different nature.
> The KafkaSource returns a protobuf event, while the CSV is a POJO with
> just 3 fields.
>
> We could hack the KafkaSource implementation and then, in the value
> deserializer, do the mapping from protobuf to the CSV POJO, but that
> seems rather hackish. Is there a more elegant way to unify both datatypes
> from both sources using HybridSource?
>
> thanks
> Oscar
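For what it's worth, here is a minimal sketch of the "unify both types" idea (all class and field names below are hypothetical, not taken from your job). HybridSource requires every chained source to emit the same record type, so instead of hacking the KafkaSource, the usual approach is to define one common POJO and have each source's deserializer map into it:

```java
// Sketch only: UnifiedEvent, ProtoEvent, and the field names are made up
// for illustration. The point is that both the CSV parser and the Kafka
// protobuf deserializer emit the SAME type, which is what HybridSource needs.

// Common record type emitted by both sources.
class UnifiedEvent {
    final String userId;
    final String action;
    final long timestampMillis;

    UnifiedEvent(String userId, String action, long timestampMillis) {
        this.userId = userId;
        this.action = action;
        this.timestampMillis = timestampMillis;
    }
}

// Stand-in for the generated protobuf class read from the Kafka topic.
class ProtoEvent {
    final String userId;
    final String action;
    final long ts;

    ProtoEvent(String userId, String action, long ts) {
        this.userId = userId;
        this.action = action;
        this.ts = ts;
    }
}

class EventMappers {
    // Would be called from the CSV file source's record parser
    // (e.g. a StreamFormat / CsvReaderFormat that yields UnifiedEvent).
    static UnifiedEvent fromCsvLine(String line) {
        String[] parts = line.split(",");
        return new UnifiedEvent(parts[0], parts[1], Long.parseLong(parts[2]));
    }

    // Would be called inside the Kafka DeserializationSchema after
    // parsing the protobuf payload.
    static UnifiedEvent fromProto(ProtoEvent e) {
        return new UnifiedEvent(e.userId, e.action, e.ts);
    }
}
```

With both sources producing UnifiedEvent, they compose directly as `HybridSource.builder(csvFileSource).addSource(kafkaSource).build()`, and no per-record type hacking is needed downstream.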