@Oscar
1. How do you plan to use that CSV data? Is it needed for lookups from the "main" stream?
2. Which API are you using: DataStream, SQL/Table, or the low-level ProcessFunction?
Best,
Alex

On Tue, 4 Jul 2023 at 11:14, Oscar Perez via user <user@flink.apache.org> wrote:

> ok, but is it? As I said, both sources have different data types. In the
> example here:
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/hybridsource/
>
> we are using both sources as returning String, but in our case one source
> would return a protobuf event while the other would return a POJO. How can
> we make the 2 sources share the same data type so that we can successfully
> use HybridSource?
>
> Regards,
> Oscar
>
> On Tue, 4 Jul 2023 at 12:04, Alexey Novakov <ale...@ververica.com> wrote:
>
>> Hi Oscar,
>>
>> You could use connected streams and put your file into a special Kafka
>> topic before starting such a job:
>> https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/datastream/operators/overview/#connect
>> But this may require more work, and the event ordering (which is shuffled)
>> in the connected streams is probably not what you are looking for.
>>
>> I think HybridSource is the right solution.
>>
>> Best regards,
>> Alexey
>>
>> On Mon, Jul 3, 2023 at 3:44 PM Oscar Perez via user <user@flink.apache.org> wrote:
>>
>>> Hi, we want to bootstrap some data from a CSV file before reading from
>>> a Kafka topic that has a retention period of 7 days.
>>>
>>> We believe the best tool for that would be the HybridSource, but the
>>> problem we are facing is that the two data sources are of different
>>> natures. The KafkaSource returns a protobuf event, while the CSV is a
>>> POJO with just 3 fields.
>>>
>>> We could hack the KafkaSource implementation and do the mapping from
>>> protobuf to the CSV POJO in the value deserializer, but that seems
>>> rather hackish. Is there a more elegant way to unify the data types of
>>> both sources using HybridSource?
>>>
>>> Thanks,
>>> Oscar
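[Editor's note] HybridSource requires every inner source to produce the same record type T, so the usual pattern for the situation discussed above is to map each source's native record into one shared type inside that source's own deserialization/reader schema. Below is a minimal sketch of that adapter idea in plain Java; the type and field names (`UnifiedEvent`, `CsvRow`, `ProtoEvent`, `userId`, etc.) are hypothetical stand-ins, not from the thread or the Flink API — in a real job `ProtoEvent` would be the generated protobuf class and the two mapping functions would live inside the file source's reader format and the KafkaSource's deserialization schema, respectively.

```java
// Sketch: unify two heterogeneous sources behind one record type for HybridSource.
// All type and field names here are hypothetical illustrations.
public class UnifiedEventDemo {

    // The common type T that HybridSource<T> requires from every inner source.
    record UnifiedEvent(String userId, long eventTimeMillis, double amount) {}

    // Stand-in for the CSV-backed POJO with just 3 fields.
    record CsvRow(String userId, long eventTimeMillis, double amount) {}

    // Stand-in for the (richer) protobuf event coming from Kafka.
    record ProtoEvent(String userId, long eventTimeMillis, double amount, String extraPayload) {}

    // Adapter applied in the file source: CSV POJO -> unified type.
    static UnifiedEvent fromCsv(CsvRow row) {
        return new UnifiedEvent(row.userId(), row.eventTimeMillis(), row.amount());
    }

    // Adapter applied in the Kafka source's deserialization schema:
    // protobuf event -> unified type, dropping fields the CSV bootstrap
    // data could not provide anyway.
    static UnifiedEvent fromProto(ProtoEvent evt) {
        return new UnifiedEvent(evt.userId(), evt.eventTimeMillis(), evt.amount());
    }

    public static void main(String[] args) {
        UnifiedEvent a = fromCsv(new CsvRow("u1", 1_688_400_000_000L, 9.99));
        UnifiedEvent b = fromProto(new ProtoEvent("u1", 1_688_400_000_000L, 9.99, "ignored"));
        // Both sources now emit the same type, so a HybridSource<UnifiedEvent>
        // can chain them; records compare field by field.
        System.out.println(a.equals(b)); // prints "true"
    }
}
```

With both adapters in place, the sources can be chained via Flink's builder, roughly `HybridSource.builder(fileSource).addSource(kafkaSource).build()`, with no change to either wire format — only the in-job record type is shared.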