Hi,

Here is our scenario: we have a system that generates data in a jsonl file for all customers together. We now need to process this jsonl data and conditionally distribute it to individual customers, based on their preferences, as Iceberg tables. So for every line in the jsonl file, the data will end up in one of the customers' S3 buckets as a row in an Iceberg table. We were hoping to keep using Flink for this use case with just one job doing a conditional sink, but we are not sure whether that would be the right usage of Flink.
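To make the branching concrete, here is a minimal sketch of the per-line routing decision we have in mind, in plain Java without Flink. The `customer_id` field name and the naive string extraction are illustrative assumptions; a real job would use a proper JSON parser and route each record to the matching customer's Iceberg sink.

```java
import java.util.*;

public class FanOutSketch {
    // Decide which customer a jsonl line belongs to.
    // Assumption: every line carries a "customer_id" string field.
    static String routeLine(String jsonLine) {
        String key = "\"customer_id\":\"";
        int start = jsonLine.indexOf(key) + key.length();
        int end = jsonLine.indexOf('"', start);
        return jsonLine.substring(start, end);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "{\"customer_id\":\"acme\",\"value\":1}",
            "{\"customer_id\":\"globex\",\"value\":2}");
        // Group lines per customer; in the real job each group would go
        // to that customer's Iceberg table in their S3 bucket.
        Map<String, List<String>> byCustomer = new TreeMap<>();
        for (String line : lines) {
            byCustomer.computeIfAbsent(routeLine(line), k -> new ArrayList<>())
                      .add(line);
        }
        System.out.println(byCustomer.keySet()); // prints [acme, globex]
    }
}
```

In Flink terms, this decision would sit in a `ProcessFunction` that emits each record to a per-customer side output (or a keyed stream), with one sink per customer attached downstream.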
Thanks,
Shree
________________________________
From: Fabian Paul <fp...@apache.org>
Sent: Monday, November 29, 2021 1:57 AM
To: SHREEKANT ANKALA <ask...@hotmail.com>
Cc: user@flink.apache.org <user@flink.apache.org>
Subject: Re: How to Fan Out to 100s of Sinks

Hi,

What do you mean by "fan out" to 100 different sinks? Do you want to replicate the data in all buckets, or is there some conditional branching logic? In general, Flink can easily support 100 different sinks, but I am not sure if this is the right approach for your use case. Can you clarify your motivation and tell us a bit more about the exact scenario?

Best,
Fabian

On Mon, Nov 29, 2021 at 1:11 AM SHREEKANT ANKALA <ask...@hotmail.com> wrote:
>
> Hi all, we currently have a Flink job that retrieves jsonl data from GCS and
> writes to Iceberg tables. We are using Flink 1.13.2 and things are working fine.
>
> We now have to fan out that same data into 100 different sinks - Iceberg
> tables on S3. There will be 100 buckets, and the data needs to be sent to each
> of these 100 different buckets.
>
> We are planning to add a new job that will write to one sink at a time for each
> time it is launched. Is there any other, more optimal approach possible in Flink to
> support this use case of 100 different sinks?