Re: Flink Stream job to parquet sink

2020-06-29 Thread aj
Thanks, Arvid. I will try to implement using the broadcast approach. On Fri, Jun 26, 2020 at 1:10 AM Arvid Heise wrote: > Hi Anuj, > > Yes, broadcast sounds really good. Now you just need to hide the > structural invariance (variable number of sinks) by delegating to inner > sinks. > > public cl

Re: Flink Stream job to parquet sink

2020-06-25 Thread Arvid Heise
Hi Anuj, Yes, broadcast sounds really good. Now you just need to hide the structural invariance (variable number of sinks) by delegating to inner sinks. public class SplittingStreamingFileSink extends RichSinkFunction implements CheckpointedFunction, CheckpointListener { Map si

Re: Flink Stream job to parquet sink

2020-06-25 Thread aj
Thanks, Arvide for detailed answers. - Have some kind submitter that restarts flink automatically on config change (assumes that restart time is not the issue). Yes, that can be written but that not solve the problem completely because I want to avoid job restart itself. Every time I restart I al

Re: Flink Stream job to parquet sink

2020-06-25 Thread Rafi Aroch
Hi Arvid, Would it be possible to implement a BucketAssigner that for example loads the configuration periodically from an external source and according to the event type decides on a different sub-folder? Thanks, Rafi On Wed, Jun 24, 2020 at 10:53 PM Arvid Heise wrote: > Hi Anuj, > > There i

Re: Flink Stream job to parquet sink

2020-06-24 Thread Arvid Heise
Hi Anuj, There is currently no way to dynamically change the topology. It would be good to know why your current approach is not working (restart taking too long? Too frequent changes?) So some ideas: - Have some kind submitter that restarts flink automatically on config change (assumes that rest

Re: Flink Stream job to parquet sink

2020-06-22 Thread aj
I am stuck on this . Please give some suggestions. On Tue, Jun 9, 2020, 21:40 aj wrote: > please help with this. Any suggestions. > > On Sat, Jun 6, 2020 at 12:20 PM aj wrote: > >> Hello All, >> >> I am receiving a set of events in Avro format on different topics. I want >> to consume these and

Re: Flink Stream job to parquet sink

2020-06-09 Thread aj
please help with this. Any suggestions. On Sat, Jun 6, 2020 at 12:20 PM aj wrote: > Hello All, > > I am receiving a set of events in Avro format on different topics. I want > to consume these and write to s3 in parquet format. > I have written a below job that creates a different stream for each

Flink Stream job to parquet sink

2020-06-05 Thread aj
Hello All, I am receiving a set of events in Avro format on different topics. I want to consume these and write to s3 in parquet format. I have written a below job that creates a different stream for each event and fetches it schema from the confluent schema registry to create a parquet sink for a