Thanks Vino. I am able to write data in Parquet now. But now the issue is how to write a dataset to multiple output paths based on a timestamp partition. I want to partition the data date-wise.
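As a hedged sketch (the `dt=` bucket layout and the helper name are assumptions, not from this thread), partitioning date-wise first needs a daily partition path derived from each event's epoch-millis timestamp, e.g. `<outputDirectory>/dt=2019-12-23`:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class PartitionPath {
    // Formats an epoch-millis timestamp as a Hive-style daily partition
    // directory, e.g. "dt=2019-12-23". UTC is assumed here; pick the zone
    // your hourly S3 buckets are written in.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    static String dailyPartition(String outputDirectory, long epochMillis) {
        return outputDirectory + "/dt=" + DAY.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // 2019-12-23T12:59:00Z in epoch millis.
        long ts = 1577105940000L;
        System.out.println(dailyPartition("s3://events", ts));
        // prints: s3://events/dt=2019-12-23
    }
}
```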
I am currently writing like this, which writes to a single output path:

    DataSet<Tuple2<Void, GenericRecord>> df =
        allEvents.flatMap(new EventMapProcessor(schema.toString()))
                 .withParameters(configuration);

    Job job = Job.getInstance();
    AvroParquetOutputFormat.setSchema(job, book_bike.getClassSchema());
    HadoopOutputFormat<Void, GenericRecord> parquetFormat =
        new HadoopOutputFormat<>(new AvroParquetOutputFormat(), job);
    FileOutputFormat.setOutputPath(job, new Path(outputDirectory));

    df.output(parquetFormat);
    env.execute();

Please suggest.

Thanks,
Anuj

On Mon, Dec 23, 2019 at 12:59 PM vino yang <yanghua1...@gmail.com> wrote:

> Hi Anuj,
>
> After searching on GitHub, I found a demo repository about how to use
> Parquet with Flink. [1]
>
> You can have a look. I cannot be sure whether it is helpful or not.
>
> [1]: https://github.com/FelixNeutatz/parquet-flinktacular
>
> Best,
> Vino
>
> aj <ajainje...@gmail.com> wrote on Sat, Dec 21, 2019 at 7:03 PM:
>
>> Hello All,
>>
>> I am getting a set of events in JSON that I am dumping into an hourly
>> bucket in S3.
>> I am reading this hourly bucket and have created a DataSet<String>.
>>
>> I want to write this dataset as Parquet, but I am not able to figure
>> it out. Can somebody help me with this?
>>
>> Thanks,
>> Anuj
>>
>> <http://www.cse.iitm.ac.in/%7Eanujjain/>
>>

--
Thanks & Regards,
Anuj Jain
Mob.: +91-8588817877
Skype: anuj.jain07
<http://www.oracle.com/>
<http://www.cse.iitm.ac.in/%7Eanujjain/>
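One hedged workaround in the DataSet API (a sketch, not verified against this exact setup) is to first collect the distinct dates, then for each date `filter` the dataset and attach a separate `HadoopOutputFormat` sink whose `FileOutputFormat.setOutputPath` points at that date's directory, before a single `env.execute()`. The splitting logic itself can be illustrated with plain Java collections; the `Event` record and its `date` key are hypothetical stand-ins for whatever the Flink job would derive from the timestamp field:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitByDate {
    // Hypothetical event type carrying a pre-computed date key.
    record Event(String date, String payload) {}

    // Groups events into one bucket per date. Each map entry corresponds
    // to one per-date filter + sink pair in the Flink job, i.e. one
    // output path such as outputDirectory + "/dt=" + date.
    static Map<String, List<Event>> splitByDate(List<Event> events) {
        Map<String, List<Event>> buckets = new LinkedHashMap<>();
        for (Event e : events) {
            buckets.computeIfAbsent(e.date(), d -> new ArrayList<>()).add(e);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("2019-12-21", "a"),
                new Event("2019-12-22", "b"),
                new Event("2019-12-21", "c"));
        splitByDate(events).forEach((date, bucket) ->
                System.out.println("dt=" + date + " -> " + bucket.size() + " events"));
    }
}
```

Note that collecting the distinct dates up front (e.g. via `collect()`) triggers an extra Flink job before the main write, so this pattern trades an additional pass over the data for per-date output paths.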