cc'ing a few folks who are interested in this discussion.

On Tue, Apr 9, 2019 at 11:35 AM Kailash Dayanand <kdayan...@lyft.com> wrote:
> Hello,
>
> I am looking to contribute ProtoParquetWriter support that can be used as a
> bulk format for the StreamingFileSink API. There have been earlier
> discussions on this in the user mailing list (https://goo.gl/ya2StL), and I
> thought it would be a good addition to have.
>
> For the implementation, looking at the current API of ProtoParquetWriter in
> the parquet project (http://tinyurl.com/y378be42), it looks like there are
> some differences in the interface between the Avro and Proto writers:
> ProtoParquetWriter has no builder class and does not interface with
> OutputFile. Because of this, I was looking at directly extending
> ParquetWriter within Flink to define a static Builder class and the newer
> interfaces. This is needed because the bulk writer takes a builder to create
> the ParquetWriter in BulkWriter.Factory (http://tinyurl.com/yyg9cn9b).
>
> Any thoughts on whether this is a reasonable approach?
>
> Thanks,
> Kailash
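
For anyone joining the thread, here is a rough sketch of the shape of the approach described in the quoted message: a static, self-typed Builder that extends the writer's abstract builder, plus a small factory hook that creates a writer per output file. To keep it dependency-free, every type below (OutputFile, WriteSupport, Writer, WriterBuilder, ProtoWriterBuilder, forType) is a simplified, hypothetical stand-in and not the real Parquet or Flink class; the encoding logic is a placeholder, not protobuf serialization.

```java
// Simplified stand-ins only: none of these are the real Parquet or Flink classes.
final class ProtoWriterSketch {

    /** Stand-in for an output file: an in-memory sink. */
    static final class OutputFile {
        final StringBuilder sink = new StringBuilder();
    }

    /** Stand-in for a write support: converts one record to its stored form. */
    interface WriteSupport<T> {
        String encode(T record);
    }

    /** Stand-in for the writer, with an abstract self-typed Builder. */
    static class Writer<T> {
        private final OutputFile out;
        private final WriteSupport<T> support;

        Writer(OutputFile out, WriteSupport<T> support) {
            this.out = out;
            this.support = support;
        }

        void write(T record) {
            out.sink.append(support.encode(record)).append('\n');
        }

        abstract static class Builder<T, SELF extends Builder<T, SELF>> {
            private final OutputFile file;

            protected Builder(OutputFile file) {
                this.file = file;
            }

            // Lets fluent setters in a base builder return the concrete subtype.
            protected abstract SELF self();

            // Each concrete builder decides how records are encoded.
            protected abstract WriteSupport<T> getWriteSupport();

            Writer<T> build() {
                return new Writer<>(file, getWriteSupport());
            }
        }
    }

    /** The piece the proposal adds: a concrete builder for one record type. */
    static final class ProtoWriterBuilder<T>
            extends Writer.Builder<T, ProtoWriterBuilder<T>> {
        private final Class<T> type;

        ProtoWriterBuilder(OutputFile file, Class<T> type) {
            super(file);
            this.type = type;
        }

        @Override
        protected ProtoWriterBuilder<T> self() {
            return this;
        }

        @Override
        protected WriteSupport<T> getWriteSupport() {
            // Placeholder encoding; a real version would serialize the message.
            return record -> type.getSimpleName() + ":" + record;
        }
    }

    /** Stand-in for the hook a bulk writer factory calls per output file. */
    interface WriterBuilder<T> {
        Writer<T> createWriter(OutputFile out);
    }

    static <T> WriterBuilder<T> forType(Class<T> type) {
        return out -> new ProtoWriterBuilder<>(out, type).build();
    }
}
```

With these stand-ins, ProtoWriterSketch.forType(String.class).createWriter(out) returns a writer whose write(...) calls append encoded records to out.sink. The point of the design is that the factory only needs a function from an output file to a writer, so the record-type-specific logic stays inside the builder subclass.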