Re: Compressing files with the Bucketing Sink

2018-03-29 Thread lrao
Thanks a lot for the suggestion, Till! I ended up following it: I extended StreamWriterBase and wrapped the FSDataOutputStream with a GZIPOutputStream.

On 2018/03/28 09:44:26, Till Rohrmann wrote:
> Hi,
>
> the SequenceFileWriter and the AvroKeyValueSinkWriter both support
> co
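For anyone finding this thread later, here is a rough sketch of the wrapping idea described above. It is not the code from this thread: GzipLineWriter is a hypothetical, JDK-only stand-in that takes a plain java.io.OutputStream, so it runs without Flink on the classpath. In an actual Writer, the raw stream would presumably be the FSDataOutputStream held by the StreamWriterBase subclass, and the write/flush/close methods would go in that subclass. The newline-delimited UTF-8 record format is also just an assumption for illustration.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for a custom Writer; not part of Flink.
class GzipLineWriter implements Closeable {
    private final GZIPOutputStream gzip;

    GzipLineWriter(OutputStream rawStream) throws IOException {
        // Wrap the raw stream once, when the part file is opened.
        // syncFlush = true makes flush() push pending compressed bytes downstream.
        this.gzip = new GZIPOutputStream(rawStream, true);
    }

    // Assumed record format: one UTF-8 line per record.
    void write(String record) throws IOException {
        gzip.write(record.getBytes(StandardCharsets.UTF_8));
        gzip.write('\n');
    }

    void flush() throws IOException {
        gzip.flush();
    }

    @Override
    public void close() throws IOException {
        // Finishes the gzip stream (writes the trailer), then closes the raw stream.
        gzip.close();
    }
}
```

One thing to keep in mind with this approach: the gzip trailer is only written when the stream is finished or closed, so a part file that was not closed cleanly may not decompress completely.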

Re: Compressing files with the Bucketing Sink

2018-03-28 Thread Till Rohrmann
Hi, the SequenceFileWriter and the AvroKeyValueSinkWriter both support compressed outputs. Apart from that, I'm not aware of any other Writers which support compression. Maybe you could use these two Writers as a guiding example. Alternatively, you could try to extend the StreamWriterBase and wrap

Compressing files with the Bucketing Sink

2018-03-27 Thread lrao
I want to upload a compressed file (gzip preferably) using the Bucketing Sink. What is the best way to do this? Would I have to implement my own Writer that does the compression? Has anyone done something similar?