That looks good. Thanks!

On Fri, Aug 19, 2016 at 6:15 AM Robert Metzger <rmetz...@apache.org> wrote:
> Hi Wes,
>
> Flink's own OutputFormats don't support compression, but we have some
> tools to use Hadoop's OutputFormats with Flink [1], and those support
> compression:
> https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/FileOutputFormat.html
>
> Let me know if you need more information.
>
> Regards,
> Robert
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/hadoop_compatibility.html
>
> On Thu, Aug 18, 2016 at 2:13 AM, Wesley Kerr <wesley.n.k...@gmail.com> wrote:
>
>> Hello -
>>
>> Forgive me if this has been asked before, but I'm trying to determine the
>> best way to add compression to DataSink Outputs (starting with
>> TextOutputFormat). Realistically I would like each partition file
>> (based on parallelism) to be compressed independently with gzip, but am
>> open to other solutions.
>>
>> My first thought was to extend TextOutputFormat with a new class that
>> compresses after closing and before returning, but I'm not sure that would
>> work across all possible file systems (S3, Local, and HDFS).
>>
>> Any thoughts?
>>
>> Thanks!
>>
>> Wes
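A minimal sketch of the Hadoop-compatibility approach Robert points to, assuming the Flink Hadoop compatibility tools and the Hadoop client libraries are on the classpath; the class name, the input DataSet built from fromElements, and the /tmp/gzip-output path are placeholders, not part of the original thread:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class GzipOutputExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Placeholder input; any DataSet<String> would do here.
        DataSet<String> lines = env.fromElements("first line", "second line");

        // Wrap Hadoop's mapreduce TextOutputFormat in Flink's HadoopOutputFormat.
        Job job = Job.getInstance();
        HadoopOutputFormat<Text, NullWritable> hadoopOF =
                new HadoopOutputFormat<>(new TextOutputFormat<Text, NullWritable>(), job);

        // Compression is configured on the Hadoop FileOutputFormat; each
        // parallel task then writes its own independently gzipped part file.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
        FileOutputFormat.setOutputPath(job, new Path("/tmp/gzip-output")); // placeholder path

        // Hadoop OutputFormats expect (key, value) pairs, so wrap each String;
        // with a NullWritable value, TextOutputFormat writes only the key.
        lines
            .map(new MapFunction<String, Tuple2<Text, NullWritable>>() {
                @Override
                public Tuple2<Text, NullWritable> map(String value) {
                    return new Tuple2<>(new Text(value), NullWritable.get());
                }
            })
            .output(hadoopOF);

        env.execute("Gzip-compressed TextOutputFormat example");
    }
}

Because the compression codec is applied as each record is written, this avoids the compress-after-close step the original question worried about, and it behaves the same on local, HDFS, and S3 file systems that Hadoop can address.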