[ https://issues.apache.org/jira/browse/FLINK-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941736#comment-15941736 ]
Greg Hogan commented on FLINK-6185:
-----------------------------------

There is some [support for this already|https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/index.html#read-compressed-files]. I would expect output compression to be similar to reading compressed input (see {{InflaterInputStreamFactory}}) and parallelism is not an issue.

Is this a feature you would like to work on?

> Input readers and output writers/formats need to support gzip
> -------------------------------------------------------------
>
>                 Key: FLINK-6185
>                 URL: https://issues.apache.org/jira/browse/FLINK-6185
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.0
>            Reporter: Luke Hutchison
>            Priority: Minor
>
> File sources (such as {{ExecutionEnvironment#readCsvFile()}}) and sinks (such
> as {{FileOutputFormat}} and its subclasses, and methods such as
> {{DataSet#writeAsText()}}) need the ability to transparently decompress and
> compress files. Primarily gzip would be useful, but it would be nice if this
> were pluggable to support bzip2, xz, etc.
> There could be options for autodetect (based on file extension and/or file
> content), which could be the default, as well as no compression or a selected
> compression method.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
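
For illustration only, here is a minimal sketch of the three options requested in the issue (autodetect by file extension, no compression, or a selected method), applied symmetrically to reading and writing. It uses only {{java.util.zip}} from the JDK and is not Flink code: the class {{ExtensionCodecSketch}}, the enum {{Compression}}, and the method names are hypothetical, and it does not use {{InflaterInputStreamFactory}}. Autodetection here looks only at a {{.gz}} suffix, not at file content.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not part of Flink): wraps file streams in gzip when requested. */
public class ExtensionCodecSketch {

    /** Hypothetical option set mirroring the proposal in the issue description. */
    public enum Compression { AUTODETECT, NONE, GZIP }

    /** Decides whether gzip applies; AUTODETECT checks only for a ".gz" suffix. */
    public static boolean useGzip(Path file, Compression mode) {
        switch (mode) {
            case GZIP:
                return true;
            case NONE:
                return false;
            default:
                return file.getFileName().toString().endsWith(".gz");
        }
    }

    /** Opens a file for reading, decompressing transparently if gzip applies. */
    public static InputStream openForRead(Path file, Compression mode) throws IOException {
        InputStream raw = Files.newInputStream(file);
        return useGzip(file, mode) ? new GZIPInputStream(raw) : raw;
    }

    /** Opens a file for writing, compressing transparently if gzip applies. */
    public static OutputStream openForWrite(Path file, Compression mode) throws IOException {
        OutputStream raw = Files.newOutputStream(file);
        return useGzip(file, mode) ? new GZIPOutputStream(raw) : raw;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("codec-sketch");
        Path gz = dir.resolve("part-0.txt.gz");

        // The ".gz" suffix triggers compression on write...
        try (OutputStream out = openForWrite(gz, Compression.AUTODETECT)) {
            out.write("hello gzip\n".getBytes(StandardCharsets.UTF_8));
        }
        // ...and decompression on read, with no change to the calling code.
        try (InputStream in = openForRead(gz, Compression.AUTODETECT)) {
            System.out.print(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

A pluggable version for bzip2, xz, etc. could replace the boolean decision with a lookup from file extension to a stream factory, which is roughly the role the comment above attributes to {{InflaterInputStreamFactory}} on the input side.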