Github user sekruse commented on a diff in the pull request:

    https://github.com/apache/flink/pull/762#discussion_r31562256

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java ---
    @@ -628,9 +692,10 @@ public void open(FileInputSplit fileSplit) throws IOException {
     	 * @see org.apache.flink.api.common.io.InputStreamFSInputWrapper
     	 */
     	protected FSDataInputStream decorateInputStream(FSDataInputStream inputStream, FileInputSplit fileSplit) throws Throwable {
    -		// Wrap stream in a extracting (decompressing) stream if file ends with .deflate.
    -		if (fileSplit.getPath().getName().endsWith(DEFLATE_SUFFIX)) {
    -			return new InflaterInputStreamFSInputWrapper(stream);
    +		// Wrap stream in a extracting (decompressing) stream if file ends with a known compression file extension.
    +		InflaterInputStreamFactory<?> inflaterInputStreamFactory = getInflaterInputStreamFactory(fileSplit.getPath());
    +		if (inflaterInputStreamFactory != null) {
    +			return new InputStreamFSInputWrapper(inflaterInputStreamFactory.create(stream));
    --- End diff --

    It might also be the case that the stream was not compressed at all. It would of course be nice to react appropriately to a missing codec, but how would we know whether the current input split belongs to an uncompressed file or to a compressed file with an unknown codec?
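
To make the ambiguity concrete, here is a minimal sketch of a lookup keyed on the file extension. It is not the code from this pull request: `CodecLookupSketch`, `StreamFactory`, `lookup`, and `decorate` are hypothetical names, and it assumes the factory is selected purely by the file name suffix, as the diff suggests. Only `java.util.zip` classes are used as codecs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

public class CodecLookupSketch {

    /** Hypothetical stand-in for the InflaterInputStreamFactory used in the diff. */
    public interface StreamFactory {
        InputStream create(InputStream in) throws IOException;
    }

    // Assumption: codecs are registered by file extension only.
    private static final Map<String, StreamFactory> FACTORIES = new HashMap<>();

    static {
        FACTORIES.put("deflate", InflaterInputStream::new);
        FACTORIES.put("gz", GZIPInputStream::new);
    }

    /** Returns the factory registered for the file's extension, or null if there is none. */
    public static StreamFactory lookup(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        return FACTORIES.get(fileName.substring(dot + 1));
    }

    /** Wraps the stream if a codec is known; otherwise passes it through unchanged. */
    public static InputStream decorate(InputStream in, String fileName) throws IOException {
        StreamFactory factory = lookup(fileName);
        return factory == null ? in : factory.create(in);
    }

    public static void main(String[] args) {
        // A known codec yields a factory.
        System.out.println("part.deflate known: " + (lookup("part.deflate") != null));
        // An uncompressed file and a compressed file with an unregistered codec
        // both yield null, so the caller cannot tell them apart.
        System.out.println("data.csv known: " + (lookup("data.csv") != null));
        System.out.println("data.xz known: " + (lookup("data.xz") != null));
    }
}
```

With a lookup like this, `null` means only "no factory registered for this suffix". Telling a plain file from one with an unsupported codec would need extra information, for example a separate list of extensions known to be compressed, which this sketch deliberately leaves out.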