[ https://issues.apache.org/jira/browse/FLINK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183002#comment-16183002 ]
ASF GitHub Bot commented on FLINK-5944:
---------------------------------------

Github user mlipkovich commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4683#discussion_r141424281

    --- Diff: flink-core/pom.xml ---
    @@ -52,6 +52,12 @@ under the License.
                <artifactId>flink-shaded-asm</artifactId>
            </dependency>

    +       <dependency>
    +           <groupId>org.apache.flink</groupId>
    +           <artifactId>flink-shaded-hadoop2</artifactId>
    +           <version>${project.version}</version>
    +       </dependency>
    --- End diff --

    Thanks for your comment, Aljoscha.
    There are at least three ways to achieve this: mark this dependency as 'provided', move the Hadoop Snappy codec related classes to the flink-java module, or move them to a separate module as @haohui suggested. For the last option, I'm not sure what should go inside that module.


> Flink should support reading Snappy Files
> -----------------------------------------
>
>                 Key: FLINK-5944
>                 URL: https://issues.apache.org/jira/browse/FLINK-5944
>             Project: Flink
>          Issue Type: New Feature
>          Components: Batch Connectors and Input/Output Formats
>            Reporter: Ilya Ganelin
>            Assignee: Mikhail Lipkovich
>              Labels: features
>
> Snappy is a widely used, extremely performant compression format that offers fast compression and decompression.
> Support can be implemented easily by creating a SnappyInflaterInputStreamFactory and registering it in initDefaultInflateInputStreamFactories in FileInputFormat.
> Flink already includes the Snappy dependency in the project.
> There is a minor gotcha here. If we wish to use this with Hadoop, we must provide two separate implementations, since Hadoop uses a different version of the Snappy format than Snappy Java (the xerial/snappy library included in Flink).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
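
For reference, the first option mentioned in the comment above (marking the dependency as 'provided') would look roughly like this in flink-core/pom.xml. This is only a sketch of the idea, not the change made in the pull request; the `<scope>` element and its comment are the only additions relative to the quoted diff, and whether this scope suits flink-core is an assumption to be confirmed by the reviewers.

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop2</artifactId>
    <version>${project.version}</version>
    <!-- Assumption: 'provided' keeps the dependency on the compile and test
         classpath, but Maven does not pass it on transitively to modules
         that depend on flink-core. -->
    <scope>provided</scope>
</dependency>
```

With this scope, the Hadoop classes would have to be supplied by the runtime environment, so any code path touching the Hadoop Snappy codec would need to cope with them being absent.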