Ruslan Dautkhanov created HADOOP-17231:
------------------------------------------

             Summary: empty getDefaultExtension() is ignored
                 Key: HADOOP-17231
                 URL: https://issues.apache.org/jira/browse/HADOOP-17231
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.1.3, 3.2.0
            Reporter: Ruslan Dautkhanov


Use case: the source files are gzip-compressed but have no file extension.

I attempted to auto-decompress them with a custom codec:
{code:java}
package com.my.codec.test

import org.apache.hadoop.io.compress.GzipCodec

// GzipCodec subclass that reports an empty default extension
class GZCodec extends GzipCodec {
  override def getDefaultExtension(): String = ""
}
{code}
(note the empty getDefaultExtension) and then set *io.compression.codecs* to com.my.codec.test.GZCodec. This has no effect.

Similar tests that use a one-character extension matching the end of the file names do work, so only the empty-string getDefaultExtension case is broken.

I guess the issue is somewhere in [https://github.com/c9n/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionCodecFactory.java#L109], but it's not obvious.

Others have built workarounds using custom readers, for example:
# [https://daynebatten.com/2015/11/override-hadoop-compression-codec-file-extension/]
# [https://stackoverflow.com/questions/52011697/how-to-read-a-compressed-gzip-file-without-extension-in-spark?rq=1]

Hopefully it would be an easy fix to support an empty getDefaultExtension?
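
For files that are gzip-compressed but carry no extension, one option that does not depend on the file name is to look at the first two bytes of the stream, which are 0x1f 0x8b for gzip data. The following is a minimal sketch in plain JDK Java, not Hadoop API and not the fix requested in this issue. The names GzipSniffer, maybeDecompress, gzip and readAll are hypothetical and exist only for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: picks a decompressor from the stream content, not the file name.
final class GzipSniffer {
    private GzipSniffer() {}

    /** Returns a decompressing stream if the input starts with the gzip magic bytes, else the input itself. */
    static InputStream maybeDecompress(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);               // remember the start so the peeked bytes can be replayed
        int b0 = in.read();
        int b1 = in.read();
        in.reset();               // rewind to the marked position
        boolean looksLikeGzip = b0 == 0x1f && b1 == 0x8b;
        return looksLikeGzip ? new GZIPInputStream(in) : in;
    }

    /** Test helper: gzip-compresses a UTF-8 string in memory. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Test helper: drains a stream into a UTF-8 string. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Compressed and uncompressed inputs go through the same call.
        InputStream compressed = new ByteArrayInputStream(gzip("hello"));
        InputStream plain = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(maybeDecompress(compressed)));
        System.out.println(readAll(maybeDecompress(plain)));
    }
}
```

In a Hadoop or Spark job, a check like this would have to live in a custom reader or input format, which is presumably why a working empty getDefaultExtension would be the simpler route.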