Igloo created HDFS-15445:
----------------------------

             Summary: ZStandardCodec compression may fail when encountering a specific file
                 Key: HDFS-15445
                 URL: https://issues.apache.org/jira/browse/HDFS-15445
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
    Affects Versions: 2.6.5
         Environment: zstd 1.3.3
hadoop 2.6.5
--- a/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
+++ b/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/zstd/TestZStandardCompressorDecompressor.java
@@ -62,10 +62,8 @@
   @BeforeClass
   public static void beforeClass() throws Exception {
     CONFIGURATION.setInt(IO_FILE_BUFFER_SIZE_KEY, 1024 * 64);
-    uncompressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt").toURI());
-    compressedFile = new File(TestZStandardCompressorDecompressor.class
-        .getResource("/zstd/test_file.txt.zst").toURI());
+    uncompressedFile = new File("/tmp/badcase.data");
+    compressedFile = new File("/tmp/badcase.data.zst");
            Reporter: Igloo
         Attachments: badcase.data, image-2020-06-30-11-35-46-859.png, image-2020-06-30-11-39-17-861.png


*Problem:*

In our production environment we write files to HDFS with the zstd compressor. Recently we found that a specific file can cause the ZStandard compressor to fail. The issue can be reproduced with the attached file badcase.data.

*Analysis*:

ZStandardCompressor uses a single bufferSize (taken from zstd's recommended compression output buffer size) for both inBufferSize and outBufferSize:

!image-2020-06-30-11-35-46-859.png|width=475,height=179!

However, zstd provides two separate recommendations, one for the input buffer size and one for the output buffer size:

!image-2020-06-30-11-39-17-861.png!

*Workaround*

One workaround is to use the recommended input and output buffer sizes provided by the zstd library:

input buffer size: 131072 (128 * 1024)
output buffer size: 131591
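
The two recommended sizes quoted in the workaround can be checked by hand. The sketch below is only an illustration: the class and method names are hypothetical, and the formulas are an assumption based on how recent zstd sources define the compress bound and the streaming output size (maximum block size of 128 KiB, plus a 3-byte block header and a 4-byte checksum). The exact margin term may differ between zstd versions, although it is zero for a full block, which is the only case needed here.

```java
// Hypothetical helper, not part of Hadoop or zstd: it only reproduces the
// arithmetic behind the recommended streaming buffer sizes.
public class ZstdBufferSizes {
    // Assumed value of zstd's maximum block size (128 KiB).
    static final long BLOCK_SIZE_MAX = 128L * 1024;
    // Assumed per-block header size and trailing checksum size, in bytes.
    static final long BLOCK_HEADER_SIZE = 3;
    static final long CHECKSUM_SIZE = 4;

    // Assumed worst-case compressed size for srcSize input bytes:
    // srcSize + srcSize/256, plus a small margin for inputs below one block.
    static long compressBound(long srcSize) {
        long margin = srcSize < BLOCK_SIZE_MAX
                ? (BLOCK_SIZE_MAX - srcSize) >> 11
                : 0;
        return srcSize + (srcSize >> 8) + margin;
    }

    // Recommended input buffer size: one full block.
    static long recommendedInSize() {
        return BLOCK_SIZE_MAX;
    }

    // Recommended output buffer size: room for one worst-case compressed
    // block plus its header and the frame checksum.
    static long recommendedOutSize() {
        return compressBound(BLOCK_SIZE_MAX) + BLOCK_HEADER_SIZE + CHECKSUM_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("input buffer size:  " + recommendedInSize());
        System.out.println("output buffer size: " + recommendedOutSize());
    }
}
```

Under these assumptions the program prints 131072 and 131591, matching the two values in the workaround. The output buffer is 519 bytes larger than the input buffer, so sizing both buffers from a single value means at least one of them does not match zstd's recommendation.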