[
https://issues.apache.org/jira/browse/AVRO-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674268#comment-16674268
]
ASF GitHub Bot commented on AVRO-2245:
--------------------------------------
jacobtolar commented on a change in pull request #351: [AVRO-2245] Improve java
tests for compression codecs
URL: https://github.com/apache/avro/pull/351#discussion_r230573786
##########
File path: lang/java/avro/src/test/java/org/apache/avro/file/TestAllCodecs.java
##########
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.avro.file;
+
+import org.junit.Assert;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collection;
+
+import static org.junit.Assert.assertTrue;
+
+@RunWith(Parameterized.class)
+public class TestAllCodecs {
+
+ @Parameterized.Parameters(name = "{index}: codec={0}")
+ public static Collection<Object[]> data() {
+ return Arrays.asList(new Object[][] {
+ { "bzip2", BZip2Codec.class },
+ { "zstandard", ZstandardCodec.class },
+ { "null", NullCodec.class },
+ { "xz", XZCodec.class },
+ { "snappy", SnappyCodec.class },
+ { "deflate", DeflateCodec.class },
+ });
+ }
+
+ @Parameterized.Parameter(0)
+ public String codec;
+
+ @Parameterized.Parameter(1)
+ public Class<? extends Codec> codecClass;
+
+
+ @Test
+ public void testCodec() throws IOException {
+ int inputSize = 500_000;
+
+ byte[] input = generateTestData(inputSize);
+
+ Codec codecInstance = CodecFactory.fromString(codec).createInstance();
+ assertTrue(codecClass.isInstance(codecInstance));
+ assertTrue(codecInstance.getName().equals(codec));
+
+ ByteBuffer inputByteBuffer = ByteBuffer.wrap(input);
Review comment:
Yes, that's true. Although I'd argue that this test wasn't really
successfully testing *anything* before. (The test doesn't even clearly verify
AVRO-1326, although it sort of does...it would be much more straightforward to
allocate a larger buffer and call `rewind` and `limit`. Instead it's actually
compressing the second (empty) half of the ByteBuffer. If we allocated
`inputSize * 2 + 1` the `assert(inputSize == outputSize)` check would fail.) I
honestly can't tell if that assertion passes out of lucky coincidence or if the
test was written to be confusing. I think the first. :)
Anyways: there's actually still a couple little bugs in that code similar to
what AVRO-1326 fixes. (With how the compression code is called currently, I
don't think these bugs would ever get exercised, but still...they should
probably get fixed). There's another PR here:
https://github.com/apache/avro/pull/352 to fix those.
As part of that PR I was going to add another test in `TestAllCodecs` to
verify the changes. Without the changes in #352 the test fails, so, not adding
it here; I'll add it there once this is merged. (Basically, this tests that we
do the right thing for both compression and decompression, even if we start out
in somewhere in the middle of the input bytebuffer).
```java
@Test
public void testCodecSlice() throws IOException {
int inputSize = 500_000;
byte[] input = generateTestData(inputSize);
Codec codecInstance = CodecFactory.fromString(codec).createInstance();
ByteBuffer partialBuffer = ByteBuffer.wrap(input);
partialBuffer.position(17);
ByteBuffer inputByteBuffer = partialBuffer.slice();
ByteBuffer compressedBuffer = codecInstance.compress(inputByteBuffer);
int compressedSize = compressedBuffer.remaining();
// Make sure something returned
assertTrue(compressedSize > 0);
// Create a slice from the compressed buffer
ByteBuffer sliceBuffer = ByteBuffer.allocate(compressedSize + 100);
sliceBuffer.position(50);
sliceBuffer.put(compressedBuffer);
sliceBuffer.limit(compressedSize + 50);
sliceBuffer.position(50);
// Decompress the data
ByteBuffer decompressedBuffer =
codecInstance.decompress(sliceBuffer.slice());
// Validate the the input and output are equal.
inputByteBuffer.rewind();
Assert.assertEquals(decompressedBuffer, inputByteBuffer);
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Java codec testing improvements
> -------------------------------
>
> Key: AVRO-2245
> URL: https://issues.apache.org/jira/browse/AVRO-2245
> Project: Avro
> Issue Type: Improvement
> Components: java
> Reporter: Jacob Tolar
> Priority: Minor
>
> In the Avro Java implementation, TestBZip2Codec and TestZstandardCodec are
> both laughably wrong.
> For example, the last lines of TestBZip2Codec:
> {code:java}
> byte[] outputByteArray = decompressedBuffer.array();
> for (int i = 0; i < inputByteSize; i++) {
> inputByteArray[i] = outputByteArray[i];
> }
> {code}
> There should be an assertEquals in there not an assignment statement. (And if
> you put assertEquals there, the test actually fails...).
> I will send a PR that replaces these tests with a correct parametrized test
> for all codecs.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)