divijvaidya commented on code in PR #13135:
URL: https://github.com/apache/kafka/pull/13135#discussion_r1084097512

##########
clients/src/main/java/org/apache/kafka/common/compress/ZstdFactory.java:
##########
@@ -62,10 +68,11 @@ public void release(ByteBuffer buffer) {
                 }
             };

-            // Set output buffer (uncompressed) to 16 KB (none by default) to ensure reasonable performance
-            // in cases where the caller reads a small number of bytes (potentially a single byte).
-            return new BufferedInputStream(new ZstdInputStreamNoFinalizer(new ByteBufferInputStream(buffer),
-                bufferPool), 16 * 1024);
+            // We do not use an intermediate buffer to store the decompressed data as a result of JNI read() calls using
+            // `ZstdInputStreamNoFinalizer` here. Every read() call to `DataInputStream` will be a JNI call and the
+            // caller is expected to balance the tradeoff between reading large amount of data vs. making multiple JNI
+            // calls.
+            return new DataInputStream(new ZstdInputStreamNoFinalizer(new ByteBufferInputStream(buffer), bufferPool));

Review Comment:
   2. For the broker, the number of JNI calls remains the same: prior to this change we were making JNI calls in 16 KB chunks (via BufferedInputStream), and after it we are still making JNI calls in 16 KB chunks, now determined by the decompression buffer size.

      For the consumer, the number of JNI calls *will change*. Earlier, the consumer made multiple calls in 16 KB chunks (via BufferedInputStream); now it makes one call to read the entire data. Note that the consumer does not use the "skipKeyValueIterator" variation.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
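The tradeoff discussed above can be made concrete with a small, self-contained sketch (hypothetical illustration, not Kafka code): a counting `InputStream` stands in for the JNI-backed `ZstdInputStreamNoFinalizer`, so we can observe how many times the underlying stream is hit when a caller reads byte-by-byte with and without a 16 KB `BufferedInputStream` in front. The class and method names here (`ReadChunking`, `countCalls`) are invented for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadChunking {

    // Counts calls to the underlying stream; each call stands in for one JNI call.
    static class CountingInputStream extends InputStream {
        private final InputStream in;
        int calls = 0;

        CountingInputStream(InputStream in) { this.in = in; }

        @Override public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Returns {bufferedCalls, directCalls} for a caller that reads one byte at a time.
    static int[] countCalls() throws IOException {
        byte[] data = new byte[64 * 1024];

        // Buffered: single-byte reads are served from a 16 KB buffer, so the
        // underlying stream is hit once per 16 KB chunk (plus one EOF probe).
        CountingInputStream buffered = new CountingInputStream(new ByteArrayInputStream(data));
        try (InputStream in = new BufferedInputStream(buffered, 16 * 1024)) {
            while (in.read() != -1) { /* consume */ }
        }

        // Unbuffered: every single-byte read() goes straight to the underlying
        // stream, i.e. one "JNI call" per byte in the analogy.
        CountingInputStream direct = new CountingInputStream(new ByteArrayInputStream(data));
        while (direct.read() != -1) { /* consume */ }

        return new int[] { buffered.calls, direct.calls };
    }

    public static void main(String[] args) throws IOException {
        int[] counts = countCalls();
        System.out.println("buffered underlying calls: " + counts[0]); // 4 fills + 1 EOF probe
        System.out.println("direct underlying calls:   " + counts[1]); // one per byte + EOF
    }
}
```

This is why removing the `BufferedInputStream` is safe only if callers read in large chunks themselves: the broker's decompression path already reads 16 KB at a time, while a byte-at-a-time caller would pay the full per-call cost.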
########## clients/src/main/java/org/apache/kafka/common/compress/ZstdFactory.java: ########## @@ -62,10 +68,11 @@ public void release(ByteBuffer buffer) { } }; - // Set output buffer (uncompressed) to 16 KB (none by default) to ensure reasonable performance - // in cases where the caller reads a small number of bytes (potentially a single byte). - return new BufferedInputStream(new ZstdInputStreamNoFinalizer(new ByteBufferInputStream(buffer), - bufferPool), 16 * 1024); + // We do not use an intermediate buffer to store the decompressed data as a result of JNI read() calls using + // `ZstdInputStreamNoFinalizer` here. Every read() call to `DataInputStream` will be a JNI call and the + // caller is expected to balance the tradeoff between reading large amount of data vs. making multiple JNI + // calls. + return new DataInputStream(new ZstdInputStreamNoFinalizer(new ByteBufferInputStream(buffer), bufferPool)); Review Comment: 2. For broker, the number of JNI calls remain same because prior to this change, we were making JNI calls in chunks of 16KB (using BufferedInputStream) and now we are making JNI calls in chunks of 16KB based on decompression buffer size. For consumer, the number of JNI calls *will change*. Earlier, consumer was making multiple calls in chunks of 16KB (using BufferedInputStream) and now it is making one call to read the entire data. Note that consumer does not use "skipKeyValueIterator" variation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org