[GitHub] [kafka] cadonna commented on a change in pull request #10798: KAFKA-9168: Adding direct byte buffer support to rocksdb state store

GitBox Fri, 04 Jun 2021 01:13:03 -0700


cadonna commented on a change in pull request #10798:
URL: https://github.com/apache/kafka/pull/10798#discussion_r644552319




##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
       Yes, I had the same thought as @guozhangwang, but I am also not familiar 
with direct buffers.

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
       The 
[javadocs](https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html) 
say
   
   > The buffers returned by this method typically have somewhat higher 
allocation and deallocation costs than non-direct buffers.
   
   So allocating a direct buffer on every put might decrease performance. The 
internal benchmarks we ran on this PR did not show an increase in throughput, 
but rather a decrease. However, I do not think the decrease was significant, 
so it could also just be noise in the benchmark environment. Either way, 
continuously allocating direct buffers does not seem to be a good idea, as the 
javadocs also say:
   
   > It is therefore recommended that direct buffers be allocated primarily for 
large, long-lived buffers that are subject to the underlying system's native 
I/O operations. In general it is best to allocate direct buffers only when they 
yield a measureable gain in program performance. 
      
   Another thing to consider:
   
   > The contents of direct buffers may reside outside of the normal 
garbage-collected heap, and so their impact upon the memory footprint of an 
application might not be obvious.
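
   To illustrate the javadoc recommendation, here is a minimal sketch of 
reusing one long-lived direct buffer instead of allocating one per put. The 
class name `ReusableDirectBuffer` and its `fill` method are hypothetical and 
not part of this PR or of Kafka Streams. The sketch is not thread-safe, so it 
assumes one instance per accessing thread.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the PR: keeps one direct buffer alive
// and reuses it across calls instead of calling allocateDirect() per put.
public class ReusableDirectBuffer {
    private ByteBuffer buffer;

    public ReusableDirectBuffer(int initialCapacity) {
        this.buffer = ByteBuffer.allocateDirect(initialCapacity);
    }

    // Copies bytes into the reused buffer and returns it ready for reading.
    // A new direct buffer is allocated only when the payload does not fit.
    public ByteBuffer fill(byte[] bytes) {
        if (bytes.length > buffer.capacity()) {
            int newCapacity = Math.max(bytes.length, buffer.capacity() * 2);
            buffer = ByteBuffer.allocateDirect(newCapacity);
        }
        buffer.clear();     // position = 0, limit = capacity
        buffer.put(bytes);  // copy the payload into off-heap memory
        buffer.flip();      // limit = bytes.length, position = 0
        return buffer;
    }

    public static void main(String[] args) {
        ReusableDirectBuffer reusable = new ReusableDirectBuffer(64);
        ByteBuffer first = reusable.fill(new byte[] {1, 2, 3});
        ByteBuffer second = reusable.fill(new byte[] {4, 5});
        // Both calls hand back the same underlying direct buffer.
        System.out.println(first == second);
    }
}
```

   The returned buffer is only valid until the next `fill` call, so the 
caller has to hand it to the native call before reusing the helper.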

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
       The only case I can think of that might have high concurrency on a 
RocksDB state store is interactive queries. Without interactive queries there 
is no concurrency on the state store, since it is only accessed by the stream 
thread to which the stateful task owning the state store is assigned.

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
       Yes, I think you should try to benchmark the 
putAll()/range/reverseRange/prefixSeek operations as you proposed with a simple 
Kafka Streams app. That would be great to better understand the potential of 
direct buffers for Kafka Streams. Maybe also experiment with different key and 
value sizes.

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
       Yes, I think you should try to experiment with the 
putAll()/range/reverseRange/prefixSeek operations as you proposed with a simple 
Kafka Streams app. That would be great to better understand the potential of 
direct buffers for Kafka Streams. Maybe also experiment with different key and 
value sizes. I am curious whether we will also get such improvements.
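
   As a rough starting point for trying different sizes, here is a sketch of 
a timing loop that compares allocating a direct buffer per call with reusing 
one. It does not touch RocksDB or Kafka Streams, so it only isolates the 
allocation cost; the class name, sizes, and iteration count are arbitrary 
assumptions, and a harness such as JMH would give more trustworthy numbers.

```java
import java.nio.ByteBuffer;

// Hypothetical timing sketch: measures buffer allocation cost only,
// not RocksDB put/range performance.
public class DirectBufferAllocationTiming {
    // Accumulated so the JIT cannot discard the loop bodies.
    static long sink;

    // Allocates a fresh direct buffer for every payload copy.
    static long timeAllocatePerCall(byte[] payload, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            ByteBuffer b = ByteBuffer.allocateDirect(payload.length);
            b.put(payload);
            b.flip();
            sink += b.remaining();
        }
        return System.nanoTime() - start;
    }

    // Allocates one direct buffer up front and reuses it for every copy.
    static long timeReuse(byte[] payload, int iterations) {
        ByteBuffer b = ByteBuffer.allocateDirect(payload.length);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            b.clear();
            b.put(payload);
            b.flip();
            sink += b.remaining();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 20_000;
        for (int size : new int[] {16, 256, 4096}) {
            byte[] payload = new byte[size];
            // Warm-up runs so the measured runs are more likely to be JIT-compiled.
            timeAllocatePerCall(payload, iterations);
            timeReuse(payload, iterations);
            long perCall = timeAllocatePerCall(payload, iterations);
            long reuse = timeReuse(payload, iterations);
            System.out.printf("size=%d allocatePerCall=%d ns reuse=%d ns%n",
                    size, perCall, reuse);
        }
    }
}
```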




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
