cadonna commented on a change in pull request #10798:
URL: https://github.com/apache/kafka/pull/10798#discussion_r644552319
##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
Yes, I had the same thought as @guozhangwang, but I am also not familiar with direct buffers.

##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
The [javadocs](https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html) say

> The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers.

So allocating a direct buffer on each put might decrease performance. The internal benchmarks we ran on this PR did not show any increase in throughput, but rather a decrease. However, I do not think the decrease was significant, so the cause could also just be a bad day for the benchmark environment. Either way, continuously allocating a direct buffer does not seem to be a good idea, as the javadocs also say:

> It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations. In general it is best to allocate direct buffers only when they yield a measureable gain in program performance.

Another thing to consider:

> The contents of direct buffers may reside outside of the normal garbage-collected heap, and so their impact upon the memory footprint of an application might not be obvious.
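
To illustrate the javadoc recommendation quoted above, here is a minimal sketch of reusing one long-lived direct buffer instead of calling `ByteBuffer.allocateDirect` on every put. The class `ReusableDirectBuffer` and its methods are hypothetical names invented for this illustration; they are not part of Kafka Streams, RocksDB, or this PR. The sketch assumes a single accessing thread and that the caller is done with the returned buffer before the next call to `fill`.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not code from the PR.
// Not thread-safe: assumes one instance per accessing thread.
final class ReusableDirectBuffer {
    private ByteBuffer buffer;

    ReusableDirectBuffer(int initialCapacity) {
        // One up-front allocation that is meant to live as long as the store.
        this.buffer = ByteBuffer.allocateDirect(initialCapacity);
    }

    // Copies bytes into the long-lived direct buffer and returns it ready for reading.
    // The returned buffer is overwritten by the next call to fill().
    ByteBuffer fill(byte[] bytes) {
        if (bytes.length > buffer.capacity()) {
            // Grow geometrically so a slowly increasing payload size
            // does not trigger a new direct allocation on every call.
            int newCapacity = Math.max(bytes.length, buffer.capacity() * 2);
            buffer = ByteBuffer.allocateDirect(newCapacity);
        }
        buffer.clear();    // position = 0, limit = capacity
        buffer.put(bytes); // copy the payload off-heap
        buffer.flip();     // limit = bytes.length, position = 0
        return buffer;
    }

    int capacity() {
        return buffer.capacity();
    }
}
```

Whether this actually helps would still have to be measured: the copy from the heap array into the direct buffer remains, and a buffer sized for the largest value seen stays allocated outside the garbage-collected heap.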
##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
The only case I can think of that might have high concurrency on a RocksDB state store is interactive queries. Without interactive queries there is no concurrency on the state stores, since only the stream thread that is assigned the stateful task owning the state store accesses it.

##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
Yes, I think you should try to benchmark the putAll()/range/reverseRange/prefixSeek operations, as you proposed, with a simple Kafka Streams app. That would be great to better understand the potential of direct buffers for Kafka Streams. Maybe also experiment with different key and value sizes.

##########
File path: streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBStore.java
##########
@@ -505,6 +506,14 @@ private void closeOpenIterators() {
         }
     }
 
+    private ByteBuffer createDirectByteBufferAndPut(byte[] bytes) {
+        ByteBuffer directBuffer = ByteBuffer.allocateDirect(bytes.length);

Review comment:
Yes, I think you should try to experiment with the putAll()/range/reverseRange/prefixSeek operations, as you proposed, with a simple Kafka Streams app. That would be great to better understand the potential of direct buffers for Kafka Streams. Maybe also experiment with different key and value sizes. I am curious whether we will also get such improvements.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org