cadonna commented on code in PR #18280: URL: https://github.com/apache/kafka/pull/18280#discussion_r1893902897
########## docs/streams/developer-guide/memory-mgmt.html: ##########

@@ -176,7 +177,7 @@
     BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
-    // These three options in combination will limit the memory used by RocksDB to the size passed to the block cache (TOTAL_OFF_HEAP_MEMORY)
+    // These three options in combination will limit the memory used by RocksDB for cache, indexes, and write buffers to the size passed to (TOTAL_OFF_HEAP_MEMORY)

Review Comment:
```suggestion
    // These three options in combination will limit the memory used by RocksDB for cache, indexes, filters, and write buffers to TOTAL_OFF_HEAP_MEMORY
```

########## docs/streams/developer-guide/memory-mgmt.html: ##########

@@ -165,7 +165,8 @@
 $ apt install -y libjemalloc-dev
 # set LD_PRELOAD before you start your Kafka Streams application
 $ export LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libjemalloc.so"</code></pre>
-      <p> As of 2.3.0 the memory usage across all instances can be bounded, limiting the total off-heap memory of your Kafka Streams application. To do so you must configure RocksDB to cache the index and filter blocks in the block cache, limit the memtable memory through a shared <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager">WriteBufferManager</a> and count its memory against the block cache, and then pass the same Cache object to each instance. See <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB">RocksDB Memory Usage</a> for details. An example RocksDBConfigSetter implementing this is shown below:</p>
+      <p> As of 2.3.0 the memory usage across all instances can be bounded, limiting the off-heap memory of your Kafka Streams application. To do so you must configure RocksDB to cache the index and filter blocks in the block cache, limit the memtable memory through a shared <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager">WriteBufferManager</a> and count its memory against the block cache, and then pass the same Cache object to each instance.
+      However, don't reserve more than 40-50% of native memory budget for the cache alone to begin with, as RocksDB needs memory for internal housekeeping, test your workload to find the optimal size. Smaller cache size does not necessary increase IO due to page cache kept by OS. See <a class="reference external" href="https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB">RocksDB Memory Usage</a> for details. An example RocksDBConfigSetter implementing this is shown below:</p>

Review Comment:
I would not be so imperative. What about something like:
```
Find a good limit by testing with your workload. For example, start by limiting the off-heap memory used by RocksDB to 40-50% of your memory budget after deducting the memory reserved for the JVM. Note that a smaller cache size does not necessarily increase I/O, since the OS maintains a page cache.
```
BTW, I checked our general-purpose Kafka Streams deployments and we use around 70% of the memory after deducting the memory reserved for the JVM. As far as I know, we rarely run into OOMs.
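For context, here is a minimal sketch of the kind of `RocksDBConfigSetter` the docs paragraph refers to, along the lines of the docs' `BoundedMemoryRocksDBConfig` example. The size constants are placeholder assumptions, not values from this PR; they should be tuned per the review discussion above.

```java
import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

  // Placeholder memory budgets -- find good limits by testing your workload.
  private static final long TOTAL_OFF_HEAP_MEMORY = 512L * 1024 * 1024;
  private static final long TOTAL_MEMTABLE_MEMORY = 128L * 1024 * 1024;

  // One Cache and one WriteBufferManager shared by all store instances, so the
  // limit applies across the whole application. The WriteBufferManager charges
  // memtable memory against the same cache.
  private static final Cache CACHE =
      new LRUCache(TOTAL_OFF_HEAP_MEMORY, -1, false, 0.1);
  private static final WriteBufferManager WRITE_BUFFER_MANAGER =
      new WriteBufferManager(TOTAL_MEMTABLE_MEMORY, CACHE);

  @Override
  public void setConfig(final String storeName, final Options options,
                        final Map<String, Object> configs) {
    final BlockBasedTableConfig tableConfig =
        (BlockBasedTableConfig) options.tableFormatConfig();

    // These three options in combination limit the memory used by RocksDB for
    // cache, indexes, filters, and write buffers to TOTAL_OFF_HEAP_MEMORY:
    options.setWriteBufferManager(WRITE_BUFFER_MANAGER);
    tableConfig.setCacheIndexAndFilterBlocks(true);
    tableConfig.setBlockCache(CACHE);

    options.setTableFormatConfig(tableConfig);
  }

  @Override
  public void close(final String storeName, final Options options) {
    // Intentionally do not close CACHE or WRITE_BUFFER_MANAGER here:
    // they are static and shared among all store instances.
  }
}
```

The class would be registered via `rocksdb.config.setter` (`StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG`) in the Streams configuration.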