Re: Impact of RocksDB backend on the Java heap

2024-02-19 Thread Alexis Sarda-Espinosa
Hi Zakelly, Yeah, that makes sense to me. I was just curious about whether reading could be a bottleneck or not, but I imagine user-specific logic would be better than a generic cache from Flink that might have a low hit rate. Thanks again, Alexis. On Mon, 19 Feb 2024, 07:29 Zakelly Lan wrote:

Re: Impact of RocksDB backend on the Java heap

2024-02-18 Thread Zakelly Lan
Hi Alexis, Assuming the bulk load for a batch of sequential keys performs better than accessing them one by one, the main question becomes whether we really need to access all the keys that were bulk-loaded into the cache. In other words, cache hit rate is the key issue. If the rate is high, even thou

Re: Impact of RocksDB backend on the Java heap

2024-02-18 Thread Alexis Sarda-Espinosa
Hi Zakelly, thanks for the information, that's interesting. Would you say that reading a subset from RocksDB is fast enough to be pretty much negligible, or could it be a bottleneck if the state of each key is "large"? Again assuming the number of distinct partition keys is large. Regards, Alexis
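Whether a cache in front of such reads would help when the number of distinct keys is large depends mostly on its hit rate. The following is a rough sketch, not Flink or RocksDB code: it simulates an LRU cache under uniformly random key accesses and reports the fraction of accesses served from the cache. The function name `lru_hit_rate`, the sizes, and the uniform access pattern are all assumptions made for illustration.

```python
import random
from collections import OrderedDict


def lru_hit_rate(num_keys, cache_size, num_accesses, seed=0):
    """Simulate uniform random key accesses against an LRU cache.

    Hypothetical helper for illustration only; returns hits / accesses.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # insertion order doubles as recency order
    hits = 0
    for _ in range(num_accesses):
        key = rng.randrange(num_keys)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / num_accesses


if __name__ == "__main__":
    # Many distinct keys, small cache: most accesses miss.
    print(lru_hit_rate(num_keys=1_000_000, cache_size=1_000, num_accesses=50_000))
    # Key space fits in the cache: almost every access hits after warm-up.
    print(lru_hit_rate(num_keys=500, cache_size=1_000, num_accesses=50_000))
```

Under these assumptions the hit rate is roughly the ratio of cache size to key count, so a generic cache only pays off when the access pattern is much more skewed than uniform or the working set is small.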

Re: Impact of RocksDB backend on the Java heap

2024-02-17 Thread Zakelly Lan
Hi Alexis, Flink does need some heap memory to bridge requests to RocksDB and gather the results. In most cases, the memory is discarded immediately (and eventually collected by GC). In the case of timers, Flink does cache a limited subset of key-values on the heap to improve performance. In general you don't

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Asimansu Bera
Hello Alexis, I don't think data in RocksDB resides in the JVM, even with function calls. For more details, check the link below: https://github.com/facebook/rocksdb/wiki/RocksDB-Overview#3-high-level-architecture RocksDB has three main components: memtable, sstfile and WAL (not used in Flink as Flin

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Alexis Sarda-Espinosa
Hi Asimansu, The memory RocksDB manages is outside the JVM, yes, but the mentioned subsets must be bridged to the JVM somehow so that the data can be exposed to the functions running inside Flink, no? Regards, Alexis. On Thu, 15 Feb 2024, 14:06 Asimansu Bera wrote:

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Asimansu Bera
Hello Alexis, RocksDB resides off-heap and outside of the JVM. The small subset of data ends up in off-heap memory. For more details, check the following link: https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/memory/mem_setup_tm/#managed-memory I hope this addres

Impact of RocksDB backend on the Java heap

2024-02-15 Thread Alexis Sarda-Espinosa
Hello, Most info regarding RocksDB memory for Flink focuses on what's needed independently of the JVM (although the Flink process configures its limits and so on). I'm wondering if there are additional special considerations with regard to the JVM heap in the following scenario. Assuming a key u