Sorry, resending with proper formatting. Yahoo Mail defaults to rich text, and 
that messes up the formatting on this mailing list.


Based on my (streaming mode) experiments, I see that it's not simply on-heap and 
off-heap memory. There are actually 3 divisions of memory:

1- On-heap (-Xmx).
2- Off-heap (DirectByteBuffer allocations: network buffers, Netty, JVM metadata, 
optionally the TM managed memory).
3- The container "cut off" part (0.3 in my example).

The cut-off ratio controls what is left over for 1 & 2. Thereafter, the 
other off-heap reservations dictate what is left over for on-heap.
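
To make the arithmetic concrete, here is a rough sketch of the split as I 
understand it. It is a simplified model, not Flink's actual calculation: the 
function name and parameters are made up for illustration, and the real logic 
may apply extra rules (for example a minimum cut-off) that are ignored here.

```python
def split_container_memory(container_mb, cutoff_ratio, off_heap_mb):
    """Simplified three-way split of a container's memory.

    container_mb : total memory requested for the container (assumed input)
    cutoff_ratio : fraction set aside as the container "cut off"
    off_heap_mb  : sum of the off-heap reservations (network buffers, etc.)
    """
    if not 0 <= cutoff_ratio < 1:
        raise ValueError("cutoff_ratio must be in [0, 1)")

    # 3: the cut-off is taken off the top of the container size.
    cutoff_mb = container_mb * cutoff_ratio

    # What remains is shared by 1 (on-heap) and 2 (off-heap).
    remaining_mb = container_mb - cutoff_mb

    # 1: on-heap gets whatever the off-heap reservations leave behind.
    heap_mb = remaining_mb - off_heap_mb
    if heap_mb <= 0:
        raise ValueError("off-heap reservations leave no room for the heap")

    return {"cutoff_mb": cutoff_mb, "off_heap_mb": off_heap_mb, "heap_mb": heap_mb}


if __name__ == "__main__":
    # Hypothetical numbers: 4000 MB container, 25% cut-off, 500 MB off-heap.
    print(split_container_memory(4000, 0.25, 500))
```

With a larger cut-off ratio (such as the 0.3 in my example) the heap shrinks 
accordingly, since the cut-off is subtracted before anything else.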

Obviously RocksDB memory is not on-heap. 
My intuition is that RocksDB memory might fall into the "cut off" section + 
off-heap. However, that depends on whether or not Flink + Netty fully 
preallocate whatever is reserved for off-heap memory before RocksDB spins up. 
If they do preallocate, then RocksDB native allocations will fall into 3 only.
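
The two cases can be written down as a small sketch. This only encodes my 
hypothesis, not measured behavior: the function and its parameters are 
hypothetical, and it ignores any other native allocations that might also land 
in the cut-off, so treat the result as an upper bound at best.

```python
def rocksdb_headroom_mb(cutoff_mb, off_heap_reserved_mb, off_heap_preallocated_mb):
    """Upper bound on native memory left for RocksDB under the hypothesis above.

    cutoff_mb                : size of the container cut-off (division 3)
    off_heap_reserved_mb     : total off-heap reservation (division 2)
    off_heap_preallocated_mb : part of division 2 already claimed by
                               Flink/Netty before RocksDB starts (assumed known)
    """
    # Any off-heap reservation not yet claimed could, in principle, be taken
    # by RocksDB's native allocations.
    unclaimed_off_heap_mb = max(off_heap_reserved_mb - off_heap_preallocated_mb, 0)
    return cutoff_mb + unclaimed_off_heap_mb


if __name__ == "__main__":
    # Fully preallocated off-heap: RocksDB is confined to the cut-off.
    print(rocksdb_headroom_mb(1000, 500, 500))
    # Nothing preallocated: cut-off plus the whole off-heap reservation.
    print(rocksdb_headroom_mb(1000, 500, 0))
```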

If the cut-off is not used by anything, I can't think of a good reason for 
having such a high reservation (default 25%) sit totally unused in every 
container.

I don't see any easy way to:
  a- Confirm where RocksDB memory lands (i.e. in 2, in 3, or in both 2 & 3).
  b- Get a rough estimate of the amount of memory RocksDB needs for a certain 
MB or GB of data that I need to host in it.
  c- Determine how to tune 1, 2 & 3 to ensure that RocksDB gets enough memory 
without randomly crashing the job.

Unfortunately, this memory division is covered only briefly in some of the 
unofficial presentations on YouTube, and that coverage appears to be inaccurate.


-roshan


