Hi Alexis, First of all, I strongly recommend not to look into the JVM metrics. These metrics are fetched directly from JVM and do not well correspond to Flink's memory configurations. They were introduced a long time ago and are preserved mostly for compatibility. IMO, they bring more confusion than convenience. In Flink-1.12, there is a newly designed TM metrics page in the web ui, which clearly shows how the metrics correspond to Flink's memory configurations (if any).
Concerning your questions. 1. Yes, increasing framework/task off-heap memory sizes should increase the direct memory capacity. Increasing the network memory size should also do that. 2. When 'state.backend.rocksdb.memory.managed' is true, RocksDB uses managed memory. Managed memory is not measured by any JVM metrics. It's not managed by JVM, meaning that it's not limited by '-XX:MaxDirectMemorySize' and is not controlled by the garbage collectors. Thank you~ Xintong Song On Tue, Apr 13, 2021 at 7:53 PM Alexis Sarda-Espinosa < alexis.sarda-espin...@microfocus.com> wrote: > Hello, > > > > I have a Flink TM configured with taskmanager.memory.managed.size: 1372m. > There is a streaming job using RocksDB for checkpoints, so I assume some of > this memory will indeed be used. > > > > I was looking at the metrics exposed through the REST interface, and I > queried some of them: > > > > /taskmanagers/c3c960d79c1eb2341806bfa2b2d66328/metrics?get=Status.JVM.Memory.Heap.Committed,Status.JVM.Memory.NonHeap.Committed,Status.JVM.Memory.Direct.MemoryUsed > | jq > > [ > > { > > "id": "Status.JVM.Memory.Heap.Committed", > > "value": "1652031488" > > }, > > { > > "id": "Status.JVM.Memory.NonHeap.Committed", > > "value": "234291200" > 223 MiB > > }, > > { > > "id": "Status.JVM.Memory.Direct.MemoryUsed", > > "value": "375015427" > 358 MiB > > }, > > { > > "id": "Status.JVM.Memory.Direct.TotalCapacity", > > "value": "375063552" > 358 MiB > > } > > ] > > > > I presume direct memory is being used by Flink and its networking stack, > as well as by the JVM itself. To be sure: > > > > 1. Increasing "taskmanager.memory.framework.off-heap.size" or > "taskmanager.memory.task.off-heap.size" should increase > Status.JVM.Memory.Direct.TotalCapacity, right? > 2. I presume the native memory used by RocksDB cannot be tracked with > these JVM metrics even if "state.backend.rocksdb.memory.managed" is true, > right? > > > > Based on this question: > https://stackoverflow.com/questions/30622818/off-heap-native-heap-direct-memory-and-native-memory, > I imagine Flink/RocksDB either allocates memory completely independently of > the JVM, or it uses unsafe. Since the documentation ( > https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/memory/mem_setup_tm.html#managed-memory) > states that "Managed memory is managed by Flink and is allocated as native > memory (off-heap)", I thought this native memory might show up as part of > direct memory tracking, but I guess it doesn’t. > > > > Regards, > > Alexis. > > >