[ https://issues.apache.org/jira/browse/FLINK-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190677#comment-17190677 ]
Matthias edited comment on FLINK-14406 at 9/10/20, 10:07 AM:
-------------------------------------------------------------

I discussed different approaches with [~trohrmann]:
* Using the metric system to expose the used memory
** The metric collection runs in a different thread, so we have to take care of thread-safety here.
** It's only enabled if the UI is requesting it, which means less load on the {{TaskExecutor}}.
** It's the standard way of exposing system metrics, which might also be relevant for other components.
* Collecting the allocated memory in the {{TaskExecutor}}, using the {{TaskExecutor}} as an observer on each {{MemoryManager}} instance
** It's always enabled, putting unnecessary load on the {{TaskExecutor}}.
** The connection between {{TaskExecutor}} and {{MemoryManager}} does not exist yet. This would create unnecessary coupling between these two components.

We decided to go for the metric-system approach, which was already proposed by [~lining] in the issue's description. We have to consider thread-safety, though.

was (Author: mapohl):
I discussed different approaches with [~trohrmann]:
* Using the metric system to expose the used memory
* The metric collection runs in a different thread, so we have to take care of thread-safety here.
* It's only enabled if the UI is requesting it, which means less load on the {{TaskExecutor}}.
* It's the standard way of exposing system metrics, which might also be relevant for other components.
* Collecting the allocated memory in the {{TaskExecutor}}, using the {{TaskExecutor}} as an observer on each {{MemoryManager}} instance
* It's always enabled, putting unnecessary load on the {{TaskExecutor}}.
* The connection between {{TaskExecutor}} and {{MemoryManager}} does not exist yet. This would create unnecessary coupling between these two components.

We decided to go for the metric-system approach, which was already proposed by [~lining] in the issue's description. We have to consider thread-safety, though.
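To illustrate the thread-safety concern raised in the comment, here is a minimal sketch of a counter that a task thread updates while a metric thread reads it. {{TrackedMemory}} and its methods are hypothetical stand-ins, not Flink's {{MemoryManager}} API, and {{LongSupplier}} merely plays the role a gauge would play. The sketch assumes a single atomic counter is enough to give the reader a consistent value.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a memory manager; not Flink's MemoryManager. */
final class TrackedMemory {
    private final long totalBytes;
    // Written by the task thread, read by the metric thread.
    private final AtomicLong availableBytes;

    TrackedMemory(long totalBytes) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must be non-negative");
        }
        this.totalBytes = totalBytes;
        this.availableBytes = new AtomicLong(totalBytes);
    }

    /** Reserves the given number of bytes; returns false if too little is left. */
    boolean reserve(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        while (true) {
            long current = availableBytes.get();
            if (bytes > current) {
                return false;
            }
            // Compare-and-set so concurrent reservations never over-commit.
            if (availableBytes.compareAndSet(current, current - bytes)) {
                return true;
            }
        }
    }

    /** Returns bytes to the pool. This sketch does not guard against over-release. */
    void release(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        availableBytes.addAndGet(bytes);
    }

    /** Safe to call from any thread: a single atomic read. */
    long usedBytes() {
        return totalBytes - availableBytes.get();
    }

    /** What a metric thread would poll; LongSupplier stands in for a gauge. */
    LongSupplier usedGauge() {
        return this::usedBytes;
    }
}
```

Because the gauge performs only one atomic read, the metric thread never needs a lock and cannot block the task thread. A manager that tracks several related values would need a stronger mechanism to report them consistently.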
> Add metric for managed memory
> -----------------------------
>
> Key: FLINK-14406
> URL: https://issues.apache.org/jira/browse/FLINK-14406
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Metrics, Runtime / Task
> Reporter: lining
> Assignee: Matthias
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> If a user wants to know how much managed memory is in use at a given time, they cannot, because there are no metrics for managed memory.
>
> *Propose*
> * add default memory type in MemoryManager
>
> {code:java}
> public static final MemoryType DEFAULT_MEMORY_TYPE = MemoryType.OFF_HEAP;
> {code}
> * add getManagedMemoryTotal in TaskExecutor:
>
> {code:java}
> // Total managed memory across all allocated slots.
> public long getManagedMemoryTotal() {
>     return this.taskSlotTable.getAllocatedSlots().stream().mapToLong(
>         slot -> slot.getMemoryManager().getMemorySizeByType(MemoryManager.DEFAULT_MEMORY_TYPE)
>     ).sum();
> }
> {code}
>
> * add getManagedMemoryUsed in TaskExecutor:
>
> {code:java}
> // Used managed memory: per slot, total size minus what is still available.
> public long getManagedMemoryUsed() {
>     return this.taskSlotTable.getAllocatedSlots().stream().mapToLong(
>         slot -> slot.getMemoryManager().getMemorySizeByType(MemoryManager.DEFAULT_MEMORY_TYPE)
>             - slot.getMemoryManager().availableMemory(MemoryManager.DEFAULT_MEMORY_TYPE)
>     ).sum();
> }
> {code}
>
> * add instantiateMemoryManagerMetrics in MetricUtils
>
> {code:java}
> // Registers both values as gauges under the "Managed.Memory" group.
> public static void instantiateMemoryManagerMetrics(MetricGroup statusMetricGroup, TaskExecutor taskExecutor) {
>     checkNotNull(statusMetricGroup);
>     MetricGroup memoryManagerGroup = statusMetricGroup.addGroup("Managed").addGroup("Memory");
>     memoryManagerGroup.<Long, Gauge<Long>>gauge("TotalCapacity", taskExecutor::getManagedMemoryTotal);
>     memoryManagerGroup.<Long, Gauge<Long>>gauge("MemoryUsed", taskExecutor::getManagedMemoryUsed);
> }
> {code}
> * register it in TaskManagerRunner#startTaskManager
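
The two aggregates proposed in the description boil down to summing, over all allocated slots, the total size and the total size minus the available size. A self-contained sketch of that arithmetic follows. {{SlotMemory}} and {{ManagedMemoryTotals}} are hypothetical names used only for illustration; they are not Flink's {{TaskSlotTable}} or {{MemoryManager}}.

```java
import java.util.List;

/** Hypothetical per-slot view of managed memory; not a Flink class. */
final class SlotMemory {
    final long totalBytes;
    final long availableBytes;

    SlotMemory(long totalBytes, long availableBytes) {
        if (availableBytes < 0 || availableBytes > totalBytes) {
            throw new IllegalArgumentException("availableBytes must be within [0, totalBytes]");
        }
        this.totalBytes = totalBytes;
        this.availableBytes = availableBytes;
    }
}

/** Hypothetical helper mirroring the two proposed aggregate methods. */
final class ManagedMemoryTotals {
    private ManagedMemoryTotals() {}

    /** Sum of the managed memory size of every slot. */
    static long total(List<SlotMemory> slots) {
        return slots.stream().mapToLong(s -> s.totalBytes).sum();
    }

    /** Sum over all slots of (size - available), i.e. the memory currently in use. */
    static long used(List<SlotMemory> slots) {
        return slots.stream().mapToLong(s -> s.totalBytes - s.availableBytes).sum();
    }
}
```

For example, with one slot of 100 bytes that has 40 available and a second slot of 200 bytes that is fully available, {{total}} yields 300 and {{used}} yields 60.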