Hi All,

Thanks for reviving the discussion, Matthias!

> This would mean that we could adapt the current proposal to replace the
> Nonheap usage pane by a pane displaying the Metaspace usage.

I do not know the value of having the Nonheap usage in metrics. I can see
that the metaspace metric can be interesting for users debugging OOMs. We
had the Nonheap usage before, so as discussed, I would be a bit careful about
removing it. I believe it deserves a separate poll on the user ML on whether
the Nonheap usage is useful or not. As a current solution, we could keep both
or merge them into one box with a slash, like Metaspace/Nonheap -> 5MB/10MB,
if the majority agrees that this is not confusing and makes it clear that the
metaspace is a part of Nonheap.

Btw, the "Nonheap" in the configuration box of the current FLIP-102 is
probably incorrect or confusing, as it does not correspond one-to-one to the
Nonheap JVM metric.

> The only issue I see is that JVM Overhead would still not be represented in
> the memory usage overview.

My understanding is that we do not need a usage metric for JVM Overhead, as
it is a virtual unmanaged component which is more about configuring the max
total process memory.

> Is there a reason for us to introduce a nested structure
> TaskManagerMetricsInfo in the response object? I would rather keep it
> consistent in a flat structure instead, i.e. having all the members of
> TaskManagerResourceInfo being members of TaskManagerMetricsInfo

I would suggest introducing a separate REST call for TaskManagerResourceInfo.
Semantically, TaskManagerResourceInfo is more about the TM configuration and
it is not directly related to the usage metrics. In the future, I would avoid
having calls with many responsibilities and maybe consider splitting the
'TM details' call into metrics etc., unless there is a concern about having
to do more calls instead of one from the UI.

> Alternatively, one could think of grouping the metrics collecting the
> different values (i.e. max, used, committed) per metric in a JSON object.
> But this would apply for all the other metrics of TaskManagerMetricsInfo
> as well.

I would personally prefer this for metrics, but I am not pushing for it.

> metrics.resource.managedMemory and metrics.resource.networkMemory have
> counterparts in metrics.networkMemory[Used|Total] and
> metrics.managedMemory[Used|Total]: Is this redundant data or do they have
> different semantics?

As I understand it, they have different semantics: the metrics.resource.*
values are about configuration, while the [Used|Total] values are current
usage metrics.

> Is metrics.resource.totalProcessMemory a basic sum over all provided
> values?

This is again about configuration. I do not think it makes sense to come up
with a usage metric for the totalProcessMemory component.

Best,
Andrey

On Thu, Aug 20, 2020 at 9:06 AM Matthias <matth...@ververica.com> wrote:

> Hi Jing,
> I recently joined Ververica and started looking into FLIP-102. I'm trying
> to figure out how we would implement the proposal on the backend side.
> I looked into the proposal for the REST API response and a few questions
> popped up:
> - Is there a reason for us to introduce a nested structure
> TaskManagerMetricsInfo in the response object? I would rather keep it
> consistent in a flat structure instead, i.e. having all the members of
> TaskManagerResourceInfo being members of TaskManagerMetricsInfo.
> Alternatively, one could think of grouping the metrics collecting the
> different values (i.e. max, used, committed) per metric in a JSON object.
> But this would apply for all the other metrics of TaskManagerMetricsInfo as
> well.
> - metrics.resource.managedMemory and metrics.resource.networkMemory have
> counterparts in metrics.networkMemory[Used|Total] and
> metrics.managedMemory[Used|Total]: Is this redundant data or do they have
> different semantics?
> - Is metrics.resource.totalProcessMemory a basic sum over all provided
> values? I see the necessity to have this member if we decide to not provide
> the memory usage for all memory pools (e.g. providing Metaspace but leaving
> Code Cache and Compressed Class Space as Non-Heap pools out of the
> response). Otherwise, would it be worth it to remove this member from the
> response for simplicity reasons since we could sum up the memory on the
> frontend side?
>
> Best,
> Matthias
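
P.S. To make the "grouping per metric" alternative from the thread a bit more
concrete, here is a rough sketch of how a flat set of members could be
regrouped into one JSON object per metric. It is only an illustration: the
key names (heapUsed, managedMemoryTotal, ...), the suffix list and the
numbers are assumptions made for this example, not the agreed FLIP-102
response format, and group_metrics is a hypothetical helper.

```python
import json

# Suffixes for the value kinds mentioned in the thread (max, used, committed),
# plus "Total" as in metrics.managedMemory[Used|Total]. Assumed for this sketch.
SUFFIXES = ("Used", "Committed", "Max", "Total")


def group_metrics(flat):
    """Group flat keys such as 'heapUsed' into {'heap': {'used': ...}}."""
    grouped = {}
    for key, value in flat.items():
        for suffix in SUFFIXES:
            if key.endswith(suffix) and len(key) > len(suffix):
                name = key[: -len(suffix)]
                grouped.setdefault(name, {})[suffix.lower()] = value
                break
        else:
            # Keys without a known suffix are kept unchanged.
            grouped[key] = value
    return grouped


# Hypothetical flat payload; the numbers are made up for illustration.
flat_metrics = {
    "heapUsed": 100,
    "heapCommitted": 200,
    "heapMax": 400,
    "managedMemoryUsed": 10,
    "managedMemoryTotal": 50,
}

grouped_metrics = group_metrics(flat_metrics)
print(json.dumps(grouped_metrics, indent=2))
```

With the flat form, the UI reads heapUsed directly; with the grouped form, it
reads heap.used. The configured sizes (TaskManagerResourceInfo) would stay out
of either shape if they are served by a separate REST call.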