Good points, Andrey. Thanks for the clarification. I made some minor adaptations to the FLIP:

- I renamed the `resource` member to `configuration` and made it a top-level member next to `metrics` and `hardware`, since it does not fit the volatile metrics context well.
- I restructured the table under Proposed Changes to cover Metaspace. Additionally, I renamed `shuffle` to `network` to match the memory model of FLIP-49.
- The UI in the screenshot has a bug: the counts of Direct and Mapped are accompanied by a memory unit even though they are plain counts.
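
To make the restructuring and the UI bug a bit more tangible, here is a rough sketch of how the top-level layout could look and how a frontend might tell plain counts apart from byte values. All field names (`directCount`, `directUsed`, `cpuCores`, ...), the naming convention used to detect counts, and all numbers are assumptions made up for illustration; they are not the final FLIP-102 schema.

```python
# Hypothetical TaskManager details payload. Field names and values are
# illustrative assumptions only, not the actual FLIP-102 response.
response = {
    "configuration": {"network": 128 * 1024**2, "managedMemory": 512 * 1024**2},
    "metrics": {
        "directCount": 4821,
        "directUsed": 96 * 1024**2,
        "mappedCount": 0,
        "mappedUsed": 0,
    },
    "hardware": {"cpuCores": 8},
}


def format_bytes(value):
    """Render a byte value with a binary unit (1 KB = 1024 B here)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(value)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


def format_metric(name, value):
    """Plain counts stay unit-less; everything else is treated as bytes.

    Detecting counts via the 'Count' suffix is an assumed convention.
    """
    if name.endswith("Count"):
        return str(value)
    return format_bytes(value)


for name, value in response["metrics"].items():
    print(name, "->", format_metric(name, value))
```

With a distinction like this, the Direct and Mapped counts would be rendered without a memory unit while the corresponding usage values keep one.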

On Thu, Aug 20, 2020 at 4:10 PM Andrey Zagrebin <azagre...@apache.org> wrote:

> Hi All,
>
> Thanks for reviving the discussion, Matthias!
>
> > This would mean that we could adapt the current proposal to replace the
> > Nonheap usage pane by a pane displaying the Metaspace usage.
>
> I do not know the value of having the Nonheap usage in metrics. I can see
> that the metaspace metric can be interesting for the users to debug OOMs.
> We had the Nonheap usage before, so as discussed, I would be a bit careful
> removing. I believe it deserves a separate poll in user ML
> whether the Nonheap usage is useless or not.
> As a current solution, we could keep both or merge them into one box with a
> slash, like Metaspace/Nonheap -> 5Mb/10Mb, if the majority agrees that this
> is not confusing and clear that the metaspace is a part of Nonheap.

That would be a good solution representing both metrics. I adapted the table in FLIP-102's Confluence accordingly for now to have it visualized. Let's see what others are thinking about it.

> Btw, the "Nonheap" in the configuration box of the current FLIP-102 is
> probably incorrect or confusing as it does not one-to-one correspond to the
> Nonheap JVM metric.
>
> > The only issue I see is that JVM Overhead would still not be represented in
> > the memory usage overview.
>
> My understanding is that we do not need a usage metric for JVM Overhead as
> it is a virtual unmanaged component which is more about configuring the max
> total process memory.
>
> > Is there a reason for us to introduce a nested structure
> > TaskManagerMetricsInfo in the response object? I would rather keep it
> > consistent in a flat structure instead, i.e. having all the members of
> > TaskManagerResourceInfo being members of TaskManagerMetricsInfo
>
> I would suggest introducing a separate REST call for TaskManagerResourceInfo.
> Semantically, TaskManagerResourceInfo is more about the TM configuration
> and it is not directly related to the usage metrics.
> In future, I would avoid having calls with many responsibilities and maybe
> consider splitting the 'TM details' call into metrics etc unless there is a
> concern for having to do more calls instead of one from UI.

Good point. The growing size of the JSON response record might make it worth splitting it up into different endpoints serving different groups of data (e.g. /metrics for volatile values and /configuration for static ones).

> > Alternatively, one could think of grouping the metrics collecting the
> > different values (i.e. max, used, committed) per metric in a JSON object.
> > But this would apply for all the other metrics of TaskManagerMetricsInfo
> > as well.
>
> I would personally prefer this for metrics but I am not pushing for this.
>
> > metrics.resource.managedMemory and metrics.resource.networkMemory have
> > counterparts in metrics.networkMemory[Used|Total] and
> > metrics.managedMemory[Used|Total]: Is this redundant data or do they have
> > different semantics?
>
> As I understand, they have different semantics. The later is about
> configuration, the former is about current usage metrics.

I see. Makes sense.

> > Is metrics.resource.totalProcessMemory a basic sum over all provided
> > values?
>
> this is again about configuration, I do not think it makes sense to come up
> with a usage metric for the totalProcessMemory component.

Got it.

> Best,
> Andrey
>
> On Thu, Aug 20, 2020 at 9:06 AM Matthias <matth...@ververica.com> wrote:
>
> > Hi Jing,
> >
> > I recently joined Ververica and started looking into FLIP-102. I'm trying
> > to figure out how we would implement the proposal on the backend side.
> > I looked into the proposal for the REST API response and a few questions
> > popped up:
> > - Is there a reason for us to introduce a nested structure
> > TaskManagerMetricsInfo in the response object?
> > I would rather keep it consistent in a flat structure instead, i.e.
> > having all the members of TaskManagerResourceInfo being members of
> > TaskManagerMetricsInfo.
> > Alternatively, one could think of grouping the metrics collecting the
> > different values (i.e. max, used, committed) per metric in a JSON object.
> > But this would apply for all the other metrics of TaskManagerMetricsInfo
> > as well.
> > - metrics.resource.managedMemory and metrics.resource.networkMemory have
> > counterparts in metrics.networkMemory[Used|Total] and
> > metrics.managedMemory[Used|Total]: Is this redundant data or do they have
> > different semantics?
> > - Is metrics.resource.totalProcessMemory a basic sum over all provided
> > values? I see the necessity to have this member if we decide to not
> > provide the memory usage for all memory pools (e.g. providing Metaspace
> > but leaving Code Cache and Compressed Class Space as Non-Heap pools out
> > of the response). Otherwise, would it be worth it to remove this member
> > from the response for simplicity reasons since we could sum up the memory
> > on the frontend side?
> >
> > Best,
> > Matthias

--
Matthias Pohl | Engineer
Ververica <https://www.ververica.com/>
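
For reference, the grouping alternative raised in the thread (one JSON object per metric holding used, committed and max) and the combined Metaspace/Nonheap box could be sketched roughly as follows. The key names `metaspace` and `nonHeap`, the helper functions, and all numbers are assumptions for illustration only; the sketch merely reproduces the `Metaspace/Nonheap -> 5Mb/10Mb` label suggested above.

```python
MB = 1024 ** 2

# Hypothetical grouped layout: one object per memory pool instead of flat
# members per value. Names and numbers are made up for illustration.
metrics = {
    "metaspace": {"used": 5 * MB, "committed": 6 * MB, "max": 256 * MB},
    "nonHeap": {"used": 10 * MB, "committed": 12 * MB, "max": 512 * MB},
}


def to_mb(value):
    """Render a byte value as whole megabytes in the 'Mb' style of the thread."""
    return f"{value // MB}Mb"


def combined_box(grouped):
    """Build the label for a single box showing Metaspace as part of Nonheap."""
    metaspace_used = to_mb(grouped["metaspace"]["used"])
    non_heap_used = to_mb(grouped["nonHeap"]["used"])
    return f"Metaspace/Nonheap -> {metaspace_used}/{non_heap_used}"


print(combined_box(metrics))
```

Grouping per pool keeps related values together, at the cost of having to apply the same layout to the other metrics of TaskManagerMetricsInfo for consistency.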