Re: RFC: regarding metaspace(metadata?) dump

Yi Yang Wed, 11 Jan 2023 18:53:21 -0800
Hi Ioi,
> I think there are overlaps between your proposal and existing tools. For 
> example, there are jcmd options such as VM.class_hierarchy and VM.classes, 
> etc.
> The Serviceability Agent can also be used to analyze the contents of the 
> class metadata.
Of course, we can keep adding jcmd commands such as jcmd VM.method_counter 
and jcmd VM.aggregate_by_class_package to help with diagnosis. But a 
once-and-for-all alternative is to implement a rich, well-formed metadata dump, 
as this proposal describes. Third-party parsers and platforms could then 
analyze the well-formed dump file and provide many grouping/filtering options 
(grouping_by_package, filter_linked, filter_force_inline; essentially, 
VM.class_hierarchy is an aggregation of VM.classes).
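As a rough illustration of the kind of grouping and filtering a third-party 
parser could offer, here is a minimal Python sketch. The dump layout (a JSON 
array of class objects) and the "linked" field are assumptions made for this 
example only; the actual format is still open.

```python
import json
from collections import defaultdict

# Hypothetical dump: a JSON array of class entries. Only "name" mirrors the
# proposal; "linked" is an assumed field used to illustrate filtering.
SAMPLE_DUMP = """
[
  {"name": "java/lang/String", "linked": true},
  {"name": "java/lang/Object", "linked": true},
  {"name": "com/example/Foo", "linked": false},
  {"name": "Test", "linked": true}
]
"""

def package_of(class_name):
    """Return the package part of an internal (slash-separated) class name."""
    package, _, _ = class_name.rpartition("/")
    return package or "<unnamed>"

def group_by_package(classes, only_linked=False):
    """Group class names by package, optionally keeping linked classes only."""
    groups = defaultdict(list)
    for entry in classes:
        if only_linked and not entry.get("linked", False):
            continue
        groups[package_of(entry["name"])].append(entry["name"])
    return dict(groups)

classes = json.loads(SAMPLE_DUMP)
by_package = group_by_package(classes)
linked_only = group_by_package(classes, only_linked=True)
```

Because the grouping runs on the parsed dump rather than inside the JVM, new 
views of this kind would not need a new jcmd command each time.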
Let me describe a real use case to illustrate the benefits of a well-formed 
metaspace dump. On our internal DevOps platform, I observe that the Metaspace 
utilization of my application has been high, and that full GC (FGC) occurred 
several times during this period. So I request a well-formed metaspace dump 
through the DevOps platform. The dump file is generated automatically and 
uploaded to another internal Java troubleshooting platform, which analyzes it 
further and presents it with many grouping and filtering options.
> I'd be interested in seeing your implementation and comparing it with the 
> existing tools.
I'm starting to work on this. It may take several months to implement, since it 
looks more like a JEP-level feature, so I'd like to hear some general 
discussion before coding. For example: is it acceptable to use the JSON format? 
Should it be a Metadata Dump, or should it keep the current metaspace scope? Do 
you think a basic+extended output for the internal structures is acceptable?
> This may be quite difficult, because the metadata contains rewritten Java 
> bytecodes. The rewriting format may be dependent on the JDK version. Also, 
> the class linkage (the resolution of constant pool information) will be 
> vastly different from one JDK version to another. So writing a third-party 
> tool that can work with multiple JDK versions will be quite hard.
Thanks for your input! Maybe we could display the rewritten bytecodes? Anyway, 
I'll take a close look at this, and I'll prepare a POC along with a dump parser 
and a simple web UI for diagnosis once ready.
> Also, defining a "portable" format for the dump will be difficult, since we 
> don't know how the internal data structure will evolve in the future.
Yes. Since we don't know how the internal data structures will change in the 
future, I propose reaching a consensus that we can at least reconstruct the Java 
(rewritten?) source code as much as possible. For example, the dumped JSON 
object for InstanceKlass contains two parts: the first part contains the 
information necessary to reconstruct the source code as much as possible, and 
the second part is extended information, like this:
{
 name:..,
 super:..,
 flags:...,
 method:[],
 interface:[],
 fields:[],
 annotation:[],
 bytecode:[],
 constantpool:[],
 //extend
 init_state:...,
 init_thread:...,
}
The first part is basically stable (at most, new fields are added), while the 
extended part is subject to change. A visualization client for the dump checks 
whether each field of a JSON object is defined and, if so, displays it.
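To make the "check whether a field is defined" idea concrete, here is a minimal 
Python sketch of a tolerant reader. The field names follow the example object 
above; the split into two tuples, the describe_klass helper, and the sample 
values are assumptions for illustration, not part of any agreed format.

```python
import json

# Field names follow the example object in this mail; the split into two
# tuples and the helper itself are assumptions made for illustration.
BASIC_FIELDS = ("name", "super", "flags", "method", "interface", "fields",
                "annotation", "bytecode", "constantpool")
EXTENDED_FIELDS = ("init_state", "init_thread")

def describe_klass(obj):
    """Split a dumped class object into basic, extended and unknown fields."""
    basic = {k: obj[k] for k in BASIC_FIELDS if k in obj}
    extended = {k: obj[k] for k in EXTENDED_FIELDS if k in obj}
    known = set(BASIC_FIELDS) | set(EXTENDED_FIELDS)
    unknown = {k: v for k, v in obj.items() if k not in known}
    return basic, extended, unknown

# A dump from a hypothetical JDK that dropped init_thread and added a field.
klass = json.loads('{"name": "Test", "super": "java/lang/Object", '
                   '"init_state": "fully_initialized", "new_field": 1}')
basic, extended, unknown = describe_klass(klass)
```

A client written this way keeps working when an extended field disappears or a 
new one shows up; it simply displays whatever is present.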
------------------------------------------------------------------
From:Ioi Lam <ioi....@oracle.com>
Send Time:2023 Jan. 12 (Thu.) 08:15
To:hotspot-runtime-dev <hotspot-runtime-...@openjdk.org>; 
serviceability-...@openjdk.java.net <serviceability-...@openjdk.java.net>
Subject:Re: RFC: regarding metaspace(metadata?) dump
 CC-ing serviceability.
 Hi Yi,
 In general, I think it's good to have tools for understanding the internal 
layout of the class metadata.
 I think there are overlaps between your proposal and existing tools. For 
example, there are jcmd options such as VM.class_hierarchy and VM.classes, etc.
 The Serviceability Agent can also be used to analyze the contents of the class 
metadata.
 Did you look at the existing tools and see how they match up with your 
requirements?
 I'd be interested in seeing your implementation and comparing it with the 
existing tools.
On 1/11/2023 4:56 AM, Yi Yang wrote:
Hi,
Internally, we often receive feedback and requests for help from users on 
metaspace-related issues. For example:
1. Users are eager to know which GroovyClassLoader loads which classes, why 
they are not unloaded, and why they lead to a Metaspace OOME.
2. They want to know the class structure of dynamically generated classes in 
some scenarios, such as deserialization.
3. They want to find memory leaks caused by duplicated classes.
...
Internally, we implemented a metaspace dump that generates human-readable text. 
It looks something like this:
[Basic Information]
Dump Reason : JCMD
MaxMetaspaceSize : 18446744073709547520 B
CompressedClassSpaceSize : 1073741824 B
Class Space Used : 309992 B
Class Space Capacity : 395264 B
...
[Class Loader Data]
ClassLoaderData : loader = 0x000000008024f928, loader_klass = 
0x0000000800010098, loader_klass_name = 
sun/misc/Launcher$AppClassLoader, label = N/A
 Class Used Chunks :
 * Chunk : [0x0000000800060000, 0x0000000800060230, 0x0000000800060800)
 NonClass Used Chunks :
 * Chunk : [0x00007fd8379c1000, 0x00007fd8379c1350, 0x00007fd8379c2000)
 Klasses :
 Klass : 0x0000000800060028, name = Test, size = 520 B
 ConstantPool : 0x00007fd8379c1050, size = 296 B
...
It has been working effectively for several years and has helped many users 
solve metaspace-related problems.
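For comparison, a text dump like the excerpt above can be parsed by a script, 
but only with patterns tied to the exact line layout. The following Python 
sketch assumes the lines look exactly as shown (with the ClassLoaderData line 
unwrapped); the regular expressions and the klasses_by_loader helper are 
illustrative and not part of our internal tool.

```python
import re

# Sample lines copied from the dump excerpt above, with the ClassLoaderData
# line unwrapped (the mail client wrapped it across three lines).
SAMPLE = """\
ClassLoaderData : loader = 0x000000008024f928, loader_klass = 0x0000000800010098, loader_klass_name = sun/misc/Launcher$AppClassLoader, label = N/A
 Klasses :
 Klass : 0x0000000800060028, name = Test, size = 520 B
 ConstantPool : 0x00007fd8379c1050, size = 296 B
"""

LOADER_RE = re.compile(
    r"ClassLoaderData : loader = (0x[0-9a-f]+),.*loader_klass_name = ([^,]+),")
KLASS_RE = re.compile(r"Klass : 0x[0-9a-f]+, name = ([^,]+), size = (\d+) B")

def klasses_by_loader(text):
    """Map each loader address to its loader class name and (name, size) list."""
    result = {}
    current = None
    for line in text.splitlines():
        m = LOADER_RE.search(line)
        if m:
            current = m.group(1)
            result[current] = {"loader_klass_name": m.group(2), "klasses": []}
            continue
        m = KLASS_RE.search(line)
        if m and current is not None:
            result[current]["klasses"].append((m.group(1), int(m.group(2))))
    return result

loaders = klasses_by_loader(SAMPLE)
```

Any change to the wording or ordering of a line would break these patterns, 
which is part of the motivation for a structured format.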
But it would be more user-friendly for the JDK to support this capability 
natively. We hope that the format of the metaspace dump file can take both 
flexibility and compatibility into account, and that the content of the dump 
file is detailed enough to meet the needs of both application developers and 
lower-level developers.
Based on the above considerations, I think JSON is an appropriate file format 
(though XML or a binary format are not yet excluded as candidates). 
Specifically, my early thinking is that the format of the metaspace dump file 
could be as follows (pretty-printed):
https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3 
Using the JSON format, we can flexibly add new fields without breaking 
compatibility. Which data to write is open to debate. We could agree that 
third-party parsers (a Metaspace Analyzer Tool) should at least be able to 
reconstruct Java source code from the dump file. 
 This may be quite difficult, because the metadata contains rewritten Java 
bytecodes. The rewriting format may be dependent on the JDK version. Also, the 
class linkage (the resolution of constant pool information) will be vastly 
different from one JDK version to another. So writing a third-party tool that 
can work with multiple JDK versions will be quite hard. Also, defining a 
"portable" format for the dump will be difficult, since we don't know how the 
internal data structure will evolve in the future.
 Thanks
 - Ioi
Based on this, we can write more useful information for low-level 
troubleshooting or debugging (e.g. the init_state of InstanceKlass).
 In addition, we can even output the native code and associated information for 
each Method, so that a third-party parser can reconstruct a human-readable 
assembly representation of the compiled method from the dump file. To some 
extent, this also implements a code cache dump along the way. For this reason, 
I'm not sure the RFC proposal should be titled "metaspace dump"; maybe 
"metadata dump"? It looks more like a metadata-dump framework.
Do you have any thoughts about a metaspace/metadata dump? I look forward to 
hearing your feedback; any comments are invaluable!
Best regards,
Yi Yang