CC-ing serviceability.
Hi Yi,
In general, I think it's good to have tools for understanding the
internal layout of the class metadata.
I think there is overlap between your proposal and existing tools. For
example, there are jcmd commands such as VM.class_hierarchy and
VM.classes.
The Serviceability Agent can also be used to analyze the contents of the
class metadata.
Did you look at the existing tools and see how they match up with your
requirements?
I'd be interested in seeing your implementation and comparing it with
the existing tools.
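
For anyone who wants to make that comparison, here is a minimal sketch of running the two jcmd commands named above against a JVM from Java. The class and method names (JcmdClassInfo, jcmdCommand, run) are made up for illustration. It assumes jcmd is on the PATH, and whether VM.classes is available depends on the JDK version.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK: shells out to the jcmd tool.
final class JcmdClassInfo {

    // Builds the command line "jcmd <pid> <diagnosticCommand>".
    static List<String> jcmdCommand(long pid, String diagnosticCommand) {
        List<String> cmd = new ArrayList<>();
        cmd.add("jcmd");
        cmd.add(Long.toString(pid));
        cmd.add(diagnosticCommand);
        return cmd;
    }

    // Runs the command, forwarding its output to this process, and
    // returns the exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Target the pid given on the command line, or this JVM by default.
        long pid = args.length > 0
                ? Long.parseLong(args[0])
                : ProcessHandle.current().pid();
        for (String dc : List.of("VM.class_hierarchy", "VM.classes")) {
            run(jcmdCommand(pid, dc));
        }
    }
}
```

The output of these commands is plain text intended for people, so it is a reasonable baseline to set next to the proposed dump when checking what extra information the dump provides.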
On 1/11/2023 4:56 AM, Yi Yang wrote:
Hi,
Internally, we often receive feedback and requests for help from users
on metaspace-related issues. For example:
1. Users are eager to know which GroovyClassLoader loads which
classes, why they are not unloaded,
and why they are leading to Metaspace OOME.
2. They want to know the class structure of dynamically generated
classes in some scenarios such as
deserialization
3. They want to find memory leaks caused by duplicated classes.
...
Internally, we implemented a metaspace dump that generates
human-readable text. It looks something like this:
[Basic Information]
Dump Reason : JCMD
MaxMetaspaceSize : 18446744073709547520 B
CompressedClassSpaceSize : 1073741824 B
Class Space Used : 309992 B
Class Space Capacity : 395264 B
...
[Class Loader Data]
ClassLoaderData : loader = 0x000000008024f928, loader_klass =
0x0000000800010098, loader_klass_name =
sun/misc/Launcher$AppClassLoader, label = N/A
Class Used Chunks :
* Chunk : [0x0000000800060000, 0x0000000800060230, 0x0000000800060800)
NonClass Used Chunks :
* Chunk : [0x00007fd8379c1000, 0x00007fd8379c1350, 0x00007fd8379c2000)
Klasses :
Klass : 0x0000000800060028, name = Test, size = 520 B
ConstantPool : 0x00007fd8379c1050, size = 296 B
...
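
To illustrate what consuming this text format involves, here is a minimal sketch of a line parser written against the sample above only. The class, field, and method names are hypothetical, and the exact line grammar is inferred from the excerpt, not from the real implementation. One detail visible in the sample is that MaxMetaspaceSize (18446744073709547520, i.e. 2^64 - 4096) does not fit in a signed 64-bit value, so byte counts are parsed as unsigned.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for two kinds of lines from the sample dump.
final class MetaspaceDumpLines {

    // One "Klass : <address>, name = <name>, size = <n> B" line.
    static final class KlassEntry {
        final long address;    // Klass address, as an unsigned 64-bit value
        final String name;
        final long sizeBytes;

        KlassEntry(long address, String name, long sizeBytes) {
            this.address = address;
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    // Grammar inferred from the sample line; the real format may differ.
    private static final Pattern KLASS = Pattern.compile(
            "\\s*Klass\\s*:\\s*0x([0-9a-fA-F]+),\\s*name\\s*=\\s*([^,\\s]+),"
            + "\\s*size\\s*=\\s*(\\d+)\\s*B\\s*");

    // Returns the parsed entry, or empty if the line is not a Klass line.
    static Optional<KlassEntry> parseKlass(String line) {
        Matcher m = KLASS.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new KlassEntry(
                Long.parseUnsignedLong(m.group(1), 16),
                m.group(2),
                Long.parseLong(m.group(3))));
    }

    // Parses values such as "18446744073709547520 B" as unsigned 64-bit.
    // Use Long.toUnsignedString to print the result.
    static long parseByteCount(String value) {
        String digits = value.trim();
        if (digits.endsWith("B")) {
            digits = digits.substring(0, digits.length() - 1).trim();
        }
        return Long.parseUnsignedLong(digits);
    }
}
```

A parser like this is tied to the exact wording of each line, which is one reason a structured format is easier for third-party tools to rely on.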
It has been working effectively for several years and has helped many
users solve metaspace-related problems.
A more user-friendly approach would be for the JDK to support this
capability natively. We hope that the format of the metaspace dump
file can take both flexibility and compatibility into account, and
that the content of the dump file is detailed enough to meet the needs
of both application developers and lower-level developers.
Based on the above considerations, I think JSON is an appropriate file
format (although XML or a binary format are not excluded as
candidates). Specifically, my earlier thinking was that the format of
the metaspace dump file could be as follows (pretty-printed):
https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3
Using the JSON format, we can flexibly add new fields without breaking
compatibility. Which data to write is debatable. As a baseline, we
could agree that a third-party parser (a Metaspace Analyzer Tool)
should at least be able to reconstruct Java source code from the dump
file.
This may be quite difficult, because the metadata contains rewritten
Java bytecodes. The rewriting format may depend on the JDK version.
Also, the class linkage (the resolution of constant pool information)
will differ vastly from one JDK version to another. So writing a
third-party tool that can work with multiple JDK versions will be quite
hard. Also, defining a "portable" format for the dump will be
difficult, since we don't know how the internal data structures will
evolve in the future.
Thanks
- Ioi
On top of this, we can write more information that is useful for
low-level troubleshooting or debugging (e.g. the init_state of
InstanceKlass).
In addition, we can even output the native code and associated
information for each Method, so that a third-party parser can
reconstruct a human-readable assembly representation of the compiled
method from the dump file. To some extent, this means we have also
implemented a code cache dump as a side effect. For this reason, I'm
not sure whether the RFC proposal should be titled "metaspace dump";
maybe "metadata dump"? It looks more like a metadata-dump framework.
Do you have any thoughts about metaspace/metadata dump? Looking
forward to hearing your feedback, any comments are invaluable!
Best regards,
Yi Yang