Re: GC.heap_dump performance regression in Java 21

Frederic Parain Fri, 06 Oct 2023 12:03:03 -0700

Hi Hannes,


Thank you for the analysis and the proposed solution.

The changes look reasonable to me, and I agree with Alex that we should either fix or get rid of the old FieldStream implementation. If we keep it, this kind of performance issue will happen again.



Regards,


Fred


On 10/2/23 10:00 PM, Alex Menkov wrote:

Hi Hannes,

The change looks very reasonable to me.
Field order is not important in the heap dump (the order just has to be the same in the class and instance subrecords).
And I think it would be better to fix the original FieldStream (or introduce the new HierarchicalFieldStream, use it for heap dumping first, and then switch JVMTI to use it). This should improve the performance of JVMTI functions as well.
AFAICS JVMTI uses FieldStream/FilteredFieldStream in 2 places: GetClassFields and the heap walking functions.
GetClassFields needs fields in the order they occur in the class file and it has to reverse the order returned by FieldStream, so switching it to use a forward field stream is straightforward.
For the heap walking functions, the field index calculation is trickier (it includes the count of fields in superclasses/interfaces).
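To illustrate why the heap walking index is trickier, here is a minimal sketch. It is not JVMTI or HotSpot code: `ClassModel`, `inherited_field_count` and `hierarchy_field_index` are hypothetical names, and the sketch assumes the simplest possible rule, namely that the index is the field's position within its own class plus the number of fields declared in all superclasses. Interfaces are ignored here, although the real calculation includes them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified class model: only the number of declared fields
// and a single superclass link (interfaces are left out of this sketch).
struct ClassModel {
    std::size_t declared_field_count;
    const ClassModel* super;  // nullptr at the root of the hierarchy
};

// Sums the declared field counts of all superclasses of `k`.
std::size_t inherited_field_count(const ClassModel* k) {
    std::size_t count = 0;
    for (const ClassModel* s = k->super; s != nullptr; s = s->super) {
        count += s->declared_field_count;
    }
    return count;
}

// Assumed rule: index within the hierarchy = inherited fields + local position.
std::size_t hierarchy_field_index(const ClassModel* k, std::size_t local_index) {
    assert(local_index < k->declared_field_count);
    return inherited_field_count(k) + local_index;
}
```

Under this assumption the index depends on the whole hierarchy rather than on one class alone, so a change of iteration direction has to be checked against it more carefully than for GetClassFields.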
We don't have good test coverage for heap dumping; there are some basic tests in test/hotspot/jtreg/serviceability.
regards,
Alex

On 02/10/23 11:49, Hannes Greule wrote:
Hi,
Recently, a performance regression of jcmd GC.heap_dump was brought to my attention. I investigated the regression and tracked down https://bugs.openjdk.org/browse/JDK-8292818 as the source of it.
For reproduction, I used the code at [1] and ran it with `java -Xmx2G CountPrimes`.
In Java 17, `jcmd CountPrimes GC.heap_dump -overwrite heap.hprof` finishes in 2-3 seconds. In Java 21, it takes almost 20 seconds instead.
Further analysis showed that the functions in InstanceKlass to get the access flags of a field (identified by its index) now require an iteration over the fields. As FieldStream from reflectionUtils.hpp accesses such data through the InstanceKlass with a given field index, this results in quadratic complexity for each object that gets dumped.
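The cost difference can be illustrated with a small model. This is not HotSpot code: `FieldInfo`, `flags_at`, `steps_indexed` and `steps_streamed` are hypothetical names, and the sketch simply assumes that reading field `i` by index requires decoding every entry before it, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one decoded field entry (not the HotSpot type).
struct FieldInfo {
    std::uint16_t access_flags;
};

// Counts decoded entries so the cost of each strategy becomes visible.
static std::size_t g_decode_steps = 0;

// Models index-based lookup when field data can only be read sequentially:
// reaching entry `index` means decoding every entry before it as well.
std::uint16_t flags_at(const std::vector<FieldInfo>& fields, std::size_t index) {
    assert(index < fields.size());
    std::uint16_t flags = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        ++g_decode_steps;
        flags = fields[i].access_flags;
    }
    return flags;
}

// Index-driven iteration: one flags_at call per field, so the total work
// is 1 + 2 + ... + n = n(n+1)/2 decode steps.
std::size_t steps_indexed(const std::vector<FieldInfo>& fields) {
    g_decode_steps = 0;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        (void)flags_at(fields, i);
    }
    return g_decode_steps;
}

// Streaming iteration: a single pass that decodes each entry once (n steps).
std::size_t steps_streamed(const std::vector<FieldInfo>& fields) {
    g_decode_steps = 0;
    for (const FieldInfo& f : fields) {
        ++g_decode_steps;
        (void)f.access_flags;
    }
    return g_decode_steps;
}
```

In this model, a class with 100 fields costs 5050 decode steps per dumped object with indexed access and 100 steps with a stream, which is the kind of gap that adds up over a heap with many objects.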
I wrote a fix for this, with which it seems to finish even faster than before the regression.
Before opening a Pull Request for it, however, I would like to know if this change is even feasible.
Based on the implementation in fieldStreams, I built a class `HierarchicalFieldStream` to stream over fields of all InstanceKlasses in a hierarchy, similar to how `FieldStream` in reflectionUtils is implemented already.
The most significant difference is that the `FieldStream` from reflectionUtils iterates fields backwards, while the `JavaFieldStream` from fieldStreams iterates forwards. That means using the `JavaFieldStream` and my `HierarchicalFieldStream` directly results in different heap dumps as the fields are dumped in their encounter order. From what I've found, this order isn't specified. The order in which super types are visited remains the same.
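The idea of a forward stream over a whole hierarchy can be sketched as follows. This is a simplified model and not the implementation at [2]: the `Klass` struct and `collect_fields` are hypothetical, fields are reduced to names, interfaces are ignored, and the sketch assumes the most derived class is visited first, which may not match the supertype order used in HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified class model: field names in declaration order
// and a single superclass link (interfaces are ignored in this sketch).
struct Klass {
    std::vector<std::string> fields;
    const Klass* super;  // nullptr at the root of the hierarchy
};

// Forward stream over the fields of a class, then of each superclass.
class HierarchicalFieldStream {
public:
    explicit HierarchicalFieldStream(const Klass* k) : klass_(k), index_(0) {
        skip_exhausted();
    }

    bool done() const { return klass_ == nullptr; }

    const std::string& name() const {
        assert(!done());
        return klass_->fields[index_];
    }

    void next() {
        assert(!done());
        ++index_;
        skip_exhausted();
    }

private:
    // Moves up the hierarchy until a class with remaining fields is found.
    void skip_exhausted() {
        while (klass_ != nullptr && index_ >= klass_->fields.size()) {
            klass_ = klass_->super;
            index_ = 0;
        }
    }

    const Klass* klass_;
    std::size_t index_;
};

// Collects the encounter order, e.g. to compare two iteration strategies.
std::vector<std::string> collect_fields(const Klass* k) {
    std::vector<std::string> out;
    for (HierarchicalFieldStream fs(k); !fs.done(); fs.next()) {
        out.push_back(fs.name());
    }
    return out;
}
```

Each field is visited exactly once and in declaration order within its class, so a dump written from such a stream lists fields in the reverse of the per-class order a backwards stream would produce.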
Is this an acceptable change?
I decided against changing the implementation of `FieldStream` from reflectionUtils as it is used in JVMTI code too.
You can find my suggested implementation at [2].
Please let me know what you think about it, and also let me know if there are any relevant tests that I should run that don't run in GHA already.
If you agree with my changes, I will open a bug report and create a PR.

Thanks,
Hannes

[1] https://gist.github.com/SirYwell/73d8e3d679e5aa49a11ebefc868b4404
[2] https://github.com/SirYwell/jdk/commit/9814ca2aea8ebd7400e256b7430d3961a3692a83
