Re: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class

2023-08-01 Thread David Holmes
On Mon, 31 Jul 2023 15:27:36 GMT, Ashutosh Mehra  wrote:

>> @ashu-mehra you indicated that you had only done two basic manual tests to 
>> check the output. You need to check it for the cases that I flagged too. In 
>> the VM every top-level class is its own nest-host, but that is not expressed 
>> in a classfile attribute (it is just the defined semantics) so displaying 
>> this as-if it were an explicit attribute may not be right.
>
> @dholmes-ora I confirmed there is no nest-host or nest-members attributes 
> generated by this patch for a top level class which doesn't have any 
> nest-members. Is that what you wanted to verify?

@ashu-mehra That was one case. I also want to know that you have tested deeply 
nested classes; and hidden classes (if applicable). Thanks.

-

PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1659831895


Re: RFR: JDK-8306441: Two phase segmented heap dump [v21]

2023-08-01 Thread Kevin Walls
On Tue, 1 Aug 2023 05:50:42 GMT, Yi Yang  wrote:

>> ### Motivation and proposal
>> Hi, heap dump brings about pauses for application's execution(STW), this is 
>> a well-known pain. JDK-8252842 have added parallel support to heapdump in an 
>> attempt to alleviate this issue. However, all concurrent threads 
>> competitively write heap data to the same file, and more memory is required 
>> to maintain the concurrent buffer queue. In experiments, we did not feel a 
>> significant performance improvement from that.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-phase 
>> segmented heap dump:
>> 
>> - Phase 1(STW): Concurrent threads directly write data to multiple heap 
>> files.
>> - Phase 2(Non-STW): Merge multiple heap files into one complete heap dump 
>> file. This process can happen outside safepoint.
>> 
>> Now concurrent worker threads are not required to maintain a buffer queue, 
>> which would result in more memory overhead, nor do they need to compete for 
>> locks. The changes in the overall design are as follows:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/77e4764a-62b5-4336-8b45-fc880ba14c4a)
>> Fig1. Before
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/931ab874-64d1-4337-ae32-3066eed809fc)
>> Fig2. After this patch
>> 
>> ### Performance evaluation
>> | memory | numOfThread | CompressionMode | STW | Total |
>> | ---| --- | --- | --- |  |
>> | 8g | 1 T | N | 15.612 | 15.612 |
>> | 8g | 32 T | N | 2.561725 | 14.498 |
>> | 8g | 32 T | C1 | 2.3084878 | 14.198 |
>> | 8g | 32 T | C2 | 10.9355128 | 21.882 |
>> | 8g | 96 T | N | 2.6790452 | 14.012 |
>> | 8g | 96 T | C1 | 2.3044796 | 3.589 |
>> | 8g | 96 T | C2 | 9.7585151 | 20.219 |
>> | 16g | 1 T | N | 26.278 | 26.278 |
>> | 16g | 32 T | N | 5.231374 | 26.417 |
>> | 16g | 32 T | C1 | 5.6946983 | 6.538 |
>> | 16g | 32 T | C2 | 21.8211105 | 41.133 |
>> | 16g | 96 T | N | 6.2445556 | 27.141 |
>> | 16g | 96 T | C1 | 4.6007096 | 6.259 |
>> | 16g | 96 T | C2 | 19.2965783 | 39.007 |
>> | 32g | 1 T | N | 48.149 | 48.149 |
>> | 32g | 32 T | N | 10.7734677 | 61.643 |
>> | 32g | 32 T | C1 | 10.1642097 | 10.903 |
>> | 32g | 32 T | C2 | 43.8407607 | 88.152 |
>> | 32g | 96 T | N | 13.1522042 | 61.432 |
>> | 32g | 96 T | C1 | 9.0954641 | 9.885 |
>> | 32g | 96 T | C2 | 38.9900931 | 80.574 |
>> | 64g | 1 T | N | 100.583 | 100.583 |
>> | 64g | 32 T | N | 20.9233744 | 134.701 |
>> | 64g | 32 T | C1 | 18.5023784 | 19.358 |
>> | 64g | 32 T | C2 | 86.4748377 | 172.707 |
>> | 64g | 96 T | N | 26.7374116 | 126.08 |
>> | 64g | ...
>
> Yi Yang has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   test failure on mac

Thanks for the update -
Right, on the CSR I may have followed the links too far and been comparing with 
jmap features, not jcmd features. 8-)



On the default behaviour, running this change, with no -parallel= option, I 
have been seeing it default to a non-parallel dump, but now I see why:

I see num_dump_thread being set (e.g. 12 on my test system), but I see 
num_active_workers = 1 (requested is 12).

I just saw a run with no -parallel option which got num_active_workers = 2, so 
it did a parallel dump, so the default is being set as intended, but the 
availability of worker threads is not always predictable!

-Xlog:gc shows the GC was using fewer workers than expected.

So that suggested -XX:-UseDynamicNumberOfGCThreads and yes, with that I see 
e.g. num_active_workers = 23. req_num_dump_thread = 12 for my test.  And all 12 
threads work on the heap dump.


---

test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpTest.java:
we can't see if parallel dumping is happening
Can we add, like in I think another test: run testng/othervm -Xlog:heapdump 
HeapDumpTest

I would also suggest -XX:-UseDynamicNumberOfGCThreads otherwise as above, the 
GC can scale back and you don't always get a parallel dump when you expect it.


---

Do we really want -parallel=-1 
..to be a valid input, and to use the workers, when it's clearly an error.

HeapDumpDCmd::execute casts it to a unit so we lose the fact that it was a 
request for -1.
Can we get the signed value in HeapDumpDCmd::execute, print a message and 
return like we do when compression level is out of range?
(Passing a non-number when an integer is expected will cause the jcmd to fail, 
and that seems correct.)


---

So I think I'm done, looks good. 8-)

-

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1660469424


Re: RFR: JDK-8306441: Two phase segmented heap dump [v21]

2023-08-01 Thread Chris Plummer
On Tue, 1 Aug 2023 05:50:42 GMT, Yi Yang  wrote:

>> ### Motivation and proposal
>> Hi, heap dump brings about pauses for application's execution(STW), this is 
>> a well-known pain. JDK-8252842 have added parallel support to heapdump in an 
>> attempt to alleviate this issue. However, all concurrent threads 
>> competitively write heap data to the same file, and more memory is required 
>> to maintain the concurrent buffer queue. In experiments, we did not feel a 
>> significant performance improvement from that.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-phase 
>> segmented heap dump:
>> 
>> - Phase 1(STW): Concurrent threads directly write data to multiple heap 
>> files.
>> - Phase 2(Non-STW): Merge multiple heap files into one complete heap dump 
>> file. This process can happen outside safepoint.
>> 
>> Now concurrent worker threads are not required to maintain a buffer queue, 
>> which would result in more memory overhead, nor do they need to compete for 
>> locks. The changes in the overall design are as follows:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/77e4764a-62b5-4336-8b45-fc880ba14c4a)
>> Fig1. Before
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/931ab874-64d1-4337-ae32-3066eed809fc)
>> Fig2. After this patch
>> 
>> ### Performance evaluation
>> | memory | numOfThread | CompressionMode | STW | Total |
>> | ---| --- | --- | --- |  |
>> | 8g | 1 T | N | 15.612 | 15.612 |
>> | 8g | 32 T | N | 2.561725 | 14.498 |
>> | 8g | 32 T | C1 | 2.3084878 | 14.198 |
>> | 8g | 32 T | C2 | 10.9355128 | 21.882 |
>> | 8g | 96 T | N | 2.6790452 | 14.012 |
>> | 8g | 96 T | C1 | 2.3044796 | 3.589 |
>> | 8g | 96 T | C2 | 9.7585151 | 20.219 |
>> | 16g | 1 T | N | 26.278 | 26.278 |
>> | 16g | 32 T | N | 5.231374 | 26.417 |
>> | 16g | 32 T | C1 | 5.6946983 | 6.538 |
>> | 16g | 32 T | C2 | 21.8211105 | 41.133 |
>> | 16g | 96 T | N | 6.2445556 | 27.141 |
>> | 16g | 96 T | C1 | 4.6007096 | 6.259 |
>> | 16g | 96 T | C2 | 19.2965783 | 39.007 |
>> | 32g | 1 T | N | 48.149 | 48.149 |
>> | 32g | 32 T | N | 10.7734677 | 61.643 |
>> | 32g | 32 T | C1 | 10.1642097 | 10.903 |
>> | 32g | 32 T | C2 | 43.8407607 | 88.152 |
>> | 32g | 96 T | N | 13.1522042 | 61.432 |
>> | 32g | 96 T | C1 | 9.0954641 | 9.885 |
>> | 32g | 96 T | C2 | 38.9900931 | 80.574 |
>> | 64g | 1 T | N | 100.583 | 100.583 |
>> | 64g | 32 T | N | 20.9233744 | 134.701 |
>> | 64g | 32 T | C1 | 18.5023784 | 19.358 |
>> | 64g | 32 T | C2 | 86.4748377 | 172.707 |
>> | 64g | 96 T | N | 26.7374116 | 126.08 |
>> | 64g | ...
>
> Yi Yang has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   test failure on mac

test/hotspot/jtreg/serviceability/HeapDump/IntegrityHeapDumpTest.java line 84:

> 82: .addToolArg(heapDumpFile.getAbsolutePath());
> 83: 
> 84: ProcessBuilder processBuilder = new 
> ProcessBuilder(launcher.getCommand());

Can you explain this change? This will result in the test not being run with 
sudo in cases where sudo is necessary. Before we get here we already checked if 
we need sudo (not running as root) and verified that passwordless sudo is 
supported. Otherwise the test is skipped and would never get here. So if sudo 
is required, the test will fail to attach to the debuggee. Note you might not 
see this issue if you have DevSecurityMode enabled. See 
[JDK-8313357](https://bugs.openjdk.org/browse/JDK-8313357)

-

PR Review Comment: https://git.openjdk.org/jdk/pull/13667#discussion_r1280948269


Re: RFR: JDK-8306441: Two phase segmented heap dump [v21]

2023-08-01 Thread Chris Plummer
On Tue, 1 Aug 2023 17:32:55 GMT, Chris Plummer  wrote:

>> Yi Yang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   test failure on mac
>
> test/hotspot/jtreg/serviceability/HeapDump/IntegrityHeapDumpTest.java line 84:
> 
>> 82: .addToolArg(heapDumpFile.getAbsolutePath());
>> 83: 
>> 84: ProcessBuilder processBuilder = new 
>> ProcessBuilder(launcher.getCommand());
> 
> Can you explain this change? This will result in the test not being run with 
> sudo in cases where sudo is necessary. Before we get here we already checked 
> if we need sudo (not running as root) and verified that passwordless sudo is 
> supported. Otherwise the test is skipped and would never get here. So if sudo 
> is required, the test will fail to attach to the debuggee. Note you might not 
> see this issue if you have DevSecurityMode enabled. See 
> [JDK-8313357](https://bugs.openjdk.org/browse/JDK-8313357)

I see now that this is a new test, and not an SA test, so probably should never 
have been using SATestUtils in the first place (I'm curious as to how that 
ended up happening). You should also remove the import of SATestUtils.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/13667#discussion_r1280963213


Re: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class

2023-08-01 Thread Ashutosh Mehra
On Tue, 1 Aug 2023 08:33:24 GMT, David Holmes  wrote:

>> @dholmes-ora I confirmed there is no nest-host or nest-members attributes 
>> generated by this patch for a top level class which doesn't have any 
>> nest-members. Is that what you wanted to verify?
>
> @ashu-mehra That was one case. I also want to know that you have tested 
> deeply nested classes; and hidden classes (if applicable). Thanks.

@dholmes-ora I verified the case for hidden dynamically injected classes. The 
dumped class data for a hidden dynamically injected class does not have any 
Nest-Host attribute. When generating these classes dynamically the VM does not 
expose nest-host information in the class data, but sets the nest-host directly 
in the InstanceKlass.

Also verified the case for deeply nested classes by creating chain of nested 
classes as:

class DeepNest {
  class NestLvl1 {
class NestLvl2 {
  class NestLvl3 {
  }
}
  }
} 


Only `DeepNest` has the `NestMembers` attribute which lists all the 
NestLvl[1-3] classes. Rest all have `DeepNest` as the `NestHost` attribute.

Does this cover all the cases you flagged?

-

PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1661077708


Re: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class

2023-08-01 Thread David Holmes
On Tue, 1 Aug 2023 20:53:45 GMT, Ashutosh Mehra  wrote:

>> @ashu-mehra That was one case. I also want to know that you have tested 
>> deeply nested classes; and hidden classes (if applicable). Thanks.
>
> @dholmes-ora I verified the case for hidden dynamically injected classes. The 
> dumped class data for a hidden dynamically injected class does not have any 
> Nest-Host attribute. When generating these classes dynamically the VM does 
> not expose nest-host information in the class data, but sets the nest-host 
> directly in the InstanceKlass.
> 
> Also verified the case for deeply nested classes by creating chain of nested 
> classes as:
> 
> class DeepNest {
>   class NestLvl1 {
> class NestLvl2 {
>   class NestLvl3 {
>   }
> }
>   }
> } 
> 
> 
> Only `DeepNest` has the `NestMembers` attribute which lists all the 
> NestLvl[1-3] classes. Rest all have `DeepNest` as the `NestHost` attribute.
> 
> Does this cover all the cases you flagged?

@ashu-mehra thanks for doing the additional testing. Pity there is no 
regression/functional test for this.

-

PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1661502815