On Wed, 2 Aug 2023 11:33:17 GMT, Yi Yang <yy...@openjdk.org> wrote:

>> ### Motivation and proposal
>> Hi, heap dump brings about pauses for application's execution(STW), this is 
>> a well-known pain. JDK-8252842 have added parallel support to heapdump in an 
>> attempt to alleviate this issue. However, all concurrent threads 
>> competitively write heap data to the same file, and more memory is required 
>> to maintain the concurrent buffer queue. In experiments, we did not feel a 
>> significant performance improvement from that.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-phase 
>> segmented heap dump:
>> 
>> - Phase 1(STW): Concurrent threads directly write data to multiple heap 
>> files.
>> - Phase 2(Non-STW): Merge multiple heap files into one complete heap dump 
>> file. This process can happen outside safepoint.
>> 
>> Now concurrent worker threads are not required to maintain a buffer queue, 
>> which would result in more memory overhead, nor do they need to compete for 
>> locks. The changes in the overall design are as follows:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/77e4764a-62b5-4336-8b45-fc880ba14c4a)
>> <p align="center">Fig1. Before</p>
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/931ab874-64d1-4337-ae32-3066eed809fc)
>> <p align="center">Fig2. After this patch</p>
>> 
>> ### Performance evaluation
>> | memory | numOfThread | CompressionMode | STW | Total |
>> | -------| ----------- | --------------- | --- | ---- |
>> | 8g | 1 T | N | 15.612 | 15.612 |
>> | 8g | 32 T | N | 2.561725 | 14.498 |
>> | 8g | 32 T | C1 | 2.3084878 | 14.198 |
>> | 8g | 32 T | C2 | 10.9355128 | 21.882 |
>> | 8g | 96 T | N | 2.6790452 | 14.012 |
>> | 8g | 96 T | C1 | 2.3044796 | 3.589 |
>> | 8g | 96 T | C2 | 9.7585151 | 20.219 |
>> | 16g | 1 T | N | 26.278 | 26.278 |
>> | 16g | 32 T | N | 5.231374 | 26.417 |
>> | 16g | 32 T | C1 | 5.6946983 | 6.538 |
>> | 16g | 32 T | C2 | 21.8211105 | 41.133 |
>> | 16g | 96 T | N | 6.2445556 | 27.141 |
>> | 16g | 96 T | C1 | 4.6007096 | 6.259 |
>> | 16g | 96 T | C2 | 19.2965783 | 39.007 |
>> | 32g | 1 T | N | 48.149 | 48.149 |
>> | 32g | 32 T | N | 10.7734677 | 61.643 |
>> | 32g | 32 T | C1 | 10.1642097 | 10.903 |
>> | 32g | 32 T | C2 | 43.8407607 | 88.152 |
>> | 32g | 96 T | N | 13.1522042 | 61.432 |
>> | 32g | 96 T | C1 | 9.0954641 | 9.885 |
>> | 32g | 96 T | C2 | 38.9900931 | 80.574 |
>> | 64g | 1 T | N | 100.583 | 100.583 |
>> | 64g | 32 T | N | 20.9233744 | 134.701 |
>> | 64g | 32 T | C1 | 18.5023784 | 19.358 |
>> | 64g | 32 T | C2 | 86.4748377 | 172.707 |
>> | 64g | 96 T | N | 26.7374116 | 126.08 |
>> | 64g | ...
>
> Yi Yang has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   fix typo Reqeusted to Request

OK great.  
I think -parallel=0 could be intentional, meaning I don't want parallelism, I 
don't want parallel dump threads, that's a reasonable thing for a user to expect
(but parallel=-1 is clearly an error).  Can we just relax the constraint on 
that setting:

    if (parallel < 0) {
      output()->print_cr("Invalid number of parallel dump threads.");
      return;
          
test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpTest.java
will need an adjustment where it checks parallelNum and checks the error text

-------------------------------------

Do we really need IntegrityHeapDumpTest.java ?
I don't see that we do.

It runs a LingeredApp, runs "jcmd PID GC.heap_dump -parallel=NUMPROCESSORS 
FILENAME" and runs verifyHeapDump(heapDumpFile);


The existing test which you have enhanced:
test/hotspot/jtreg/serviceability/dcmd/gc/HeapDumpTest.java:

Uses PidJcmdExecutor() will attach to the current process.  It's different, but 
we don't need to test that jcmd can attach to a LingeredApp for this change, 
it's the processing of whatever jcmd command text that's important.

The verification/integrity checking method is the same.

The new pidParallel() method tests a variety of parallel heap dumps, which is 
good (more variation than IntegrityHeapDumpTest).

If HeapDumpTest.java becomes "@run testng/othervm -Xlog:heapdump HeapDumpTest" 
then we can check for the parallel heap dump logging, which is maybe the reason 
for having IntegrityHeapDumpTest:

            // Expect parallel heap dump
            if (Runtime.getRuntime().availableProcessors() > 1) {
                String output = theApp.getProcessStdout();
                Asserts.assertTrue(output.contains("Dump heap objects in 
parallel"));
                Asserts.assertTrue(output.contains("Merge heap files 
complete"));
            }


A problem is that it always now passes the -parallel=X option, but we should 
still have tests that do NOT pass this option, to ensure it's not required.
This probably means overloading the run method?  The run method that doesn't 
take an "int paralleNum" needs to not pass the -parallel= option.

This did not occur to me earlier, but it would be great to have this Heap Dump 
testing in one place.

-------------------------------

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1662090436

Reply via email to