On Thu, 23 Mar 2023 08:20:18 GMT, Johannes Bechberger <jbechber...@openjdk.org> 
wrote:

>> Fixes the issue by transitioning the thread into the WXWrite mode while 
>> walking the stack in AsyncGetCallTrace.
>> 
>> Tested on my M1 mac.
>
> Johannes Bechberger has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Use raw_thread instead of Thread::current()

I tested the execution time of individual ASGCT calls on the renaissance 
benchmark using 
[asgct_perf_test](https://github.com/parttimenerd/asgct_perf_test) which I 
created for this purpose.

TLDR: No relevant performance difference when disabling cache modification for 
ASGCT via a field in `Thread`.

On Linux (64-core machine; all timings are in microseconds), with cache 
modification disabled we get:

bucket         %       count      min     mean      max      std std/mean   median     90th     99th
overall      100    81265409     0.04     4.07 59334.88    32.23     7.93     2.62     7.14    23.20

vs. the current OpenJDK head:

overall      100    81607301     0.03     3.96 83194.84    30.49     7.69     2.57     7.00    22.27

The means differ by just 110 ns (4.07 µs vs. 3.96 µs), which should be 
undetectable in real life.
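
For reference, the arithmetic behind that number can be written out as a small C++ sketch. The struct and function names are illustrative only (they are not part of asgct_perf_test); the values are copied from the two Linux rows above.

```cpp
#include <cmath>

// Illustrative container for a few columns of one "overall" row (microseconds).
struct Summary {
  double mean_us;
  double median_us;
  double p99_us;
};

// Values taken from the Linux rows above.
const Summary linux_no_cache_update{4.07, 2.62, 23.20};
const Summary linux_head{3.96, 2.57, 22.27};

// Absolute difference of the means, converted from microseconds to nanoseconds.
double mean_delta_ns(const Summary& a, const Summary& b) {
  return (a.mean_us - b.mean_us) * 1000.0;
}

// Difference of the means relative to the baseline b, in percent.
double relative_mean_delta_percent(const Summary& a, const Summary& b) {
  return (a.mean_us - b.mean_us) / b.mean_us * 100.0;
}
```

`mean_delta_ns(linux_no_cache_update, linux_head)` gives roughly 110 ns, a bit under 3% of the baseline mean and far below the standard deviation of about 30 µs in both runs.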

With cache modification disabled on a Mac M1 we get:

overall      100    39281484     0.00     1.39 94833.66    36.48    26.21     0.92     2.08     9.46

vs with the current JDK and [the fix adapted from 
async-profiler](https://github.com/async-profiler/async-profiler/blob/e3b7bfca227ae5c916f00abfacf0e61291df3a67/src/profiler.cpp#L383)
 which sets the WX mode:

overall      100    37725188     0.00     1.40 110634.45    29.39    20.92     0.92     2.17    10.21

So the change doesn't alter the performance characteristics on the M1 either.

I'm therefore pushing my new change to this PR, which prevents ASGCT from 
modifying any PcDesc cache.
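
The general idea (a per-thread flag that makes lookups read-only with respect to a cache) can be sketched as follows. This is a simplified, self-contained illustration and not the HotSpot code: `in_async_stack_walk`, `AsyncWalkMark`, `Desc` and `DescTable` are hypothetical names, and a `thread_local` variable stands in for the field in `Thread`.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-thread field; not the real HotSpot name.
thread_local bool in_async_stack_walk = false;

// RAII guard: marks the current thread as walking the stack from a signal
// handler and restores the previous state when it goes out of scope.
class AsyncWalkMark {
  bool _saved;
 public:
  AsyncWalkMark() : _saved(in_async_stack_walk) { in_async_stack_walk = true; }
  ~AsyncWalkMark() { in_async_stack_walk = _saved; }
};

// Hypothetical descriptor, loosely modelled on "a record found by pc offset".
struct Desc {
  int pc_offset;
  int payload;
};

class DescTable {
  std::vector<Desc> _descs;         // full table, never resized after construction
  const Desc* _last_hit = nullptr;  // one-entry lookup cache
 public:
  int cache_updates = 0;            // counts cache writes, for illustration only

  explicit DescTable(std::vector<Desc> descs) : _descs(std::move(descs)) {}

  const Desc* find(int pc_offset) {
    // Reading the cache is always allowed.
    if (_last_hit != nullptr && _last_hit->pc_offset == pc_offset) {
      return _last_hit;
    }
    // Slow path: linear search over the full table.
    for (const Desc& d : _descs) {
      if (d.pc_offset == pc_offset) {
        // Only write to the cache outside of an async stack walk.
        if (!in_async_stack_walk) {
          _last_hit = &d;
          ++cache_updates;
        }
        return &d;
      }
    }
    return nullptr;
  }
};
```

A lookup made while an `AsyncWalkMark` is alive still returns the right entry but leaves the cache untouched, so the walker never writes to memory shared with the interrupted code; the price is an occasional slow-path search, which the measurements above suggest is negligible.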

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13144#issuecomment-1482570550
