> This optimization is a followup to https://github.com/openjdk/jdk/pull/24290 
> trying to reduce the performance regression in some scenarios introduced in 
> https://bugs.openjdk.org/browse/JDK-8292818 . Based both on performance and 
> memory consumption it is a (better) alternative to 
> https://github.com/openjdk/jdk/pull/24713 .
> 
> This PR optimizes local field lookup in classes with more than 16 fields: 
> rather than iterating sequentially through all fields during lookup, we sort 
> the fields by field name. The stream includes an extra table after the 
> field information: for the fields at positions 16, 32, ... we record the 
> (variable-length-encoded) offset of the field info in this stream. On field 
> lookup, rather than iterating through all fields, we iterate through this 
> table, resolve the names of the recorded fields, and continue field-by-field 
> iteration only from the last applicable record (hence over at most 16 fields).
> 
> In classes with <= 16 fields this PR reduces memory consumption by 1 byte: 
> the byte that was left with value 0 at the end of the stream. In classes with 
> > 16 fields we add an extra 4 bytes holding the offset of the table, and the 
> table contains one varint for each 16 fields. The terminal byte is not used 
> there either.
> 
> My measurements on the attached reproducer:
> 
> hyperfine -w 50 -r 100 '/path/to/jdk-17/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk-17/bin/java -cp /tmp CCC
>   Time (mean ± σ):      51.3 ms ±   2.8 ms    [User: 44.7 ms, System: 13.7 ms]
>   Range (min … max):    45.1 ms …  53.9 ms    100 runs
> 
> hyperfine -w 50 -r 100 '/path/to/jdk25-master/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-master/bin/java -cp /tmp CCC
>   Time (mean ± σ):      78.2 ms ±   1.0 ms    [User: 74.6 ms, System: 17.3 ms]
>   Range (min … max):    73.8 ms …  79.7 ms    100 runs
> 
> (the jdk25-master above already contains JDK-8353175)
> 
> hyperfine -w 50 -r 100 '/path/to/jdk25-this-pr/bin/java -cp /tmp CCC'
> Benchmark 1: /path/to/jdk25-this-pr/jdk/bin/java -cp /tmp CCC
>   Time (mean ± σ):      38.5 ms ±   0.5 ms    [User: 34.4 ms, System: 17.3 ms]
>   Range (min … max):    37.7 ms …  42.1 ms    100 runs
> 
> While https://github.com/openjdk/jdk/pull/24713 returned the performance to 
> previous levels, this PR improves it by 25% compared to JDK 17 (which does 
> not contain the regression)! This time, the undisclosed production-grade 
> reproducer shows an even higher improvement:
> 
> JDK 17: 1.6 s
> JDK 21 (no patches): 22 s
> JDK25-master: 12.3 s
> JDK25-this-pr: 0.5 s
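
The lookup scheme described in the quoted text can be illustrated with a minimal Java sketch. It is not the HotSpot implementation: the class `SortedFieldLookup`, its `find` method, the use of plain `String` names with lexicographic ordering, and the array of positions standing in for the varint-encoded stream offsets are all assumptions made for illustration. Only the stride of 16 and the overall idea (sorted fields, a side table for every 16th field, then a short linear scan) come from the description above.

```java
import java.util.Arrays;

// Illustrative sketch only; names and data layout are hypothetical.
final class SortedFieldLookup {
    // From the PR text: one table record for the fields at positions 16, 32, ...
    static final int STRIDE = 16;

    private final String[] sortedNames; // stand-in for the sorted field stream
    private final int[] table;          // stand-in for the table of offsets

    SortedFieldLookup(String[] names) {
        sortedNames = names.clone();
        Arrays.sort(sortedNames);
        // Classes with <= 16 fields get an empty table.
        int records = sortedNames.length == 0 ? 0 : (sortedNames.length - 1) / STRIDE;
        table = new int[records];
        for (int k = 0; k < records; k++) {
            table[k] = (k + 1) * STRIDE;
        }
    }

    // Returns the position of the field in sorted order, or -1 if absent.
    int find(String name) {
        int start = 0;
        // Walk the table: remember the last record whose name does not
        // sort after the name we are looking for.
        for (int pos : table) {
            if (sortedNames[pos].compareTo(name) > 0) {
                break;
            }
            start = pos;
        }
        // Field-by-field scan over at most STRIDE entries.
        int end = Math.min(start + STRIDE, sortedNames.length);
        for (int i = start; i < end; i++) {
            if (sortedNames[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] names = new String[40];
        for (int i = 0; i < names.length; i++) {
            names[i] = String.format("f%02d", names.length - 1 - i); // unsorted input
        }
        SortedFieldLookup lookup = new SortedFieldLookup(names);
        System.out.println(lookup.find("f17"));
        System.out.println(lookup.find("missing"));
    }
}
```

With 40 fields the sketch builds two table records (positions 16 and 32), so a lookup inspects at most two records plus 16 fields instead of up to 40 fields.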

Radim Vansa has refreshed the contents of this pull request, and previous 
commits have been removed. The incremental views will show differences compared 
to the previous content of the PR. The pull request contains two new commits 
since the last revision:

 - Add type cast
 - Fix static_assert

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24847/files
  - new: https://git.openjdk.org/jdk/pull/24847/files/9cba2d4a..c592ea59

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=16
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24847&range=15-16

  Stats: 53 lines in 4 files changed: 0 ins; 47 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/24847.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24847/head:pull/24847

PR: https://git.openjdk.org/jdk/pull/24847
