> This patch expands the use of a hash table for secondary superclasses > to the interpreter, C1, and runtime. It also adds a C2 implementation > of hashed lookup in cases where the superclass isn't known at compile > time. > > HotSpot shared runtime > ---------------------- > > Building hashed secondary tables is now unconditional. It takes very > little time, and now that the shared runtime always has the tables, it > might as well take advantage of them. The shared code is easier to > follow now, I think. > > There might be a performance issue with x86-64 in that we build > HotSpot for a default x86-64 target that does not support popcount. > This means that HotSpot C++ runtime on x86 always uses a software > emulation for popcount, even though the vast majority of machines made > for the past 20 years can do popcount in a single instruction. It > wouldn't be terribly hard to do something about that. > > Having said that, the software popcount is really not bad. > > x86 > --- > > x86 is rather tricky, because we still support > `-XX:-UseSecondarySupersTable` and `-XX:+UseSecondarySupersCache`, as > well as 32- and 64-bit ports. There's some further complication in > that only `RCX` can be used as a shift count, so there's some register > shuffling to do. All of this makes the logic in macroAssembler_x86.cpp > rather gnarly, with multiple levels of conditionals at compile time > and runtime. > > AArch64 > ------- > > AArch64 is considerably more straightforward. We always have a > popcount instruction and (thankfully) no 32-bit code to worry about. > > Generally > --------- > > I would dearly love simply to rip out the "old" secondary supers cache > support, but I've left it in just in case someone has a performance > regression. > > The versions of `MacroAssembler::lookup_secondary_supers_table` that > work with variable superclasses don't take a fixed set of temp > registers, and neither do they call out to to a slow path subroutine. > Instead, the slow patch is expanded inline. > > I don't think this is necessarily bad. Apart from the very rare cases > where C2 can't determine the superclass to search for at compile time, > this code is only used for generating stubs, and it seemed to me > ridiculous to have stubs calling other stubs. > > I've followed the guidance from @iwanowww not to obsess too much about > the performance of C1-compiled secondary supers lookups, and to prefer > simplicity over absolute performance. Nonetheless, this is a > complicated patch that touches many areas.
Andrew Haley has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 40 commits: - Merge branch 'clean' into JDK-8331658-work - Cleanup - temp - Merge branch 'JDK-8331658-work' of https://github.com/theRealAph/jdk into JDK-8331658-work - Review comments - Review comments - Review comments - Review comments - Review feedback - Review feedback - ... and 30 more: https://git.openjdk.org/jdk/compare/34ee06f5...248f44dc ------------- Changes: https://git.openjdk.org/jdk/pull/19989/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19989&range=07 Stats: 991 lines in 19 files changed: 754 ins; 116 del; 121 mod Patch: https://git.openjdk.org/jdk/pull/19989.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19989/head:pull/19989 PR: https://git.openjdk.org/jdk/pull/19989