On Mon, 16 Sep 2024 13:28:00 GMT, Roman Kennke <rken...@openjdk.org> wrote:

>> This is the main body of the JEP 450: Compact Object Headers (Experimental).
>> 
>> It is also a follow-up to #20640, which now also includes (and supersedes) 
>> #20603 and #20605, plus the Tiny Class-Pointers parts that have been 
>> previously missing.
>> 
>> Main changes:
>>  - Introduction of the (experimental) flag UseCompactObjectHeaders. All 
>> changes in this PR are protected by this flag. The purpose of the flag is to 
>> provide a fallback, in case that users unexpectedly observe problems with 
>> the new implementation. The intention is that this flag will remain 
>> experimental and opt-in for at least one release, then make it on-by-default 
>> and diagnostic (?), and eventually deprecate and obsolete it. However, there 
>> are a few unknowns in that plan, specifically, we may want to further 
>> improve compact headers to 4 bytes, we are planning to enhance the Klass* 
>> encoding to support virtually unlimited number of Klasses, at which point we 
>> could also obsolete UseCompressedClassPointers.
>>  - The compressed Klass* can now be stored in the mark-word of objects. In 
>> order to be able to do this, we are add some changes to GC forwarding (see 
>> below) to protect the relevant (upper 22) bits of the mark-word. Significant 
>> parts of this PR deal with loading the compressed Klass* from the mark-word. 
>> This PR also changes some code paths (mostly in GCs) to be more careful when 
>> accessing Klass* (or mark-word or size) to be able to fetch it from the 
>> forwardee in case the object is forwarded.
>>  - Self-forwarding in GCs (which is used to deal with promotion failure) now 
>> uses a bit to indicate 'self-forwarding'. This is needed to preserve the 
>> crucial Klass* bits in the header. This also allows to get rid of 
>> preserved-header machinery in SerialGC and G1 (Parallel GC abuses 
>> preserved-marks to also find all other relevant oops).
>>  - Full GC forwarding now uses an encoding similar to compressed-oops. We 
>> have 40 bits for that, and can encode up to 8TB of heap. When exceeding 8TB, 
>> we turn off UseCompressedClassPointers (except in ZGC, which doesn't use the 
>> GC forwarding at all).
>>  - Instances can now have their base-offset (the offset where the field 
>> layouter starts to place fields) at offset 8 (instead of 12 or 16).
>>  - Arrays will now store their length at offset 8.
>>  - CDS can now write and read archives with the compressed header. However, 
>> it is not possible to read an archive that has been written with an opposite 
>> setting of UseCompactObjectHeaders. Some build machinery is added so that 
>> _co...
>
> Roman Kennke has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 54 commits:
> 
>  - Merge remote-tracking branch 'origin/master' into JDK-8305895-v4
>  - Fix  
> test/hotspot/jtreg/runtime/CompressedOops/CompressedClassPointersEncodingScheme.java
>  - Fix loop on aarch64
>  - clarify obscure assert in metasapce setup
>  - Rework compressedklass encoding
>  - remove stray debug output
>  - Fixes post 8338526
>  - Merge commit '597788850041e7272a23714c05ba546ee6080856' into JDK-8305895-v4
>  - Various touch-ups
>  - Hide log timestamps in test to prevent false failures
>  - ... and 44 more: https://git.openjdk.org/jdk/compare/54595188...2125cd81

src/hotspot/cpu/aarch64/aarch64.ad line 6459:

> 6457:   format %{ "ldrw  $dst, $mem\t# compressed class ptr" %}
> 6458:   ins_encode %{
> 6459:     __ load_nklass_compact_c2($dst$$Register, $mem$$base$$Register, 
> $mem$$index$$Register, $mem$$scale, $mem$$disp);

I wonder if something along the line of this is required here.
Suggestion:

    Address addr = mem2address($mem->opcode(), $mem$$base$$Register, 
$mem$$index, $mem$$scale, $mem$$disp);
    __ load_nklass_compact_c2($dst$$Register, __ 
adjust_compact_object_header_address_c2(addr, rscratch1));

With `adjust_compact_object_header_address_c2` being:
```C++
Address C2_MacroAssembler::adjust_compact_object_header_address_c2(Address 
addr, Register tmp) {
  // The incoming address is pointing into obj-start + klass_offset_in_bytes. 
We need to extract
  // obj-start, so that we can load from the object's mark-word instead. 
Usually the address
  // comes as obj-start in addr.base() and klass_offset_in_bytes in 
addr.offset().
  if (addr.getMode() != Address::base_plus_offset) {
    lea(tmp, addr);
    addr = Address(tmp, -oopDesc::klass_offset_in_bytes());
  } else {
    addr = Address(addr.base(), addr.offset() - 
oopDesc::klass_offset_in_bytes());
  }
  return legitimize_address(addr, 8, tmp);
}


Maybe it is the case that we never get the case where `$mem->opcode()` is not 
`lsl` variant, nor that the offset is to far away for an immediate fixed by 
`legitimize_address`. But it seems like this would at least make those cases 
correct, while avoiding the `lea` in the common case. 

Maybe someone with better experience in aarch64 macroassembler+ad files and C2 
can give an opinion.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20677#discussion_r1761455581

Reply via email to