On Tue, 16 Jul 2024 12:36:08 GMT, Roman Kennke <rken...@openjdk.org> wrote:

>> Axel Boldt-Christmas has updated the pull request incrementally with 10
>> additional commits since the last revision:
>>
>>  - Remove try_read
>>  - Add explicit to single parameter constructors
>>  - Remove superfluous access specifier
>>  - Remove unused include
>>  - Update assert message OMCache::set_monitor
>>  - Fix indentation
>>  - Remove outdated comment LightweightSynchronizer::exit
>>  - Remove logStream include
>>  - Remove strange comment
>>  - Fix javaThread include
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 674:
>
>> 672:
>> 673:     // Search for obj in cache.
>> 674:     bind(loop);
>
> Same loop transformation would be possible here.

I tried the following (see the diff below), and it shows a regression of about 5-10% in most of the `LockUnlock.testInflated*` micros. I also tried with just `num_unrolled = 1` and saw the same regression. Maybe there was some other pattern you were thinking of.

There are probably architecture and platform differences. This can and should probably be explored in a follow-up PR.

diff --git a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
index 5dbfdbc225d..4e6621cfece 100644
--- a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
@@ -663,25 +663,28 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box, Regist
 
       const int num_unrolled = 2;
       for (int i = 0; i < num_unrolled; i++) {
-        cmpptr(obj, Address(t));
-        jccb(Assembler::equal, monitor_found);
-        increment(t, in_bytes(OMCache::oop_to_oop_difference()));
+        Label next;
+        cmpptr(obj, Address(t, OMCache::oop_to_oop_difference() * i));
+        jccb(Assembler::notEqual, next);
+        increment(t, in_bytes(OMCache::oop_to_oop_difference() * i));
+        jmpb(monitor_found);
+        bind(next);
       }
+      increment(t, in_bytes(OMCache::oop_to_oop_difference() * (num_unrolled - 1)));
 
       Label loop;
 
       // Search for obj in cache.
       bind(loop);
-
-      // Check for match.
-      cmpptr(obj, Address(t));
-      jccb(Assembler::equal, monitor_found);
-
+      // Advance.
+      increment(t, in_bytes(OMCache::oop_to_oop_difference()));
       // Search until null encountered, guaranteed _null_sentinel at end.
       cmpptr(Address(t), 1);
       jcc(Assembler::below, slow_path); // 0 check, but with ZF=0 when *t == 0
-      increment(t, in_bytes(OMCache::oop_to_oop_difference()));
-      jmpb(loop);
+
+      // Check for match.
+      cmpptr(obj, Address(t));
+      jccb(Assembler::notEqual, loop);
 
       // Cache hit.
       bind(monitor_found);

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1715249312
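
For readers who want to compare the two loop shapes without reading assembler calls, here is a rough plain C++ model of the control flow before and after the diff. It is only a sketch: `Cache`, `kCapacity`, `kMiss`, `find_original` and `find_rotated` are hypothetical names, the cache size of 8 is an assumption made for illustration, and an array index stands in for the `t` register. It shows that both shapes find the same slot. It says nothing about the measured regression, which depends on the generated code and the hardware.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the per-thread cache: kCapacity slots followed
// by a null sentinel, so a linear scan always terminates.
constexpr std::size_t kCapacity = 8;    // assumed size, for illustration only
constexpr std::size_t kNumUnrolled = 2; // mirrors num_unrolled in the diff
using Cache = std::array<std::uintptr_t, kCapacity + 1>;  // last slot stays 0

constexpr std::ptrdiff_t kMiss = -1;    // models the jump to slow_path

// Shape before the diff: match check at the loop top, unconditional jump back.
std::ptrdiff_t find_original(const Cache& c, std::uintptr_t obj) {
  assert(obj != 0);  // 0 is reserved for the sentinel
  std::size_t t = 0;
  for (std::size_t i = 0; i < kNumUnrolled; i++) {
    if (c[t] == obj) return static_cast<std::ptrdiff_t>(t);
    t++;
  }
  for (;;) {
    if (c[t] == obj) return static_cast<std::ptrdiff_t>(t);  // check for match
    if (c[t] == 0) return kMiss;                             // sentinel reached
    t++;                                                     // advance, jump back
  }
}

// Shape after the diff: advance first, and a conditional branch closes the loop.
std::ptrdiff_t find_rotated(const Cache& c, std::uintptr_t obj) {
  assert(obj != 0);  // 0 is reserved for the sentinel
  std::size_t t = 0;
  for (std::size_t i = 0; i < kNumUnrolled; i++) {
    // Compare at an offset; t itself is only adjusted on a hit.
    if (c[t + i] == obj) return static_cast<std::ptrdiff_t>(t + i);
  }
  t += kNumUnrolled - 1;  // point at the last slot already checked
  do {
    t++;                          // advance
    if (c[t] == 0) return kMiss;  // sentinel reached
  } while (c[t] != obj);          // back edge taken while there is no match
  return static_cast<std::ptrdiff_t>(t);
}
```

In this model the rotated form executes one branch per iteration on the back edge instead of a conditional branch plus an unconditional jump, which is the usual motivation for this kind of rotation.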