On Wed, 30 Oct 2024 18:43:45 GMT, Maurizio Cimadamore <mcimadam...@openjdk.org> 
wrote:

>> This PR proposes to improve `MemorySegment::ofBuffer` making it more 
>> amenable to inlining and generally improving performance.
>> 
>> Testing successfully on tier1-3
>
> It would be great if we could find a benchmark where inlining doesn't happen 
> and that causes escape analysis issue. @Spasi do you have anything on that 
> front?

Hey @mcimadamore,

Yes, a simple modification of the original benchmark demonstrates it nicely:


import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.*;

import java.lang.foreign.*;
import java.nio.*;

@State(Scope.Benchmark)
public class FFMOfBufferTest {

    private ByteBuffer buffer = ByteBuffer
        .allocateDirect(0x1000)
        .order(ByteOrder.nativeOrder());

    @Benchmark
    public void ofBuffer(Blackhole bh) {
        bh.consume(MemorySegment.ofBuffer(buffer).address());
    }

    @Benchmark
    @Fork(jvmArgsAppend =
        "-XX:CompileCommand=inline,jdk.internal.foreign.AbstractMemorySegmentImpl::ofBuffer,false")
    public void ofBufferInlineFalse(Blackhole bh) {
        bh.consume(MemorySegment.ofBuffer(buffer).address());
    }

    @Benchmark
    @Fork(jvmArgsAppend =
        "-XX:CompileCommand=inline,jdk.internal.foreign.AbstractMemorySegmentImpl::ofBuffer,true")
    public void ofBufferInlineTrue(Blackhole bh) {
        bh.consume(MemorySegment.ofBuffer(buffer).address());
    }

}


Now the `.address()` is consumed instead of the buffer itself. Results on 
`24-beta+20`:


Benchmark                                               Mode  Cnt      Score      Error   Units
FFMOfBufferTest.ofBuffer                                avgt    3      6,542 ±    3,476   ns/op
FFMOfBufferTest.ofBuffer:gc.alloc.rate                  avgt    3  10499,839 ± 5496,416  MB/sec
FFMOfBufferTest.ofBuffer:gc.alloc.rate.norm             avgt    3     72,000 ±    0,001    B/op
FFMOfBufferTest.ofBuffer:gc.count                       avgt    3     17,000             counts
FFMOfBufferTest.ofBuffer:gc.time                        avgt    3     15,000                 ms
FFMOfBufferTest.ofBufferInlineFalse                     avgt    3      6,437 ±    1,909   ns/op
FFMOfBufferTest.ofBufferInlineFalse:gc.alloc.rate       avgt    3  10666,676 ± 3133,711  MB/sec
FFMOfBufferTest.ofBufferInlineFalse:gc.alloc.rate.norm  avgt    3     72,000 ±    0,001    B/op
FFMOfBufferTest.ofBufferInlineFalse:gc.count            avgt    3     17,000             counts
FFMOfBufferTest.ofBufferInlineFalse:gc.time             avgt    3     14,000                 ms
FFMOfBufferTest.ofBufferInlineTrue                      avgt    3      0,882 ±    0,507   ns/op
FFMOfBufferTest.ofBufferInlineTrue:gc.alloc.rate        avgt    3      2,157 ±   67,949  MB/sec
FFMOfBufferTest.ofBufferInlineTrue:gc.alloc.rate.norm   avgt    3      0,002 ±    0,061    B/op
FFMOfBufferTest.ofBufferInlineTrue:gc.count             avgt    3        ≈ 0             counts
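
For anyone who wants to poke at the measured pattern outside JMH, here is a minimal standalone sketch. It assumes JDK 22 or later, where `java.lang.foreign` is final. The class name `OfBufferSketch` and the helpers `addressOf` and `sizeOf` are hypothetical names used only for illustration. It checks only functional behavior (a stable address and the expected size), not allocation or timing; those still need the JMH run above.

```java
import java.lang.foreign.MemorySegment;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class OfBufferSketch {

    // Hypothetical helper mirroring the benchmark body: the segment is created,
    // queried for its address, and never escapes this method.
    static long addressOf(ByteBuffer buffer) {
        return MemorySegment.ofBuffer(buffer).address();
    }

    // Hypothetical helper: size of the segment view, which covers the
    // buffer's remaining bytes (position to limit).
    static long sizeOf(ByteBuffer buffer) {
        return MemorySegment.ofBuffer(buffer).byteSize();
    }

    public static void main(String[] args) {
        // Same buffer setup as in the benchmark.
        ByteBuffer buffer = ByteBuffer
            .allocateDirect(0x1000)
            .order(ByteOrder.nativeOrder());

        long first = addressOf(buffer);
        long second = addressOf(buffer);

        // Each call wraps the same native memory, so the address is stable.
        System.out.println("same address: " + (first == second));
        System.out.println("byte size:    " + sizeOf(buffer));
    }
}
```

Because only the `long` address leaves `addressOf`, the segment itself is a candidate for scalar replacement once `ofBuffer` is inlined, which is what the `ofBufferInlineTrue` numbers suggest.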

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21764#issuecomment-2448325164
