On Fri, 17 Jan 2025 14:58:37 GMT, Jorn Vernee <jver...@openjdk.org> wrote:

> Could you add the benchmark you're using to the PR as well? 

Done. I slotted it into the "points" benchmark suite. I had to define another 
struct, "DoublePoint", though, since the existing int/int pair gets packed into 
a long.

Full disclosure: I'm not sure how to run it inside the JDK build structure, so 
I ran it outside instead and can only hope it builds in-tree. For me, 
`make test TEST="micro:java.lang.foreign.points"` fails with 
`Error: Unable to access jarfile /Users/mernst/IdeaProjects/jdk/build/macosx-aarch64-server-fastdebug/images/test/micro/benchmarks.jar`.


It exercises a loop like this:

struct DoublePoint { double x; double y; }
DoublePoint unit_rotate(double phi);                 // HFA return, requires intermediate buffer
void unit_rotate_ptr(DoublePoint* out, double phi);  // reference, no intermediate buffer

DoublePoint *points = new DoublePoint[N];
for (i in 0...N) points[i] = unit_rotate(2*pi*i/N);
vs
for (i in 0...N) unit_rotate_ptr(points+i, 2*pi*i/N);


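With the change, the by-value call is almost competitive with the by-pointer 
call (13.2 vs 9.1 ns/op, down from 46.5 ns/op), and the memory profile looks a 
lot better (0.224 B/op instead of 160.224 B/op, 2 GCs instead of 116):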

# VM version: JDK 25-ea, OpenJDK 64-Bit Server VM, 25-ea+3-283
Benchmark                                        Mode  Cnt      Score      Error   Units
PointsAlloc.circle_by_ptr                        avgt    5      8.964 ±   0.351   ns/op
PointsAlloc.circle_by_ptr:·gc.alloc.rate         avgt    5     95.301 ±   3.665  MB/sec
PointsAlloc.circle_by_ptr:·gc.alloc.rate.norm    avgt    5      0.224 ±   0.001    B/op
PointsAlloc.circle_by_ptr:·gc.count              avgt    5      2.000            counts
PointsAlloc.circle_by_ptr:·gc.time               avgt    5      3.000                ms
PointsAlloc.circle_by_value                      avgt    5     46.498 ±   2.336   ns/op
PointsAlloc.circle_by_value:·gc.alloc.rate       avgt    5  13141.578 ± 650.425  MB/sec
PointsAlloc.circle_by_value:·gc.alloc.rate.norm  avgt    5    160.224 ±   0.001    B/op
PointsAlloc.circle_by_value:·gc.count            avgt    5    116.000            counts
PointsAlloc.circle_by_value:·gc.time             avgt    5     44.000                ms

# VM version: JDK 25-internal, OpenJDK 64-Bit Server VM, 25-internal-adhoc.mernst.jdk
Benchmark                                        Mode  Cnt   Score    Error   Units
PointsAlloc.circle_by_ptr                        avgt    5   9.108 ±  0.477   ns/op
PointsAlloc.circle_by_ptr:·gc.alloc.rate         avgt    5  93.792 ±  4.898  MB/sec
PointsAlloc.circle_by_ptr:·gc.alloc.rate.norm    avgt    5   0.224 ±  0.001    B/op
PointsAlloc.circle_by_ptr:·gc.count              avgt    5   2.000           counts
PointsAlloc.circle_by_ptr:·gc.time               avgt    5   4.000               ms
PointsAlloc.circle_by_value                      avgt    5  13.180 ±  0.611   ns/op
PointsAlloc.circle_by_value:·gc.alloc.rate       avgt    5  64.816 ±  2.964  MB/sec
PointsAlloc.circle_by_value:·gc.alloc.rate.norm  avgt    5   0.224 ±  0.001    B/op
PointsAlloc.circle_by_value:·gc.count            avgt    5   2.000           counts
PointsAlloc.circle_by_value:·gc.time             avgt    5   5.000               ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23142#issuecomment-2599586149
