On Wed, 15 Jan 2025 21:39:05 GMT, Matthias Ernst <d...@openjdk.org> wrote:

> Certain signatures for foreign function calls (e.g. HVA return by value) 
> require allocating an intermediate buffer to adapt the FFM calling 
> convention to the native stub's calling convention. In the current 
> implementation, this buffer is malloc'ed and freed on every FFM invocation, 
> a non-negligible overhead.
> 
> Sample stack trace:
> 
>    java.lang.Thread.State: RUNNABLE
>       at jdk.internal.misc.Unsafe.allocateMemory0(java.base@25-ea/Native Method)
> ...
>       at jdk.internal.foreign.abi.SharedUtils.newBoundedArena(java.base@25-ea/SharedUtils.java:386)
>       at jdk.internal.foreign.abi.DowncallStub/0x000001f001084c00.invoke(java.base@25-ea/Unknown Source)
> ...
>       at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@25-ea/Invokers$Holder)
> 
> 
> To alleviate this, this PR implements a per carrier-thread stacked allocator.
> 
> Performance (MBA M3):
> 
> 
> Before:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   10   3.333 ± 0.152  ns/op
> CallOverheadByValue.byValue  avgt   10  33.892 ± 0.034  ns/op
> 
> After:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   30   3.311 ± 0.034  ns/op
> CallOverheadByValue.byValue  avgt   30   6.143 ± 0.053  ns/op
> 
> 
> `-prof gc` also shows that the new call path is fully scalar-replaced, 
> versus 160 bytes/call allocated before.
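
For readers unfamiliar with the term, the "stacked allocator" idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the changeset: the names `StackedAllocatorSketch`, `BufferStack` and `Frame` are hypothetical, a plain `ThreadLocal` stands in for the per-carrier-thread storage, and a direct `ByteBuffer` stands in for the native scratch memory. Each call pushes a frame, bump-allocates from a preallocated per-thread buffer, and pops the frame on exit, so the common path needs no malloc/free pair.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; all names here are hypothetical, not JDK internals.
final class StackedAllocatorSketch {

    /** A preallocated buffer that hands out slices in stack (LIFO) order. */
    static final class BufferStack {
        private final ByteBuffer backing;
        private int top = 0; // offset of the first free byte

        BufferStack(int capacity) {
            backing = ByteBuffer.allocateDirect(capacity);
        }

        /** Remembers the stack top on entry and restores it on close. */
        final class Frame implements AutoCloseable {
            private final int savedTop = top;

            /** alignment must be a power of two; it applies to the offset in the buffer. */
            ByteBuffer allocate(int size, int alignment) {
                int start = (top + alignment - 1) & -alignment;
                if (start + size > backing.capacity()) {
                    // Slow path: the request does not fit, fall back to a fresh buffer.
                    return ByteBuffer.allocateDirect(size);
                }
                top = start + size;
                return backing.slice(start, size);
            }

            @Override
            public void close() {
                top = savedTop; // pops everything this frame allocated
            }
        }

        Frame push() {
            return new Frame();
        }

        int top() {
            return top;
        }
    }

    // Assumption: one stack per thread; 256 bytes is an arbitrary size for the sketch.
    static final ThreadLocal<BufferStack> STACK =
            ThreadLocal.withInitial(() -> new BufferStack(256));

    /** Usage pattern: scratch space lives only for the duration of one call. */
    static long callWithScratch(long value) {
        try (BufferStack.Frame frame = STACK.get().push()) {
            ByteBuffer scratch = frame.allocate(16, 8);
            scratch.putLong(0, value);
            return scratch.getLong(0);
        }
    }
}
```

Frames must be closed in the reverse order they were pushed, which try-with-resources gives for free within a single thread. The fallback branch keeps oversized requests correct at the cost of the original per-call allocation.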

This pull request has now been integrated.

Changeset: 8cc13045
Author:    Matthias Ernst <mernst-git...@mernst.org>
Committer: Jorn Vernee <jver...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/8cc13045428eebb8933df865f9a87f0f91909ba5
Stats:     488 lines in 7 files changed: 468 ins; 14 del; 6 mod

8287788: Implement a better allocator for downcalls

Reviewed-by: jvernee

-------------

PR: https://git.openjdk.org/jdk/pull/23142
