On Wed, 15 Jan 2025 21:39:05 GMT, Matthias Ernst <d...@openjdk.org> wrote:
> Certain signatures for foreign function calls (e.g. HVA return by value)
> require allocation of an intermediate buffer to adapt the FFM's to the native
> stub's calling convention. In the current implementation, this buffer is
> malloced and freed on every FFM invocation, a non-negligible overhead.
>
> Sample stack trace:
>
>    java.lang.Thread.State: RUNNABLE
>         at jdk.internal.misc.Unsafe.allocateMemory0(java.base@25-ea/Native Method)
>         ...
>         at jdk.internal.foreign.abi.SharedUtils.newBoundedArena(java.base@25-ea/SharedUtils.java:386)
>         at jdk.internal.foreign.abi.DowncallStub/0x000001f001084c00.invoke(java.base@25-ea/Unknown Source)
>         ...
>         at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@25-ea/Invokers$Holder)
>
> To alleviate this, this PR implements a per carrier-thread stacked allocator.
>
> Performance (MBA M3):
>
> Before:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   10   3.333 ± 0.152  ns/op
> CallOverheadByValue.byValue  avgt   10  33.892 ± 0.034  ns/op
>
> After:
> Benchmark                    Mode  Cnt   Score   Error  Units
> CallOverheadByValue.byPtr    avgt   30   3.311 ± 0.034  ns/op
> CallOverheadByValue.byValue  avgt   30   6.143 ± 0.053  ns/op
>
> `-prof gc` also shows that the new call path is fully scalar-replaced vs 160
> byte/call before.

This pull request has now been integrated.

Changeset: 8cc13045
Author:    Matthias Ernst <mernst-git...@mernst.org>
Committer: Jorn Vernee <jver...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/8cc13045428eebb8933df865f9a87f0f91909ba5
Stats:     488 lines in 7 files changed: 468 ins; 14 del; 6 mod

8287788: Implement a better allocator for downcalls

Reviewed-by: jvernee

-------------

PR: https://git.openjdk.org/jdk/pull/23142
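Editorial illustration of the "per carrier-thread stacked allocator" idea from the quoted description. This is a minimal sketch, not the code from the changeset. It assumes a plain `ThreadLocal` in place of a carrier-thread local and a direct `ByteBuffer` in place of native memory. The names `StackedBufferSketch`, `Frame`, `pushFrame`, and the 256-byte capacity are all hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a per-thread stacked (bump-pointer) allocator; not the JDK's code. */
final class StackedBufferSketch {
    // One reusable buffer per thread. A plain ThreadLocal stands in for a
    // carrier-thread local here (assumption made for illustration).
    private static final ThreadLocal<StackedBufferSketch> PER_THREAD =
            ThreadLocal.withInitial(() -> new StackedBufferSketch(256));

    private final ByteBuffer backing;
    private int top; // offset of the first free byte in the backing buffer

    StackedBufferSketch(int capacity) {
        this.backing = ByteBuffer.allocateDirect(capacity);
    }

    static StackedBufferSketch current() {
        return PER_THREAD.get();
    }

    int top() {
        return top;
    }

    /** Opens a frame; closing it releases everything allocated through it. */
    Frame pushFrame() {
        return new Frame(top);
    }

    final class Frame implements AutoCloseable {
        private final int savedTop;

        Frame(int savedTop) {
            this.savedTop = savedTop;
        }

        /**
         * Bump-allocates size bytes. The alignment must be a power of two and is
         * applied to the offset within the backing buffer, not to the absolute address.
         */
        ByteBuffer allocate(int size, int alignment) {
            int start = (top + alignment - 1) & -alignment;
            if (start + size > backing.capacity()) {
                // Request does not fit the reusable buffer: fall back to a fresh allocation.
                return ByteBuffer.allocateDirect(size);
            }
            top = start + size;
            ByteBuffer view = backing.duplicate();
            view.position(start);
            view.limit(start + size);
            return view.slice();
        }

        /** Pops the frame by restoring the stack top saved when it was opened. */
        @Override
        public void close() {
            top = savedTop;
        }
    }
}
```

In this sketch a call site would wrap each invocation in `try (StackedBufferSketch.Frame f = StackedBufferSketch.current().pushFrame()) { ... }`, so the common case reuses the same memory and only moves an offset. That is the property the quoted benchmark attributes to avoiding a malloc/free pair per call.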