As older JDK code is converted to the relatively new FFM API, system calls that can set `errno` and the like must explicitly allocate a `MemorySegment` to capture potential error states. Unless designed carefully, this can hurt performance, and it also introduces unnecessary code complexity.
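For illustration, here is a minimal sketch of the explicit-allocation pattern described above, written against the public FFM API. It is not code from this PR. It assumes JDK 22 or later and a POSIX platform where `close` can be found via the default lookup; the class name `ErrnoCaptureSketch` and the method `errnoAfterClose` are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical example class; assumes JDK 22+ and a POSIX libc.
public class ErrnoCaptureSketch {

    private static final Linker LINKER = Linker.nativeLinker();

    // Layout of the capture segment; its members are platform dependent.
    private static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();

    // Accessor for the "errno" member of the capture segment.
    private static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(PathElement.groupElement("errno"));

    // int close(int fd), linked so that errno is saved right after the call.
    // The capture option adds a leading MemorySegment parameter to the handle.
    private static final MethodHandle CLOSE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("close").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
            Linker.Option.captureCallState("errno"));

    // Returns the captured errno if close(fd) failed, or 0 if it succeeded.
    static int errnoAfterClose(int fd) {
        // A fresh arena and capture segment are allocated on every call.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment capture = arena.allocate(CAPTURE_LAYOUT);
            int result = (int) CLOSE.invoke(capture, fd);
            return result == -1 ? (int) ERRNO.get(capture, 0L) : 0;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        // Closing an invalid descriptor fails, so a non-zero errno is reported.
        System.out.println("errno after close(-1): " + errnoAfterClose(-1));
    }
}
```

Every invocation pays for creating an arena and allocating a capture segment, even when the call succeeds and the captured state is never read.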
Hence, this PR proposes to add a JDK-internal method handle adapter that can be used to handle system calls with `errno`, `GetLastError`, and `WSAGetLastError`. It relies on an efficient carrier-thread-local cache of memory regions to elide allocations.

Here are some benchmarks that ran on a platform thread and on virtual threads (the `OfVirtual` rows):

    Benchmark                                                  Mode  Cnt   Score   Error  Units
    CaptureStateUtilBench.OfVirtual.adaptedSysCallFail         avgt   30  24.193 ± 0.268  ns/op
    CaptureStateUtilBench.OfVirtual.adaptedSysCallSuccess      avgt   30   8.268 ± 0.080  ns/op
    CaptureStateUtilBench.OfVirtual.explicitAllocationFail     avgt   30  42.076 ± 1.003  ns/op
    CaptureStateUtilBench.OfVirtual.explicitAllocationSuccess  avgt   30  21.801 ± 0.138  ns/op
    CaptureStateUtilBench.OfVirtual.tlAllocationFail           avgt   30  23.265 ± 0.087  ns/op
    CaptureStateUtilBench.OfVirtual.tlAllocationSuccess        avgt   30   8.285 ± 0.155  ns/op
    CaptureStateUtilBench.adaptedSysCallFail                   avgt   30  23.033 ± 0.423  ns/op
    CaptureStateUtilBench.adaptedSysCallSuccess                avgt   30   3.676 ± 0.104  ns/op // <- Happy path using an internal pool
    CaptureStateUtilBench.explicitAllocationFail               avgt   30  42.023 ± 0.736  ns/op
    CaptureStateUtilBench.explicitAllocationSuccess            avgt   30  22.013 ± 0.648  ns/op // <- Allocating memory upon each invocation
    CaptureStateUtilBench.tlAllocationFail                     avgt   30  22.050 ± 0.233  ns/op
    CaptureStateUtilBench.tlAllocationSuccess                  avgt   30   3.756 ± 0.056  ns/op // <- Using the pool explicitly from Java code

Adapted system call:

    return (int) ADAPTED_HANDLE.invoke(0, 0); // Uses a MH-internal pool

Explicit allocation:

    try (var arena = Arena.ofConfined()) {
        return (int) HANDLE.invoke(arena.allocate(4), 0, 0);
    }

Thread Local allocation:

    try (var arena = POOLS.take()) {
        return (int) HANDLE.invoke(arena.allocate(4), 0, 0); // Uses a manually specified pool
    }

The adapted system call exhibits a ~6x performance improvement over the existing "explicit allocation" scheme for the happy path on platform threads.
On virtual threads the improvement is smaller ("only" ~2.5x faster), because virtual-thread-capable carrier threads require sharing across threads.

Tested and passed tiers 1-3.

-------------

Commit messages:
 - Bump copyright year
 - Add benchmarks
 - Add method handle adapter for system calls

Changes: https://git.openjdk.org/jdk/pull/23517/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=23517&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8347408
  Stats: 1381 lines in 11 files changed: 1370 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/23517.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/23517/head:pull/23517

PR: https://git.openjdk.org/jdk/pull/23517