On Mon, 13 Nov 2023 12:51:36 GMT, Jorn Vernee <jver...@openjdk.org> wrote:

>> Add the ability to pass heap segments to native code. This requires using
>> `Linker.Option.critical(true)` as a linker option. It has the same
>> limitations as normal critical calls, namely: upcalls into Java are not
>> allowed, and the native function should return relatively quickly. Heap
>> segments are exposed to native code through temporary native addresses that
>> are valid for the duration of the native call.
>>
>> The motivation for this is supporting existing Java array-based APIs that
>> might have to pass multi-megabyte size arrays to native code, and are
>> currently relying on Get-/ReleasePrimitiveArrayCritical from JNI, where
>> making a copy of the array would be prohibitive.
>>
>> Components of this patch:
>>
>> - New binding operator `SegmentBase`, which gets the base object of a
>>   `MemorySegment`.
>> - Rename `UnboxAddress` to `SegmentOffset`. Add flag to specify whether
>>   processing heap segments should be allowed.
>> - `CallArranger` impls use new binding operators when
>>   `Linker.Option.critical(/* allowHeap= */ true)` is specified.
>> - `NativeMethodHandle`/`NativeEntryPoint` allow `Object` in their signatures.
>> - The object/oop + offset is exposed as a temporary address to native code.
>> - Since we stay in the `_thread_in_Java` state, we can safely expose the
>>   oops passed to the downcall stub to native code, without needing GCLocker.
>>   These oops are valid until we poll for safepoint, which we never do
>>   (invoking pure native code).
>> - Only x64 and AArch64 for now.
>> - I've refactored `ArgumentShuffle` in the C++ code to no longer rely on
>>   callbacks to get the set of source and destination registers (using
>>   `CallingConventionClosure`), but instead just rely on 2 equal size arrays
>>   with source and destination registers. This allows filtering the input java
>>   registers before passing them to `ArgumentShuffle`, which is required to
>>   filter out registers holding segment offsets. Replacing placeholder
>>   registers is also done as a separate pre-processing step now. See changes
>>   in:
>>   https://github.com/openjdk/jdk/pull/16201/commits/d2b40f1117d63cc6d74e377bf88cdcf6d15ff866
>> - I've factored out `DowncallStubGenerator` in the x64 and AArch64 code to
>>   use a common `DowncallLinker::StubGenerator`.
>> - Fallback linker is also supported using JNI's
>>   `GetPrimitiveArrayCritical`/`ReleasePrimitiveArrayCritical`
>>
>> Aside: fixed existing issue with `DowncallLinker` not properly acquiring
>> segments in interpreted mode.
>>
>> Numbers for the included benchmark on my machine are:
>>
>>
>> Benchmar...
>
> Jorn Vernee has updated the pull request with a new target base due to a
> merge or a rebase. The pull request now contains 52 commits:
>
>  - Merge branch 'master' into AllowHeapNoLock
>  - fix type and reformat doc in Linker
>  - Merge branch 'master' into AllowHeapNoLock
>  - tweak whitespace
>  - a -> an
>  - add note to downcallHandle about passing heap segments by-reference
>  - Merge branch 'master' into AllowHeapNoLock
>  - bump up argument counts in TestLargeStub to their maximum
>  - s390 updates
>  - add stub size stress test for allowHeap
>  - ... and 42 more: https://git.openjdk.org/jdk/compare/03db8281...36da79d1

One additional comment on the pinning topic: we may even want to pin objects
across several downcalls. One downcall could be used to initiate async I/O,
and other downcalls could then check the result. The buffer must remain
stable in the time between them:

"The buffer area being written out must not be accessed during the operation
or undefined results may occur. The memory areas involved must remain valid."
https://man7.org/linux/man-pages/man3/aio_write.3.html

(Not sure whether we would use on-heap memory for that.)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16201#issuecomment-1810042909
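
For readers following the thread, here is a minimal sketch of the single-call case the quoted description covers: passing a heap segment to a critical downcall. It assumes JDK 22 or later, where `Linker.Option.critical(boolean)` is a final API, a 64-bit platform where `size_t` maps to `JAVA_LONG`, and that `strlen` is resolvable through the default lookup. The class and method names are hypothetical and are not taken from the patch.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical illustration class; not part of the PR.
public class HeapSegmentDowncall {
    private static final Linker LINKER = Linker.nativeLinker();

    // size_t strlen(const char*), linked as a critical call that also
    // accepts heap segments (allowHeapAccess = true).
    private static final MethodHandle STRLEN = LINKER.downcallHandle(
            LINKER.defaultLookup().find("strlen").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS),
            Linker.Option.critical(true));

    // Passes the array to native code without copying it off-heap.
    // The bytes must contain a terminating 0, since strlen scans for one.
    public static long nativeStrlen(byte[] nullTerminated) {
        try {
            // The heap segment is only addressable by native code for the
            // duration of this one call.
            return (long) STRLEN.invokeExact(MemorySegment.ofArray(nullTerminated));
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(nativeStrlen(new byte[] {'h', 'i', 0}));
    }
}
```

Because the temporary address is valid only for the duration of one call, this sketch does not cover the multi-call scenario raised above: a buffer handed to `aio_write` would need to stay at a stable address after the initiating downcall returns.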