On Tue, 27 Aug 2024 10:38:46 GMT, Per Minborg <pminb...@openjdk.org> wrote:

>> The performance of `MemorySegment::fill` can be improved by replacing the
>> `checkAccess()` method call with calling `checkReadOnly()` instead (as the
>> bounds of the segment itself do not need to be checked).
>>
>> Also, smaller segments can be handled directly by Java code rather than
>> transitioning to native code.
>>
>> Here is how the `MemorySegment::fill` performance is improved by this PR:
>>
>> Operations involving 8 or more bytes are delegated to native code whereas
>> smaller segments are handled via a switch rake.
>>
>> It should be noted that `Arena::allocate` is using `MemorySegment::fill`.
>> Hence, this PR will also have a positive effect on memory allocation
>> performance.
>
> Per Minborg has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains six additional
> commits since the last revision:
>
>  - Merge branch 'master' into fill-performance
>  - Fix typo
>  - Add a comment about the old switch type
>  - Remove unused import
>  - Reduce kick-in size and add test
>  - Initial implementation

How fast do we need to be here, given that we are measuring a few nanoseconds per operation?

If the goal is simply not to regress compared with explicitly filling a small segment or a comparable array (e.g., < 8 bytes), then maybe a loop suffices and the code stays simple?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2313446118
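
For illustration only, the loop idea might look roughly like the sketch below. It is written against the public `java.lang.foreign` API (final as of JDK 22) rather than the JDK internals the PR actually touches; the class name `SmallFillSketch`, the helper `fill`, and the constant `LOOP_LIMIT` are hypothetical, and the cut-off of 8 is taken only from the "8 or more bytes" wording in the quoted description.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SmallFillSketch {
    // Hypothetical cut-off, assumed from the "8 or more bytes" wording in the PR text.
    static final long LOOP_LIMIT = 8;

    // Fills small segments with a plain byte loop and hands larger ones
    // to the existing MemorySegment::fill.
    static MemorySegment fill(MemorySegment segment, byte value) {
        long size = segment.byteSize();
        if (size < LOOP_LIMIT) {
            for (long i = 0; i < size; i++) {
                segment.set(ValueLayout.JAVA_BYTE, i, value);
            }
            return segment;
        }
        return segment.fill(value);
    }

    public static void main(String[] args) {
        // Heap segments keep the example free of native allocation.
        MemorySegment small = MemorySegment.ofArray(new byte[5]);
        MemorySegment large = MemorySegment.ofArray(new byte[64]);
        fill(small, (byte) 0x7F);
        fill(large, (byte) 0x7F);
        System.out.println(small.get(ValueLayout.JAVA_BYTE, 4));
        System.out.println(large.get(ValueLayout.JAVA_BYTE, 63));
    }
}
```

Whether such a loop keeps up with the switch-based version for sizes below 8 bytes would have to be confirmed with the PR's benchmark; the sketch only shows how little code the simpler variant needs.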