On Thu, 26 Oct 2023 09:17:25 GMT, Per Minborg <pminb...@openjdk.org> wrote:
>> This PR proposes removing the restriction that only heap `MemorySegment`
>> wrapping a `byte` array can be accessed by Vectors. Now any array type can
>> be used provided the element alignment constraints are respected.
>
> Per Minborg has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Allow unaligned array access

Here is a test showing that performance needs to improve for certain combinations of load operations. The same is likely true for store operations.

Benchmark                                                   (size)  Mode  Cnt     Score     Error  Units
TestLoadSegmentVarious.byteVectorFromByteBackedSegment        1024  avgt   10   280.008 ±   7.251  ns/op
TestLoadSegmentVarious.byteVectorFromDoubleBackedSegment      1024  avgt   10  1304.008 ±  98.901  ns/op
TestLoadSegmentVarious.byteVectorFromIntBackedSegment         1024  avgt   10  1279.621 ± 100.008  ns/op
TestLoadSegmentVarious.doubleVectorFromByteBackedSegment      1024  avgt   10    37.281 ±   1.360  ns/op
TestLoadSegmentVarious.doubleVectorFromDoubleBackedSegment    1024  avgt   10    36.847 ±   0.130  ns/op
TestLoadSegmentVarious.doubleVectorFromIntBackedSegment       1024  avgt   10   194.195 ±  31.096  ns/op
TestLoadSegmentVarious.intVectorFromByteBackedSegment         1024  avgt   10    72.602 ±   1.768  ns/op
TestLoadSegmentVarious.intVectorFromDoubleBackedSegment       1024  avgt   10   166.851 ±   9.528  ns/op
TestLoadSegmentVarious.intVectorFromIntBackedSegment          1024  avgt   10    71.283 ±   0.507  ns/op
TestLoadSegmentVarious.scalarByteVectorFromByteSegment        1024  avgt   10  4790.084 ±  45.882  ns/op
TestLoadSegmentVarious.scalarByteVectorFromDoubleSegment      1024  avgt   10  4841.273 ± 291.962  ns/op
TestLoadSegmentVarious.scalarByteVectorFromIntSegment         1024  avgt   10  4794.028 ± 101.282  ns/op
TestLoadSegmentVarious.scalarDoubleVectorFromByteSegment      1024  avgt   10  1241.117 ±  11.603  ns/op
TestLoadSegmentVarious.scalarDoubleVectorFromDoubleSegment    1024  avgt   10  1245.752 ±  15.516  ns/op
TestLoadSegmentVarious.scalarDoubleVectorFromIntSegment       1024  avgt   10  1232.216 ±   8.365  ns/op
TestLoadSegmentVarious.scalarIntVectorFromByteSegment         1024  avgt   10  1239.146 ±  14.582  ns/op
TestLoadSegmentVarious.scalarIntVectorFromDoubleSegment       1024  avgt   10  1236.712 ±   8.063  ns/op
TestLoadSegmentVarious.scalarIntVectorFromIntSegment          1024  avgt   10  1228.656 ±   3.329  ns/op

As can be seen, vector performance for `IntVector` and `DoubleVector` is good when loading from a segment backed by a `byte` array or by an array whose element type matches the vector type. The other combinations are noticeably slower, although still faster than the scalar variants. Improving them can be done under a separate issue.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16360#issuecomment-1785186113
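
To put the gap in numbers, the scores quoted in the benchmark table above can be turned into slowdown ratios against the matching-type baseline. The following is only a small arithmetic sketch: the class name `SlowdownSketch` and its methods are made up for illustration and are not part of the PR or of the benchmark, and the ratios ignore the reported error margins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: derives slowdown ratios
// from the ns/op scores reported in the benchmark table.
public class SlowdownSketch {

    /** Ratio of a mismatched-load score to its matching-type baseline (both in ns/op). */
    static double slowdown(double mismatchedNsPerOp, double baselineNsPerOp) {
        return mismatchedNsPerOp / baselineNsPerOp;
    }

    /** Slowdowns for the vector loads whose backing array type differs from the vector type. */
    static Map<String, Double> slowdowns() {
        // Baselines: loads where the backing array type matches the vector type.
        double byteFromByte = 280.008;
        double doubleFromDouble = 36.847;
        double intFromInt = 71.283;

        Map<String, Double> result = new LinkedHashMap<>();
        result.put("byteVectorFromDoubleBackedSegment", slowdown(1304.008, byteFromByte));
        result.put("byteVectorFromIntBackedSegment", slowdown(1279.621, byteFromByte));
        result.put("doubleVectorFromIntBackedSegment", slowdown(194.195, doubleFromDouble));
        result.put("intVectorFromDoubleBackedSegment", slowdown(166.851, intFromInt));
        return result;
    }

    public static void main(String[] args) {
        // Print each mismatched combination with its slowdown factor.
        slowdowns().forEach((name, ratio) ->
                System.out.printf("%-36s %.2fx slower than matching type%n", name, ratio));
    }
}
```

By this measure the mismatched loads are roughly 2.3x to 5.3x slower than their matching-type counterparts, which is the gap a follow-up issue would need to close.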