Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Maurizio Cimadamore
On Fri, 30 Aug 2024 15:31:26 GMT, Francesco Nigro wrote: > good point: relative to the baseline, nope, cause the new version improves > regardless, even when the new version got high branch misses My feeling is that the intrinsic we have under the hood must be doing some similar branching to

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Maurizio Cimadamore
On Fri, 30 Aug 2024 12:15:36 GMT, Per Minborg wrote: >> @minborg Hi! I didn't check the numbers with the benchmark I've written at >> https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685 which is >> meant to stress the branch predictor (without enough `samples` i.e. past >> 128K

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Francesco Nigro
On Fri, 30 Aug 2024 15:21:52 GMT, Maurizio Cimadamore wrote: > in this case, we can't optimize as well, because we have different branches > which get taken or not in a less predictable fashion. Exactly - It has been designed to show the case when the conditions materialize (because are take

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Maurizio Cimadamore
On Fri, 30 Aug 2024 12:15:36 GMT, Per Minborg wrote: >> @minborg Hi! I didn't check the numbers with the benchmark I've written at >> https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685 which is >> meant to stress the branch predictor (without enough `samples` i.e. past >> 128K

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Francesco Nigro
On Fri, 30 Aug 2024 12:15:36 GMT, Per Minborg wrote: >> @minborg Hi! I didn't check the numbers with the benchmark I've written at >> https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685 which is >> meant to stress the branch predictor (without enough `samples` i.e. past >> 128K

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-30 Thread Per Minborg
On Wed, 28 Aug 2024 15:32:40 GMT, Francesco Nigro wrote: >>> How fast do we need to be here given we are measuring in a few nanoseconds >>> per operation? >>> >>> What if the goal is not to regress from say explicitly filling in a small >>> sized segment or a comparable array (e.g., < 8 bytes)

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-28 Thread Francesco Nigro
On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg wrote: >> How fast do we need to be here given we are measuring in a few nanoseconds >> per operation? >> >> What if the goal is not to regress from say explicitly filling in a small >> sized segment or a comparable array (e.g., < 8 bytes) then ma

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-28 Thread Maurizio Cimadamore
On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg wrote: > then maybe a loop suffices and the code is simple? We have tried a loop, but sadly the performance is not great if the number of iterations is small. This is due to the fact that long loops are split into two loops, an outer and an inner,

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-28 Thread Maurizio Cimadamore
On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg wrote: > How fast do we need to be here given we are measuring in a few nanoseconds > per operation? The goal here is to be "competitive" with array bulk operations (given arrays do have bound checks as well) across all the segment size spectrum. I

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-28 Thread Per Minborg
On Tue, 27 Aug 2024 20:25:46 GMT, Paul Sandoz wrote: > How fast do we need to be here given we are measuring in a few nanoseconds > per operation? > > What if the goal is not to regress from say explicitly filling in a small > sized segment or a comparable array (e.g., < 8 bytes) then maybe a

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-27 Thread Paul Sandoz
On Tue, 27 Aug 2024 10:38:46 GMT, Per Minborg wrote: >> The performance of the `MemorySegment::fill` can be improved by replacing the >> `checkAccess()` method call with calling `checkReadOnly()` instead (as the >> bounds of the segment itself do not need to be checked). >> >> Also, smaller seg

Re: RFR: 8338967: Improve performance for MemorySegment::fill [v5]

2024-08-27 Thread Per Minborg
> The performance of the `MemorySegment::fill` can be improved by replacing the > `checkAccess()` method call with calling `checkReadOnly()` instead (as the > bounds of the segment itself do not need to be checked). > > Also, smaller segments can be handled directly by Java code rather than > tr