On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg <pminb...@openjdk.org> wrote:

>> How fast do we need to be here given we are measuring in a few nanoseconds 
>> per operation? 
>> 
>> What if the goal is not to regress from say explicitly filling in a small 
>> sized segment or a comparable array (e.g., < 8 bytes) then maybe a loop 
>> suffices and the code is simple?
> 
> Fair question. I have another version (called "patch bits" below) that is 
> based on bit logic (first doing int ops, then short and lastly byte, similar 
> to `ArraySupport::vectorizedMismatch`). This has slightly worse performance 
> but is more scalable and perhaps simpler.
> 
> ![image](https://github.com/user-attachments/assets/292c75aa-0df8-4bb7-b45f-426d0f8470d9)

@minborg Hi! I haven't checked the numbers with the benchmark I wrote at 
https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685, which is meant 
to stress the branch predictor (with enough `samples`, i.e. past 128K on my 
machine). Can you give it a shot on M1 🙏?
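
For anyone following along, the "patch bits" idea quoted above (int first, then short, then byte, chosen by the bits of the length) might look roughly like the sketch below. This is not the code from the PR: `PatchBitsFill` and `fillSmall` are hypothetical names, and it assumes a heap `byte[]` accessed through `ByteBuffer` rather than a `MemorySegment`, purely to keep it self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not taken from the PR.
final class PatchBitsFill {

    // Fills dst[offset, offset + len) with value, assuming 0 <= len < 8.
    // Each set bit of len selects exactly one store of that width,
    // so there is no loop and at most three stores.
    static void fillSmall(byte[] dst, int offset, int len, byte value) {
        ByteBuffer bb = ByteBuffer.wrap(dst).order(ByteOrder.nativeOrder());
        // Broadcast the byte into all four lanes of an int (wraps for values >= 0x80).
        int pattern = (value & 0xFF) * 0x01010101;
        int pos = offset;
        if ((len & 4) != 0) {            // bit 2 set: one 4-byte store
            bb.putInt(pos, pattern);
            pos += 4;
        }
        if ((len & 2) != 0) {            // bit 1 set: one 2-byte store
            bb.putShort(pos, (short) pattern);
            pos += 2;
        }
        if ((len & 1) != 0) {            // bit 0 set: one 1-byte store
            bb.put(pos, value);
        }
    }
}
```

The three tests depend only on the length, so how well they predict depends on how the lengths vary from call to call, which is what the benchmark above tries to exercise.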

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2315685287
