On Fri, 30 Aug 2024 10:51:59 GMT, Per Minborg <pminb...@openjdk.org> wrote:

>> The performance of `MemorySegment::fill` can be improved by replacing the 
>> `checkAccess()` method call with calling `checkReadOnly()` instead (as the 
>> bounds of the segment itself do not need to be checked).
>> 
>> Also, smaller segments can be handled directly by Java code rather than 
>> transitioning to native code.
>> 
>> Here is how the `MemorySegment::fill` performance is improved by this PR:
>> 
>> ![image](https://github.com/user-attachments/assets/ee29fdf0-a7cf-4d5b-bb6b-278b01d97e3c)
>> 
>> Operations involving 8 or more bytes are delegated to native code whereas 
>> smaller segments are handled via a switch rake.
>> 
>> It should be noted that `Arena::allocate` is using `MemorySegment::fill`. 
>> Hence, this PR will also have a positive effect on memory allocation 
>> performance.
>
> Per Minborg has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Revert copyright year
>  - Move logic back to AMSI

It is a good analysis. Effectively, even fill will likely have to handle the 
head/tail for the remainder bytes, and this will eventually lead to more or 
less branchy code: a tight loop, a series of ifs with byte-by-byte writes 
(7 ifs), or the way it is handled in this PR.
All of these strategies are better than what we have now, probably because the 
existing intrinsic still makes some poor decisions, but I haven't yet dug into 
the perfasm output to see what it does wrong; maybe it is something that could 
be fixed in the intrinsic itself?
That said, it could be interesting to check the 3 approaches I have mentioned 
against both predictable and unpredictable workloads. I see pros and cons in 
all of them, TBH, although just as an academic exercise.
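
To make the comparison concrete, here is a rough sketch of the three tail-handling shapes on a plain `byte[]`. This is not the code in the PR: the class name `TailFillSketch` and the method names are made up for illustration, and all three methods assume a tail length between 0 and 7 inside the array bounds.

```java
// Hypothetical illustration only; not the implementation used in the PR.
final class TailFillSketch {

    // Strategy 1: a tight loop over the remaining bytes.
    static void fillLoop(byte[] dst, int offset, int length, byte value) {
        for (int i = 0; i < length; i++) {
            dst[offset + i] = value;
        }
    }

    // Strategy 2: a series of ifs, one byte per test (7 ifs).
    // Assumes 0 <= length <= 7; anything beyond 7 bytes is not written.
    static void fillIfs(byte[] dst, int offset, int length, byte value) {
        if (length > 0) dst[offset] = value;
        if (length > 1) dst[offset + 1] = value;
        if (length > 2) dst[offset + 2] = value;
        if (length > 3) dst[offset + 3] = value;
        if (length > 4) dst[offset + 4] = value;
        if (length > 5) dst[offset + 5] = value;
        if (length > 6) dst[offset + 6] = value;
    }

    // Strategy 3: a switch on the length with intentional fall-through,
    // so a single dispatch is followed by straight-line stores.
    static void fillSwitch(byte[] dst, int offset, int length, byte value) {
        switch (length) {
            case 7: dst[offset + 6] = value; // fall through
            case 6: dst[offset + 5] = value; // fall through
            case 5: dst[offset + 4] = value; // fall through
            case 4: dst[offset + 3] = value; // fall through
            case 3: dst[offset + 2] = value; // fall through
            case 2: dst[offset + 1] = value; // fall through
            case 1: dst[offset] = value;     // fall through
            case 0: break;
            default:
                throw new IllegalArgumentException("length must be in [0, 7]: " + length);
        }
    }
}
```

All three write the same bytes; they differ only in how the branches are laid out, which is what a benchmark over predictable and unpredictable lengths would exercise.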

One qq: reading https://bugs.openjdk.org/browse/JDK-8139457, it appears to me 
that via some unsafe mechanism we could avoid being branchy.
If a single byte[] still needs to be 8-byte (or 16-byte?) aligned, could we 
just use a long and write past the end of the array? Is that a safe assumption?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2322414069
