On Fri, 27 Sep 2024 14:44:35 GMT, Scott Gibbons <[email protected]> wrote:
>> I like to have the functional connection: if - for whatever reason - the
>> array base offset is smaller than 16, we need to deal with that. The reason
>> for this happens to be `UseCompactObjectHeaders`, but that may not be clear
>> to the reader of the code. I could add an `assert(UseCompactObjectHeaders`
>> in that branch to make that connection clear. Also consider that
>> `UseCompactObjectHeaders` is intended to go away at some point.
>>
>> I wonder if having 2 or 3 branches ahead of the main-loop (which probably
>> doesn't do much, because haystack is <=32 bytes) is a useful approach, or if
>> there may be a better way to get the bytes on the stack? I don't know enough
>> about the implementation to make that judgement.
>
> I believe the code in the patch is good enough as-is, especially if
> `UseCompactObjectHeaders` is slated to go away. The existing `if` will
> prevent the < 16 byte header code from being emitted, which is the desired
> behavior - i.e., if the header size is >= 16, there will be no code emitted
> to the intrinsic for that block. So there will not be an additional branch
> for the code when it is executed.
>
> I'm good with a comment tying `UseCompactObjectHeaders` to the condition.
> The comment can be removed when the flag is removed. "Ship it" :-)
Wait a second, I've probably not been clear. `UseCompactObjectHeaders` is
slated to become *on by default* and then slated to go away. That means that
array base offets <= 16 bytes will become the default. The generated code will
be something like:
if (haystack_len <= 8) {
// Copy 8 bytes onto stack
} else if (haystack_len <= 16) {
// Copy 16 bytes onto stack
} else {
// Copy 32 bytes onto stack
}
So that is 2 branches in this prologue code instead of originally 1.
However, I just noticed that what I proposed is not enough. Consider what
happens when haystack_len is 17. This would take the last case and copy 32
bytes. But we only have 17+8=25 bytes that we can guarantee to be available for
copying. If this happens to be the array at the very beginning of the heap
(very rare/unlikely), this would segfault.
I think I need to mull over it some more to come up with a correct fix.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20677#discussion_r1778874906