On Tue, 14 Feb 2023 15:19:34 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> Why? There is no performance difference and the intent is clear. Is this
>> just a "style" thing?
>
> I think with `lessEqual` we'll jump to `L_tailProc` for the final 32-byte
> chunk in inputs that are divisible by 32 (starting from 64), no? Using `less`
> avoids that, while not affecting performance of any other inputs.

As Claes mentioned, this would allow us to do one more iteration of the vector loop.

-------------

PR: https://git.openjdk.org/jdk/pull/12126
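
To illustrate the `less` vs. `lessEqual` point from the quoted discussion, here is a small Java model of the branch. It is only a sketch under assumptions: the do/while shape (process one 32-byte chunk, then compare the remaining length against 32), the class name `TailBranchSketch`, and the method `simulate` are all hypothetical and are not taken from the stub code in the PR. It assumes the input is at least 32 bytes long on entry.

```java
public class TailBranchSketch {
    // Assumed vector chunk size, taken from the "32-byte chunk" in the discussion.
    static final int CHUNK = 32;

    /**
     * Hypothetical model of the loop exit. Returns {vectorIterations, bytesLeftForTail}.
     * useLessEqual = true  models "jump to the tail if remaining <= 32".
     * useLessEqual = false models "jump to the tail if remaining <  32".
     * Assumes length >= CHUNK.
     */
    static int[] simulate(int length, boolean useLessEqual) {
        int remaining = length;
        int iterations = 0;
        do {
            remaining -= CHUNK;   // one vector iteration consumes a full chunk
            iterations++;
            // Keep looping only while the "jump to tail" condition is false.
        } while (useLessEqual ? remaining > CHUNK : remaining >= CHUNK);
        return new int[] { iterations, remaining };
    }

    public static void main(String[] args) {
        for (int len : new int[] { 32, 64, 70, 96, 128 }) {
            int[] le = simulate(len, true);
            int[] lt = simulate(len, false);
            System.out.printf("len=%d lessEqual: iters=%d tail=%d | less: iters=%d tail=%d%n",
                    len, le[0], le[1], lt[0], lt[1]);
        }
    }
}
```

In this model, lengths that are multiples of 32 starting from 64 leave a full 32-byte chunk for the tail under `lessEqual`, but are consumed entirely by the vector loop under `less`. Other lengths (for example 70) behave the same either way.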