On Fri, 30 Aug 2024 22:04:39 GMT, Francesco Nigro <d...@openjdk.org> wrote:

> All of these strategies are better than what we have now, probably because 
> the existing intrinsics still make some poor decisions, but I haven't yet 
> dug into the perfasm output to see what they do wrong; maybe it is something 
> which could be fixed in the intrinsic itself?

I'm no intrinsics expert, but if I had to guess I'd say that the intrinsics we 
have do not specialize for small sizes. Also, the use of vector instructions 
typically comes with additional alignment constraints - meaning that we need a 
pre-loop (and sometimes a post-loop). This logic, while faster for bigger 
sizes, has some drawbacks for smaller sizes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20712#issuecomment-2324276883
