On Fri, 22 May 2026 16:37:07 GMT, Paul Sandoz <[email protected]> wrote:

>> Xueming Shen has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Update sleef.md
>
> test/hotspot/jtreg/compiler/vectorapi/TestVectorLibrarySleefUnaryOpAndBinaryOp.java
>  line 56:
> 
>> 54:  *           (os.arch == "riscv64" & vm.cpu.features ~= ".*rvv.*")
>> 55:  * @summary VectorAPI: SLEEF unary and binary math library operations 
>> should be intrinsified.
>> 56:  *          This test would be run on SVML/SLEEF supported platforms 
>> only.
> 
> Suggestion:
> 
>  *          This test is run on SVML/SLEEF supported platforms only.

Yes, the performance concern still exists with SLEEF 3.9.0, but the result is 
data-set sensitive.

I wrote up the details here:
https://cr.openjdk.org/~sherman/8376602-tanh/

Short version: the slowdown still reproduces for the saturation-heavy input 
shape used by the earlier Vector API/SLEEF performance work. For example, with 
the original `2*i` style input on macOS aarch64, `Double128` is still slower 
through SLEEF 3.9.0:

- `loadStoreDouble128Lworld`: fallback `4.054 us/op`, SLEEF `5.529 us/op`
- `opOnlyDouble128Lworld`: fallback `3.159 us/op`, SLEEF `5.854 us/op`
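
As a quick reading aid for the two results above, here is a minimal sketch that turns them into slowdown ratios. The class and method names are hypothetical; only the four timings come from the measurements quoted here.

```java
public class TanhSlowdown {
    // Ratio of SLEEF time to fallback time; a value above 1.0 means SLEEF is slower.
    static double slowdown(double fallbackUsPerOp, double sleefUsPerOp) {
        return sleefUsPerOp / fallbackUsPerOp;
    }

    public static void main(String[] args) {
        // Timings in us/op, copied from the two Double128 results listed above.
        System.out.printf("loadStoreDouble128Lworld: %.2fx%n", slowdown(4.054, 5.529));
        System.out.printf("opOnlyDouble128Lworld:    %.2fx%n", slowdown(3.159, 5.854));
    }
}
```

That works out to roughly 1.36x and 1.85x for this input shape.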

However, this is not a general “SLEEF TANH is slower” result. For active input 
ranges such as `[-1, 1]`, `[-3, 3]`, `[-9, 9]`, and `[-20, 20]`, SLEEF 3.9.0 is 
faster in the same measurements.

The report includes the background from the original SLEEF performance 
discussion, the new benchmark data, additional input shapes, and the likely 
reason for the slowdown: the Java fallback reaches a cheap saturated 
`Math.tanh` path for large inputs, while SLEEF appears to run most of the 
vector computation before selecting the saturated result.
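
To illustrate why the input shape matters, the following rough sketch counts how many elements of a `2*i` style array fall in the saturated region, compared with an input confined to `[-20, 20]`. The array length of 1024, the helper names, and the cutoff of 22 are assumptions for illustration: 22 is the point where the fdlibm-style `StrictMath.tanh` returns exactly ±1.0, and the actual fallback path or the benchmark setup may differ.

```java
public class TanhInputShapes {
    // Assumption: fdlibm-style cutoff, where tanh returns +/-1.0 once |x| >= 22.
    static final double SATURATION_CUTOFF = 22.0;

    // Hypothetical stand-in for the "2*i" benchmark input: 0, 2, 4, ...
    static double[] twoTimesIndex(int n) {
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            a[i] = 2.0 * i;
        }
        return a;
    }

    // Hypothetical evenly spaced input covering [-bound, bound]; requires n >= 2.
    static double[] symmetricRange(int n, double bound) {
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            a[i] = -bound + (2.0 * bound * i) / (n - 1);
        }
        return a;
    }

    // Fraction of elements whose magnitude is at or beyond the cutoff.
    static double saturatedFraction(double[] in) {
        int count = 0;
        for (double x : in) {
            if (Math.abs(x) >= SATURATION_CUTOFF) {
                count++;
            }
        }
        return (double) count / in.length;
    }

    public static void main(String[] args) {
        System.out.println("2*i input:      " + saturatedFraction(twoTimesIndex(1024)));
        System.out.println("[-20, 20] input: " + saturatedFraction(symmetricRange(1024, 20.0)));
    }
}
```

Under these assumptions, nearly every element of the `2*i` array is saturated (all but the first 11 of 1024), while none of the `[-20, 20]` elements are, which is consistent with the fallback winning only on the saturation-heavy shape.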

So I kept TANH excluded from the SLEEF-only IR test because the current 
implementation still intentionally rejects it, and the historical performance 
concern still reproduces for that saturated workload. Whether TANH should or 
could be enabled for SLEEF can be evaluated separately in a follow-up PR.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29703#discussion_r3293344664
