On Mon, 31 Oct 2022 13:35:36 GMT, Quan Anh Mai <qa...@openjdk.org> wrote:

>> But doing it forward requires a `reduceLanes` on each iteration. It's faster 
>> to do it backward.
>
> No, you don't need to; the vector loop can be calculated as:
> 
>     IntVector accumulation = IntVector.zero(INT_SPECIES);
>     for (int i = 0; i < bound; i += INT_SPECIES.length()) {
>         IntVector current = IntVector.load(INT_SPECIES, array, i);
>         accumulation = accumulation.mul(31**(INT_SPECIES.length())).add(current);
>     }
>     return accumulation.mul(IntVector.of(31**(INT_SPECIES.length() - 1), ..., 31**2, 31, 1)).reduce(ADD);
> 
> Each iteration only requires a multiplication and an addition. The weight of 
> lanes can be calculated just before the reduction operation.

Ok, I can try rewriting as @merykitty suggests and compare. I'm running out of 
time to spend on this right now, though, so I sort of hope we can do this 
experiment as a follow-up RFE.
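
For reference while comparing, here is a rough scalar model of the lane-wise scheme quoted above. It is only a sketch: it uses a plain `int[]` in place of an `IntVector`, the class name `LaneHashSketch`, the constant `LANES` (standing in for `INT_SPECIES.length()`), and the helpers `scalarHash`, `pow31` and `laneHash` are all made up for illustration, and it says nothing about performance. It only checks that one multiply and one add per iteration, with the lane weights applied once before the final reduction, yields the same value as the usual forward 31-based polynomial.

```java
class LaneHashSketch {
    // Hypothetical stand-in for INT_SPECIES.length().
    static final int LANES = 4;

    // Reference: forward Horner evaluation, i.e. the sum of a[i] * 31^(n-1-i)
    // in wrapping int arithmetic (no initial value of 1, unlike Arrays.hashCode).
    static int scalarHash(int[] a) {
        int h = 0;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // 31^e in wrapping int arithmetic.
    static int pow31(int e) {
        int p = 1;
        for (int k = 0; k < e; k++) {
            p *= 31;
        }
        return p;
    }

    // Lane-wise model of the quoted loop: each "lane" keeps its own accumulator.
    static int laneHash(int[] a) {
        int bound = a.length - a.length % LANES;
        int step = pow31(LANES);          // 31^LANES, the per-iteration multiplier
        int[] acc = new int[LANES];       // models IntVector.zero(...)
        for (int i = 0; i < bound; i += LANES) {
            for (int l = 0; l < LANES; l++) {
                acc[l] = acc[l] * step + a[i + l];   // one mul and one add per lane
            }
        }
        // Apply the lane weights 31^(LANES-1), ..., 31, 1 once, then reduce by addition.
        int h = 0;
        for (int l = 0; l < LANES; l++) {
            h += acc[l] * pow31(LANES - 1 - l);
        }
        // Elements past the last full block are folded in with the scalar recurrence.
        for (int i = bound; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        System.out.println(laneHash(data) == scalarHash(data));
    }
}
```

Since `Arrays.hashCode(int[])` starts from 1 rather than 0, its result for an array of length n should differ from this sum by exactly 31^n, which a real implementation would have to add back in.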

-------------

PR: https://git.openjdk.org/jdk/pull/10847
