On Fri, 5 Jan 2024 09:45:11 GMT, Emanuel Peter <epe...@openjdk.org> wrote:

> You are using `VectorMask<Integer> pred = VectorMask.fromLong(ispecies, 
> maskctr++);`. That basically systematically iterates over all masks, which is 
> nice for a correctness test. But that would use different density inside one 
> test run, right? The average over the loop is still at `50%`, correct?
> 
> I was thinking more a run where the percentage over the whole loop is lower 
> than maybe `1%`. That would get us to a point where maybe the branch 
> prediction of non-vectorized code might be faster, what do you think?

A scalar (imperative) compression loop checks every mask bit to select the 
lanes to compress. Masks with a low or a high density of set bits should 
therefore show similar performance.
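
To illustrate, here is a minimal sketch of such a scalar loop. It is not the 
code from the PR: the class `ScalarCompress` and the method `compress` are 
hypothetical names, and plain `int[]` arrays stand in for vector lanes. The 
loop tests every mask bit, so the trip count is the same for a sparse and a 
dense mask.

```java
import java.util.Arrays;

final class ScalarCompress {
    // Hypothetical helper, not taken from the PR: copies the lanes of src whose
    // mask bit is set to the front of the result, preserving their order.
    // Assumes src.length <= 64 so that every lane has a bit in the long mask.
    static int[] compress(int[] src, long mask) {
        int[] dst = new int[src.length]; // unselected tail lanes stay zero
        int j = 0;
        for (int i = 0; i < src.length; i++) {
            // Every mask bit is tested, whether it is set or not.
            if ((mask & (1L << i)) != 0) {
                dst[j++] = src[i];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] lanes = {10, 20, 30, 40, 50, 60, 70, 80};
        // Sparse mask: only lane 0 is selected.
        System.out.println(Arrays.toString(compress(lanes, 0b0000_0001L)));
        // Denser mask: lanes 1, 3, 5 and 7 are selected.
        System.out.println(Arrays.toString(compress(lanes, 0b1010_1010L)));
    }
}
```

The sketch only shows that the amount of per-lane work does not depend on the 
mask density. It does not measure branch prediction effects, which would need 
a benchmark run with a fixed low-density mask.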

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17261#discussion_r1444196848
