On 26/03/2013 09:14, Peter Lieven wrote:
> If no one objects I would use is_zero_page_2 and continue with v5 of
> the patch set, as I am out of the office for the next 8 days starting
> tomorrow. I prefer v3 as it has better performance if the non-zeroness
> is within the 8*sizeof(VECTYPE) bytes and not in the first 256 bits.

Either v2 or v3 is fine.  v3 has slightly simpler code and v2 optimizes
for a rare case, but v2 is indeed a bit faster and your benchmarking
effort should be rewarded. :)
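
For readers following along, here is a rough sketch of the kind of
trade-off being weighed: a cheap look at the first 256 bits before an
unrolled loop that ORs a whole block together. It is not the code from
either patch version. The name is_zero_range_sketch, the use of plain
uint64_t words instead of VECTYPE, and the 8-word block size are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_WORDS 8 /* words ORed together per loop iteration (assumed) */

/* Hypothetical helper, not the function from the patch set. */
static bool is_zero_range_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t w[SKETCH_WORDS];
    size_t off, i;

    assert(buf != NULL);
    assert(len % sizeof(w) == 0); /* whole blocks only, like a page */

    if (len == 0) {
        return true;
    }

    /* Early exit: inspect only the first 4 words (256 bits). */
    memcpy(w, p, 4 * sizeof(w[0]));
    if (w[0] | w[1] | w[2] | w[3]) {
        return false;
    }

    /* Main loop: OR a full block together, then test once per block. */
    for (off = 0; off < len; off += sizeof(w)) {
        uint64_t acc = 0;

        memcpy(w, p + off, sizeof(w));
        for (i = 0; i < SKETCH_WORDS; i++) {
            acc |= w[i];
        }
        if (acc) {
            return false;
        }
    }
    return true;
}
```

The early test costs a little on pages that really are zero, but lets
pages with data near the start bail out before the loop is set up.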

> Paolo, with the version that has lower setup costs in mind shall I
> use the vectorized or the unrolled version of patch 4 (find_next_bit
> optimization)?

I think for that we should, at least for now, use the version we
discussed a few weeks ago (with no SIMD and just unrolling).
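
To make "no SIMD and just unrolling" concrete, a minimal sketch could
look like the following. It is an illustration only, not the patch
under review: find_next_bit_sketch, SKETCH_BITS_PER_LONG and the unroll
factor of four are assumptions, and the real helper may differ in
signature and details.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: return the index of the first set bit at or
 * after 'offset', or 'size' if there is none. 'size' is in bits.
 */
static size_t find_next_bit_sketch(const unsigned long *addr,
                                   size_t size, size_t offset)
{
    size_t nwords, word, bit;
    unsigned long tmp;

    assert(addr != NULL);
    if (offset >= size) {
        return size;
    }

    nwords = (size + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
    word = offset / SKETCH_BITS_PER_LONG;

    /* First word: mask off the bits below 'offset'. */
    tmp = addr[word] & (~0UL << (offset % SKETCH_BITS_PER_LONG));
    if (tmp == 0) {
        word++;
        /* Unrolled scan: skip four all-zero words per iteration. */
        while (word + 4 <= nwords &&
               (addr[word] | addr[word + 1] |
                addr[word + 2] | addr[word + 3]) == 0) {
            word += 4;
        }
        /* Tail, or the group that contained a nonzero word. */
        while (word < nwords && addr[word] == 0) {
            word++;
        }
        if (word == nwords) {
            return size;
        }
        tmp = addr[word];
    }

    /* 'tmp' is nonzero here; find its lowest set bit. */
    bit = 0;
    while (!(tmp & 1UL)) {
        tmp >>= 1;
        bit++;
    }
    bit += word * SKETCH_BITS_PER_LONG;
    return bit < size ? bit : size;
}
```

The unrolled loop has essentially no setup cost, which is the property
that matters for short or mostly dirty bitmaps.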

Paolo
