Richard Henderson <richard.hender...@linaro.org> writes:

> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
>
> RFC because I've not benchmarked this on real hw, only run it
> through qemu for validation.
>
<snip>
>
> +#ifdef CONFIG_SVE_OPT
> +static unsigned accel_index;
> +static void __attribute__((constructor)) init_accel(void)
> +{
> +    accel_index = (cpuinfo & CPUINFO_SVE ? 2 : 1);
> +    buffer_is_zero_accel = accel_table[accel_index];
> +}

This really needs to be:

-    accel_index = (cpuinfo & CPUINFO_SVE ? 2 : 1);
+    unsigned info = cpuinfo_init();
+    accel_index = (info & CPUINFO_SVE ? 2 : 1);

because otherwise you are relying on constructor initialisation order
and on the Graviton 3 I built on it wasn't detecting the SVE.

With that I get this from
"perf record ./tests/unit/test-bufferiszero -m thorough":

  51.17%  test-bufferisze  test-bufferiszero  [.] buffer_is_zero_sve
  18.92%  test-bufferisze  test-bufferiszero  [.] buffer_is_zero_simd
  18.02%  test-bufferisze  test-bufferiszero  [.] buffer_is_zero_int_ge256
   7.67%  test-bufferisze  test-bufferiszero  [.] buffer_is_zero_ool
   4.09%  test-bufferisze  test-bufferiszero  [.] test_1

but as I mentioned before it would be nice to have a proper benchmark
for the buffer utils as I'm sure the unit test would be prone to noise.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
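
To illustrate the constructor ordering point, here is a minimal standalone
sketch, not the actual QEMU code. It assumes GCC/Clang constructor
priorities (lower numbers run first) to force the "wrong" order on purpose.
The names cpuinfo_init_sketch, probe_hw and get_accel_index are hypothetical
stand-ins, and the probe is hard-wired to report SVE. Because init_accel
calls the idempotent init function rather than reading the cached variable,
it gets the right answer even though it runs before the cpuinfo constructor.

```c
#include <stdio.h>

#define CPUINFO_ALWAYS (1u << 0)   /* always set, so 0 means "not probed yet" */
#define CPUINFO_SVE    (1u << 1)

static unsigned cpuinfo_cache;     /* stand-in for the global cpuinfo word */

/* Hypothetical probe; a real one would query the host (e.g. hwcaps). */
static unsigned probe_hw(void)
{
    return CPUINFO_SVE;
}

/* Idempotent: safe to call from any constructor, in any order. */
unsigned cpuinfo_init_sketch(void)
{
    unsigned info = cpuinfo_cache;

    if (info == 0) {
        info = CPUINFO_ALWAYS | probe_hw();
        cpuinfo_cache = info;
    }
    return info;
}

/* Priority 102: deliberately runs AFTER init_accel below. */
static void __attribute__((constructor(102))) late_cpuinfo_ctor(void)
{
    cpuinfo_init_sketch();
}

static unsigned accel_index;

/* Priority 101: runs first, while cpuinfo_cache is still 0. */
static void __attribute__((constructor(101))) init_accel(void)
{
    /* Reading cpuinfo_cache directly here would see 0 and pick index 1. */
    unsigned info = cpuinfo_init_sketch();

    accel_index = (info & CPUINFO_SVE) ? 2 : 1;
}

unsigned get_accel_index(void)
{
    return accel_index;
}
```

Swapping the call in init_accel for a direct read of cpuinfo_cache should
reproduce the misdetection under these forced priorities.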