Package: volk
Version: 2.0.0-1
Severity: important
X-debbugs-cc: mmuel...@gnuradio.org
Hi,

While raspbian and Debian armhf have different baselines, what they have in common is that neon is not part of the baseline configuration but is present on a large proportion of the systems people use in practice. (For Debian armhf I suspect nearly all users have neon; for raspbian the share is lower because we still have Pi 1/Pi Zero users around.) So where upstream supports runtime detection of neon, that runtime detection should be enabled and used.

Back in 2015 I disabled neon in raspbian's volk package. I can't remember why, but I suspect it was because at the time I had no means of determining whether the package had runtime CPU detection.

alle_die_mit_der from gnuradio upstream came into #raspbian on IRC (to ask about options for building stuff) and I took the opportunity to talk about the issue of runtime CPU detection. He guided me on how to test volk (quotes below), and I therefore decided to revert our raspbian neon-disabling changes and build a package for testing on raspbian bullseye.
<plugwash> The volk package in raspbian currently has neon disabled. from what you have said I strongly suspect it could be re-enabled but before I actually re-enable it I need a test plan that I can use to make sure i'm not breaking anything.
<alle_die_mit_der> VOLK has a unit test for every single "kernel"
<alle_die_mit_der> `make test` is your friend :)
<plugwash> is there a way to run the tests against an installed version of volk?
<alle_die_mit_der> yeah
<alle_die_mit_der> `volk_profile` essentially does the same, while benchmarking them
<--snip-->
<plugwash> does volk_profile use the same runtime cpu detection as normal use of volk?
<alle_die_mit_der> yes
<alle_die_mit_der> it should
<--snip-->
<alle_die_mit_der> you can query the runtime-available platforms with `volk-config-info --avail-machines`
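
As a rough illustration of the check suggested in the quotes above, here is a minimal Python sketch (not part of volk) that compares what the CPU reports with what volk reports. It assumes that `volk-config-info --avail-machines` prints machine names separated by semicolons and/or whitespace and that neon-capable machines have "neon" in their name; both are assumptions on my part, and the helper names are hypothetical. The "Features" line with a "neon" flag is what 32-bit ARM kernels expose in /proc/cpuinfo.

```python
import re
import shutil
import subprocess


def cpu_has_neon(cpuinfo_text):
    """Return True if a 'Features' line of /proc/cpuinfo lists 'neon'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if "neon" in value.split():
                return True
    return False


def parse_machines(output):
    """Split tool output into machine names.

    Assumption: names are separated by ';' and/or whitespace.
    """
    return [name for name in re.split(r"[;\s]+", output) if name]


def volk_offers_neon(machines):
    """Return True if any reported machine name mentions neon (assumed naming)."""
    return any("neon" in name.lower() for name in machines)


if __name__ == "__main__":
    # Only meaningful on a Linux system with /proc mounted.
    with open("/proc/cpuinfo") as fh:
        print("CPU reports neon:", cpu_has_neon(fh.read()))

    tool = shutil.which("volk-config-info")
    if tool is None:
        print("volk-config-info not found in PATH")
    else:
        result = subprocess.run(
            [tool, "--avail-machines"],
            capture_output=True, text=True, check=True,
        )
        machines = parse_machines(result.stdout)
        print("volk machines:", machines)
        print("volk offers neon:", volk_offers_neon(machines))
```

If the CPU reports neon but volk does not list any neon machine, the package was most likely built without the neon kernels, which is the situation described in this report.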
However, to my surprise I discovered that the package built on raspbian from unmodified Debian sources didn't have any neon support either. The CMake scripts were failing to detect neon because they were not using -mfpu=neon when building test programs. I have confirmed this is not a raspbian-specific issue, and it seems to have been this way since version 2.0.0, which makes it a regression between buster and bullseye.

I modified the CMake scripts to use -mfpu=neon when detecting neon support, but then the build itself failed with:
cd /volk-2.4.1.new/obj-arm-linux-gnueabihf/lib && /usr/bin/cc -DHAVE_DLFCN_H -DHAVE_FENV_H -D_GLIBCXX_USE_CXX11_ABI=1 -I/volk-2.4.1.new/kernels/volk/asm/neon -I/usr/include/orc-0.4 -I/volk-2.4.1.new/obj-arm-linux-gnueabihf/include -I/volk-2.4.1.new/include -I/volk-2.4.1.new/kernels -I/volk-2.4.1.new/obj-arm-linux-gnueabihf/lib -I/volk-2.4.1.new/lib -I/usr/include/cpu_features -O2 -g -DNDEBUG -fPIC -o CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o -c /volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s: Assembler messages:
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:11: Error: selected FPU does not support instruction -- `vmov.i32 q12,#0'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:18: Error: selected processor does not support `vld2.16 {d16-d19},[r4]!' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:22: Error: selected FPU does not support instruction -- `vsub.i16 q10,q8,q9'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:23: Error: selected processor does not support `vcge.s16 q11,q10,#0' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:24: Error: selected processor does not support `vcgt.s16 q10,q12,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:25: Error: selected processor does not support `vand.i16 q11,q8,q11' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:26: Error: selected processor does not support `vand.i16 q10,q9,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:27: Error: selected FPU does not support instruction -- `vadd.i16 q10,q11,q10'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:28: Error: selected processor does not support `vst1.16 {d20-d21},[r12]!' in ARM mode
gmake[4]: *** [lib/CMakeFiles/volk_obj.dir/build.make:1780: lib/CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o] Error 1
I could dig deeper into the build scripts to work out how to make the build system compile the neon kernels (but not the generic kernels) with -mfpu=neon, but I felt it was time to seek advice from those more familiar with the codebase than I am.