Package: volk
Version: 2.0.0-1
Severity: important
X-debbugs-cc: mmuel...@gnuradio.org

Hi,

While raspbian and Debian armhf have different baselines, what they share is
that neon is not part of the baseline configuration but is present on a large
proportion of the systems people use in practice (for Debian armhf I suspect
nearly all users have neon; for raspbian it's fewer because we still have
pi1/pi0 users around). So where upstream supports runtime detection of neon,
that runtime detection should be enabled and used.

Back in 2015 I disabled neon in raspbian's volk package. I can't remember why,
but I suspect it was because at the time I had no means of determining whether
the package had runtime CPU detection.

alle_die_mit_der from gnuradio upstream came into #raspbian on irc (to ask about
options for building stuff) and I took the opportunity to talk about the issue
of runtime CPU detection. He guided me on how to test volk (quotes below) and
I thus decided to revert our raspbian neon-disabling changes and build a package
for testing on raspbian bullseye.

<plugwash> The volk package in raspbian currently has neon disabled. from what 
you have said I strongly suspect it could be re-enabled but before I actually 
re-enable it I need a test plan that I can use to make sure i'm not breaking anything.
<alle_die_mit_der> VOLK has a unit test for every single "kernel"
<alle_die_mit_der> `make test` is your friend :)
<plugwash> is there a way to run the tests against an installed version of volk?
<alle_die_mit_der> yeah
<alle_die_mit_der> `volk_profile` essentially does the same, while benchmarking 
them
<--snip-->
<plugwash> does volk_profile use the same runtime cpu detection as normal use 
of volk?
<alle_die_mit_der> yes
<alle_die_mit_der> it should
<--snip-->
<alle_die_mit_der> you can query the runtime-available platforms with 
`volk-config-info --avail-machines`

However to my surprise I discovered that the package built on raspbian from
unmodified Debian sources didn't have any neon support either. I discovered that
the CMake scripts were failing to detect Neon because they were not using
-mfpu=neon when building test programs.
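
For illustration, a configure-time probe that passes -mfpu=neon could be
written with stock CMake roughly as below. This is a sketch, not volk's actual
detection code: the variable name HYPOTHETICAL_HAVE_NEON and the test program
are assumptions, and whether -mfpu=neon alone is enough depends on the
toolchain's default architecture.

```cmake
include(CheckCSourceCompiles)

# Temporarily add -mfpu=neon so the probe is compiled with NEON enabled,
# rather than with only the distribution's baseline flags.
set(SAVED_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}")
set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} -mfpu=neon")

# Hypothetical probe: a tiny program using NEON intrinsics.
check_c_source_compiles("
#include <arm_neon.h>
int main(void) {
    int32x4_t v = vdupq_n_s32(0);
    return vgetq_lane_s32(v, 0);
}
" HYPOTHETICAL_HAVE_NEON)

# Restore the flags so later checks are unaffected.
set(CMAKE_REQUIRED_FLAGS "${SAVED_REQUIRED_FLAGS}")
```

The flag is only applied to the probe here; it does not by itself change how
any source file in the project is compiled.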

I have confirmed this is not a raspbian-specific issue, and it seems to have
been this way since version 2.0.0, which makes it a regression between buster
and bullseye.

I modified the cmake scripts to use -mfpu=neon when detecting neon support,
but then the build itself failed with:

cd /volk-2.4.1.new/obj-arm-linux-gnueabihf/lib && /usr/bin/cc -DHAVE_DLFCN_H 
-DHAVE_FENV_H -D_GLIBCXX_USE_CXX11_ABI=1 -I/volk-2.4.1.new/kernels/volk/asm/neon 
-I/usr/include/orc-0.4 -I/volk-2.4.1.new/obj-arm-linux-gnueabihf/include 
-I/volk-2.4.1.new/include -I/volk-2.4.1.new/kernels 
-I/volk-2.4.1.new/obj-arm-linux-gnueabihf/lib -I/volk-2.4.1.new/lib 
-I/usr/include/cpu_features -O2 -g -DNDEBUG -fPIC -o 
CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o 
-c /volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s: 
Assembler messages:
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:11: 
Error: selected FPU does not support instruction -- `vmov.i32 q12,#0'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:18: 
Error: selected processor does not support `vld2.16 {d16-d19},[r4]!' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:22: 
Error: selected FPU does not support instruction -- `vsub.i16 q10,q8,q9'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:23: 
Error: selected processor does not support `vcge.s16 q11,q10,#0' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:24: 
Error: selected processor does not support `vcgt.s16 q10,q12,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:25: 
Error: selected processor does not support `vand.i16 q11,q8,q11' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:26: 
Error: selected processor does not support `vand.i16 q10,q9,q10' in ARM mode
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:27: 
Error: selected FPU does not support instruction -- `vadd.i16 q10,q11,q10'
/volk-2.4.1.new/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s:28: 
Error: selected processor does not support `vst1.16 {d20-d21},[r12]!' in ARM 
mode
gmake[4]: *** [lib/CMakeFiles/volk_obj.dir/build.make:1780: 
lib/CMakeFiles/volk_obj.dir/__/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s.o]
 Error 1

I could go digging deeper into the build scripts to see if I can figure out how
to make the buildsystem build the neon kernels (but not the generic kernels)
with -mfpu=neon, but I felt it was time to seek advice from those more familiar
with the codebase than me.
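
As a rough sketch of the kind of change meant here, CMake can attach a flag to
individual source files rather than to a whole target. The list variable below
is hypothetical, and how volk's build scripts actually assemble their source
lists is not covered here, so the fragment would need adapting.

```cmake
# Hypothetical list of NEON-only sources; the path is taken from the failing
# compile command and assumes volk is the top-level CMake project.
set(hypothetical_neon_sources
    ${CMAKE_SOURCE_DIR}/kernels/volk/asm/neon/volk_16i_max_star_horizontal_16i.s
)

# Add -mfpu=neon for these files only; generic kernels keep the baseline
# flags. Source file properties are by default only seen by targets created
# in the same directory, so this would need to sit in the CMakeLists.txt that
# defines the target compiling these files.
set_source_files_properties(${hypothetical_neon_sources}
    PROPERTIES COMPILE_FLAGS "-mfpu=neon")
```

Scoping the flag per file matters because the generic kernels must still run
on CPUs without neon, which is what runtime detection relies on.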
