On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote: > > On 21/03/2025 14:05, Christophe Lyon wrote: > > On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists) > > <richard.earns...@arm.com> wrote: > >> > >> On 20/03/2025 16:15, Christophe Lyon wrote: > >>> Depending on if/how the testing flags are overridden, the first value > >>> we try("") might not do what we want. > >>> > >>> For instance, if the whole testsuite is executed with > >>> (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp > >>> > >>> bf16_neon_ok is first compiled with > >>> (A) (B) > >>> where B = -mcpu=unset -march=armv8.2-a+bf16 > >>> > >>> which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c > >>> is compiled with: > >>> (A) (C) (B) > >>> where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > >>> -mfpu=neon-fp16 -mfp16-format=ieee > >>> > >>> because advsimd-intrinsics.exp has set additional_flags to (C) > >>> via arm_neon_fp16_ok > >>> > >>> So the testcase is compiled with > >>> [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16 > >>> (thus -mfpu=neon-fp16) and bf16 support is disabled. > >>> > >>> The patch replaces "" with -mfpu=auto which matches the intended > >>> effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the > >>> testcase is now compiled with > >>> (A) (C) -mfpu=auto (B) > >>> > >>> However, since this effective-target is also used on aarch64 (which > >>> does not support -mfpu=auto), we do this only on arm. > >>> > >>> This patch improves coverage, and makes > >>> v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are > >>> overridden as described above (e.g. for M-profile). > >>> > >>> gcc/testsuite/ > >>> * lib/target-supports.exp > >>> (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): > >>> Conditionally use -mfpu=auto. > >>> --- > >>> gcc/testsuite/lib/target-supports.exp | 9 ++++++++- > >>> 1 file changed, 8 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/gcc/testsuite/lib/target-supports.exp > >>> b/gcc/testsuite/lib/target-supports.exp > >>> index e2622a445c5..09b16a14024 100644 > >>> --- a/gcc/testsuite/lib/target-supports.exp > >>> +++ b/gcc/testsuite/lib/target-supports.exp > >>> @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags } { > >>> proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } { > >>> global et_arm_v8_2a_bf16_neon_flags > >>> set et_arm_v8_2a_bf16_neon_flags "" > >>> + set fpu_auto "" > >>> > >>> if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } { > >>> return 0; > >>> } > >>> > >>> - foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" > >>> "-mfloat-abi=hard -mfpu=neon-fp-armv8" } { > >>> + if { [istarget arm*-*-*] } { > >>> + set fpu_auto "-mfpu=auto" > >>> + } > >>> + > >>> + foreach flags [list "$fpu_auto" \ > >> > >> Shouldn't we try first with "", even on Arm? Thus > >> foreach flags [list "" "$fpu_auto" \ > >> ... > >> > > I don't think so, that's why I tried to explain above. > > "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok > > (this is (A) (B) above, where the important parts are: > > -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16 > > (so -mfpu is set to the toolchain's default) > > That's never going to work reliably. We need to check, somewhere, the full > set of options we intend to pass to the compilation. We can't assume that > separately testing if A is ok and B is ok => A + B is ok. >
Hmmm I think I raised that problem years ago, because of the way the test system is designed... > > > > but then the actual testcase is compiled with additional flags (C) > > defined by the test driver using arm_neon_fp16_ok > > C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > > -mfpu=neon-fp16 -mfp16-format=ieee > > > > so the relevant parts of (A) (C) (B) are: > > -march=armv7-m -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16 > > -mcpu=unset -march=armv8.2-a+bf16 > > which can be simplified into > > -mfpu=neon-fp16 -march=armv8.2-a+bf16 > > > > I think we need to start with -mfpu=auto instead of "", so that when > > -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu. > > > > Ideally a test shouldn't add any options in some test runs; that way we add > some 'base configuration' coverage. We obviously can't do that everywhere > and there may still be cases where the system can't test anything useful at > all (hopefully only when linking, or more is needed). > Not sure to follow? If a test wants to check that a given feature works as expected, it has to use the appropriate options, doesn't it? > > > > > >> > >>> + "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \ > >>> + "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] { > >>> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok > >>> object { > >>> #include <arm_neon.h> > >>> #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC) > >> > >> In fact, since aarch64 doesn't support -mfpu at all, only "" is ever going > >> to work here, so perhaps we can recast all this code, perhaps along the > >> lines of that in check_effective_target_arm_neon_h_ok_nocache so that we > >> don't need to actually try to include arm_neon.h at all while figuring out > >> the options. > >> > > > > Maybe. > > > > OTOH, actually including arm_neon.h is guaranteed to give the right > > answer to the question "which flags do we need to be able to include > > arm_neon.h?", rather than using several conditions that are supposed > > to match what arm_neon.h does? > > > > Working out whether arm_neon.h can be included can be done with > arm_neon_h_ok. That's currently Arm only, but it's trivial to extend it to > aarch64. Once we have that, we can write some rules that build on that base > to get other features that we might desire. But we must, at some point > check, rather than assume, that combining options from different rules will > result in something sane. > I suspect most tests which currently use arm*neon* effective targets actually include arm_neon.h, they should arm_neon_h_ok. But I'm not sure what we can do to make sure, rather than assume that A + C + B does what we want, given that: - A is defined by the user who runs the testsuite (via RUNTESTFLAGS / target_board) - C is defined by the test harness (advsimd-intrinsics.exp) - B is defined as needed by individual tests At least we have no control on A. We can create a new directory, where the .exp test harness would not use additional options, and move tests which need effective-target-arm-arch* there. (that is 189 files out of 541 under advsimd-intrinsics/) Thanks, Christophe > R. > > > Thanks, > > > > Christophe > > > >> R. >