On Fri, 21 Mar 2025 at 18:30, Christophe Lyon <christophe.l...@linaro.org> wrote: > > On Fri, 21 Mar 2025 at 16:51, Richard Earnshaw (lists) > <richard.earns...@arm.com> wrote: > > > > On 21/03/2025 15:15, Christophe Lyon wrote: > > > On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists) > > > <richard.earns...@arm.com> wrote: > > >> > > >> On 21/03/2025 14:05, Christophe Lyon wrote: > > >>> On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists) > > >>> <richard.earns...@arm.com> wrote: > > >>>> > > >>>> On 20/03/2025 16:15, Christophe Lyon wrote: > > >>>>> Depending on if/how the testing flags are overridden, the first value > > >>>>> we try("") might not do what we want. > > >>>>> > > >>>>> For instance, if the whole testsuite is executed with > > >>>>> (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp > > >>>>> > > >>>>> bf16_neon_ok is first compiled with > > >>>>> (A) (B) > > >>>>> where B = -mcpu=unset -march=armv8.2-a+bf16 > > >>>>> > > >>>>> which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c > > >>>>> is compiled with: > > >>>>> (A) (C) (B) > > >>>>> where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > > >>>>> -mfpu=neon-fp16 -mfp16-format=ieee > > >>>>> > > >>>>> because advsimd-intrinsics.exp has set additional_flags to (C) > > >>>>> via arm_neon_fp16_ok > > >>>>> > > >>>>> So the testcase is compiled with > > >>>>> [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16 > > >>>>> (thus -mfpu=neon-fp16) and bf16 support is disabled. > > >>>>> > > >>>>> The patch replaces "" with -mfpu=auto which matches the intended > > >>>>> effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the > > >>>>> testcase is now compiled with > > >>>>> (A) (C) -mfpu=auto (B) > > >>>>> > > >>>>> However, since this effective-target is also used on aarch64 (which > > >>>>> does not support -mfpu=auto), we do this only on arm. > > >>>>> > > >>>>> This patch improves coverage, and makes > > >>>>> v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are > > >>>>> overridden as described above (e.g. for M-profile). > > >>>>> > > >>>>> gcc/testsuite/ > > >>>>> * lib/target-supports.exp > > >>>>> (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): > > >>>>> Conditionally use -mfpu=auto. > > >>>>> --- > > >>>>> gcc/testsuite/lib/target-supports.exp | 9 ++++++++- > > >>>>> 1 file changed, 8 insertions(+), 1 deletion(-) > > >>>>> > > >>>>> diff --git a/gcc/testsuite/lib/target-supports.exp > > >>>>> b/gcc/testsuite/lib/target-supports.exp > > >>>>> index e2622a445c5..09b16a14024 100644 > > >>>>> --- a/gcc/testsuite/lib/target-supports.exp > > >>>>> +++ b/gcc/testsuite/lib/target-supports.exp > > >>>>> @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags > > >>>>> } { > > >>>>> proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } { > > >>>>> global et_arm_v8_2a_bf16_neon_flags > > >>>>> set et_arm_v8_2a_bf16_neon_flags "" > > >>>>> + set fpu_auto "" > > >>>>> > > >>>>> if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } { > > >>>>> return 0; > > >>>>> } > > >>>>> > > >>>>> - foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" > > >>>>> "-mfloat-abi=hard -mfpu=neon-fp-armv8" } { > > >>>>> + if { [istarget arm*-*-*] } { > > >>>>> + set fpu_auto "-mfpu=auto" > > >>>>> + } > > >>>>> + > > >>>>> + foreach flags [list "$fpu_auto" \ > > >>>> > > >>>> Shouldn't we try first with "", even on Arm? Thus > > >>>> foreach flags [list "" "$fpu_auto" \ > > >>>> ... > > >>>> > > >>> I don't think so, that's why I tried to explain above. > > >>> "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok > > >>> (this is (A) (B) above, where the important parts are: > > >>> -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16 > > >>> (so -mfpu is set to the toolchain's default) > > >> > > >> That's never going to work reliably. We need to check, somewhere, the > > >> full set of options we intend to pass to the compilation. We can't > > >> assume that separately testing if A is ok and B is ok => A + B is ok. > > >> > > > > > > Hmmm I think I raised that problem years ago, because of the way the > > > test system is designed... > > > > > >>> > > >>> but then the actual testcase is compiled with additional flags (C) > > >>> defined by the test driver using arm_neon_fp16_ok > > >>> C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > > >>> -mfpu=neon-fp16 -mfp16-format=ieee > > >>> > > >>> so the relevant parts of (A) (C) (B) are: > > >>> -march=armv7-m -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16 > > >>> -mcpu=unset -march=armv8.2-a+bf16 > > >>> which can be simplified into > > >>> -mfpu=neon-fp16 -march=armv8.2-a+bf16 > > >>> > > >>> I think we need to start with -mfpu=auto instead of "", so that when > > >>> -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu. > > >>> > > >> > > >> Ideally a test shouldn't add any options in some test runs; that way we > > >> add some 'base configuration' coverage. We obviously can't do that > > >> everywhere and there may still be cases where the system can't test > > >> anything useful at all (hopefully only when linking, or more is needed). > > >> > > > > > > Not sure to follow? If a test wants to check that a given feature > > > works as expected, it has to use the appropriate options, doesn't it? > > > > Not if the base configuration for the run (from the compiler config or from > > the RUNTEST flags) already supports that feature. > > > > OK, but isn't this error-prone? > The effective-target unconditionally adds -mcpu=unset -march=XXX > already, so relying on the implicit default value of -mcpu seems typo, I meant -mfpu :-)
I was also thinking that since -cpu=unset was introduced to help testing, making sure -march=XXX does what it intended by test, can't -mfpu=auto be considered "equivalent" to -mcpu=unset, but for fpu? (and in fact, maybe put it along with -mcpu=unset, rather than in the list of options why try) > inconsistent? > The would also lead to confusion when comparing two logs with the > exact same options, but different result because the toolchain was > built with a different --with-fpu= option (which IIRC does not appear > in gcc.log). > > Not sure what accepting "" rather than -mfpu=auto gives us? Given we > are checking which flags (-mfloat-abi for instance) are need to > support -march=XXX, don't we want to make sure the corresponding fpu > is in use? > > > > > > >>> > > >>> > > >>>> > > >>>>> + "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \ > > >>>>> + "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] { > > >>>>> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok > > >>>>> object { > > >>>>> #include <arm_neon.h> > > >>>>> #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC) > > >>>> > > >>>> In fact, since aarch64 doesn't support -mfpu at all, only "" is ever > > >>>> going to work here, so perhaps we can recast all this code, perhaps > > >>>> along the lines of that in > > >>>> check_effective_target_arm_neon_h_ok_nocache so that we don't need to > > >>>> actually try to include arm_neon.h at all while figuring out the > > >>>> options. > > >>>> > > >>> > > >>> Maybe. > > >>> > > >>> OTOH, actually including arm_neon.h is guaranteed to give the right > > >>> answer to the question "which flags do we need to be able to include > > >>> arm_neon.h?", rather than using several conditions that are supposed > > >>> to match what arm_neon.h does? > > >>> > > >> > > >> Working out whether arm_neon.h can be included can be done with > > >> arm_neon_h_ok. That's currently Arm only, but it's trivial to extend it > > >> to aarch64. Once we have that, we can write some rules that build on > > >> that base to get other features that we might desire. But we must, at > > >> some point check, rather than assume, that combining options from > > >> different rules will result in something sane. > > >> > > > > > > I suspect most tests which currently use arm*neon* effective targets > > > actually include arm_neon.h, they should arm_neon_h_ok. > > > > > > But I'm not sure what we can do to make sure, rather than assume that > > > A + C + B does what we want, given that: > > > - A is defined by the user who runs the testsuite (via RUNTESTFLAGS / > > > target_board) > > > - C is defined by the test harness (advsimd-intrinsics.exp) > > > - B is defined as needed by individual tests > > > > > > At least we have no control on A. > > > > We have no control of A, but any require_... tests will add A to the option > > list they test; so we can override them if needed and check the override > > works. > > Yes, that's already what currently happens. > > > C is more complex - we perhaps need some bespoke checks in > > advsimd-intrinsics.exp that can be used to ensure that it's base flags are > > added to any tests we run. I need to think about that one a bit; perhaps > > we can arrange to add these flags to A before the individual tests run > > their additional checks. > > > what is the advantage compared to replacing "" with -mfpu=auto, which > seems simpler? > > > > > > > We can create a new directory, where the .exp test harness would not > > > use additional options, and move tests which need > > > effective-target-arm-arch* there. > > > (that is 189 files out of 541 under advsimd-intrinsics/) > > > > > > > Those tests might benefit from a refactor anyway - the common arm/aarch64 > > tests should probably live in dg.target/aarch, and only the > > aarch64-specific tests should be in dg.target/aarch64. > > > Indeed, but isn't this an orthogonal problem? > > Thanks, > > Christophe > > > R. > > > > > Thanks, > > > > > > Christophe > > > > > > > > >> R. > > >> > > >>> Thanks, > > >>> > > >>> Christophe > > >>> > > >>>> R. > > >> > >