On Mon, 24 Mar 2025 at 15:13, Richard Earnshaw (lists) <richard.earns...@arm.com> wrote: > > On 21/03/2025 17:30, Christophe Lyon wrote: > > On Fri, 21 Mar 2025 at 16:51, Richard Earnshaw (lists) > > <richard.earns...@arm.com> wrote: > >> > >> On 21/03/2025 15:15, Christophe Lyon wrote: > >>> On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists) > >>> <richard.earns...@arm.com> wrote: > >>>> > >>>> On 21/03/2025 14:05, Christophe Lyon wrote: > >>>>> On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists) > >>>>> <richard.earns...@arm.com> wrote: > >>>>>> > >>>>>> On 20/03/2025 16:15, Christophe Lyon wrote: > >>>>>>> Depending on if/how the testing flags are overridden, the first value > >>>>>>> we try("") might not do what we want. > >>>>>>> > >>>>>>> For instance, if the whole testsuite is executed with > >>>>>>> (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp > >>>>>>> > >>>>>>> bf16_neon_ok is first compiled with > >>>>>>> (A) (B) > >>>>>>> where B = -mcpu=unset -march=armv8.2-a+bf16 > >>>>>>> > >>>>>>> which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c > >>>>>>> is compiled with: > >>>>>>> (A) (C) (B) > >>>>>>> where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > >>>>>>> -mfpu=neon-fp16 -mfp16-format=ieee > >>>>>>> > >>>>>>> because advsimd-intrinsics.exp has set additional_flags to (C) > >>>>>>> via arm_neon_fp16_ok > >>>>>>> > >>>>>>> So the testcase is compiled with > >>>>>>> [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16 > >>>>>>> (thus -mfpu=neon-fp16) and bf16 support is disabled. > >>>>>>> > >>>>>>> The patch replaces "" with -mfpu=auto which matches the intended > >>>>>>> effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the > >>>>>>> testcase is now compiled with > >>>>>>> (A) (C) -mfpu=auto (B) > >>>>>>> > >>>>>>> However, since this effective-target is also used on aarch64 (which > >>>>>>> does not support -mfpu=auto), we do this only on arm. > >>>>>>> > >>>>>>> This patch improves coverage, and makes > >>>>>>> v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are > >>>>>>> overridden as described above (e.g. for M-profile). > >>>>>>> > >>>>>>> gcc/testsuite/ > >>>>>>> * lib/target-supports.exp > >>>>>>> (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): > >>>>>>> Conditionally use -mfpu=auto. > >>>>>>> --- > >>>>>>> gcc/testsuite/lib/target-supports.exp | 9 ++++++++- > >>>>>>> 1 file changed, 8 insertions(+), 1 deletion(-) > >>>>>>> > >>>>>>> diff --git a/gcc/testsuite/lib/target-supports.exp > >>>>>>> b/gcc/testsuite/lib/target-supports.exp > >>>>>>> index e2622a445c5..09b16a14024 100644 > >>>>>>> --- a/gcc/testsuite/lib/target-supports.exp > >>>>>>> +++ b/gcc/testsuite/lib/target-supports.exp > >>>>>>> @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags > >>>>>>> } { > >>>>>>> proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } { > >>>>>>> global et_arm_v8_2a_bf16_neon_flags > >>>>>>> set et_arm_v8_2a_bf16_neon_flags "" > >>>>>>> + set fpu_auto "" > >>>>>>> > >>>>>>> if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } { > >>>>>>> return 0; > >>>>>>> } > >>>>>>> > >>>>>>> - foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" > >>>>>>> "-mfloat-abi=hard -mfpu=neon-fp-armv8" } { > >>>>>>> + if { [istarget arm*-*-*] } { > >>>>>>> + set fpu_auto "-mfpu=auto" > >>>>>>> + } > >>>>>>> + > >>>>>>> + foreach flags [list "$fpu_auto" \ > >>>>>> > >>>>>> Shouldn't we try first with "", even on Arm? Thus > >>>>>> foreach flags [list "" "$fpu_auto" \ > >>>>>> ... > >>>>>> > >>>>> I don't think so, that's why I tried to explain above. > >>>>> "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok > >>>>> (this is (A) (B) above, where the important parts are: > >>>>> -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16 > >>>>> (so -mfpu is set to the toolchain's default) > >>>> > >>>> That's never going to work reliably. We need to check, somewhere, the > >>>> full set of options we intend to pass to the compilation. We can't > >>>> assume that separately testing if A is ok and B is ok => A + B is ok. > >>>> > >>> > >>> Hmmm I think I raised that problem years ago, because of the way the > >>> test system is designed... > >>> > >>>>> > >>>>> but then the actual testcase is compiled with additional flags (C) > >>>>> defined by the test driver using arm_neon_fp16_ok > >>>>> C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a > >>>>> -mfpu=neon-fp16 -mfp16-format=ieee > >>>>> > >>>>> so the relevant parts of (A) (C) (B) are: > >>>>> -march=armv7-m -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16 > >>>>> -mcpu=unset -march=armv8.2-a+bf16 > >>>>> which can be simplified into > >>>>> -mfpu=neon-fp16 -march=armv8.2-a+bf16 > >>>>> > >>>>> I think we need to start with -mfpu=auto instead of "", so that when > >>>>> -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu. > >>>>> > >>>> > >>>> Ideally a test shouldn't add any options in some test runs; that way we > >>>> add some 'base configuration' coverage. We obviously can't do that > >>>> everywhere and there may still be cases where the system can't test > >>>> anything useful at all (hopefully only when linking, or more is needed). > >>>> > >>> > >>> Not sure to follow? If a test wants to check that a given feature > >>> works as expected, it has to use the appropriate options, doesn't it? > >> > >> Not if the base configuration for the run (from the compiler config or > >> from the RUNTEST flags) already supports that feature. > >> > > > > OK, but isn't this error-prone? > > This whole overriding options thing is error prone, unfortunately. I've > sometimes wondered if we should have some directories in the testsuite where > we just suppress all options passed by the user to the framework (so we only > need to consider the options configured into the compiler itself). But even > then we have to think about header file compatibility, so I'm not sure even > that would make things much better. Also, such tests would have to be > limited to compile only (we can't assume the right libs will exist for > linking, much less running the code). > > > > The effective-target unconditionally adds -mcpu=unset -march=XXX > > already, so relying on the implicit default value of -mcpu seems > > inconsistent? > > So perhaps we just have the wrong effective target? > If we consider the one this patch modifies: arm_v8_2a_bf16_neon_ok AFAIU, it looks for the right set of flags needed to have __ARM_FEATURE_BF16_VECTOR_ARITHMETIC defined, so it tries -mfloat-abi / -mfpu combinations in addition to -mcpu=unset -march=armv8.2-a+bf16 It seems consistent with the other numerous effective targets we have?
> > The would also lead to confusion when comparing two logs with the > > exact same options, but different result because the toolchain was > > built with a different --with-fpu= option (which IIRC does not appear > > in gcc.log). > > > > Not sure what accepting "" rather than -mfpu=auto gives us? Given we > > are checking which flags (-mfloat-abi for instance) are need to > > support -march=XXX, don't we want to make sure the corresponding fpu > > is in use? > > See below. > > > > >>> > >>>>> > >>>>> > >>>>>> > >>>>>>> + "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \ > >>>>>>> + "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] { > >>>>>>> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok > >>>>>>> object { > >>>>>>> #include <arm_neon.h> > >>>>>>> #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC) > >>>>>> > >>>>>> In fact, since aarch64 doesn't support -mfpu at all, only "" is ever > >>>>>> going to work here, so perhaps we can recast all this code, perhaps > >>>>>> along the lines of that in > >>>>>> check_effective_target_arm_neon_h_ok_nocache so that we don't need to > >>>>>> actually try to include arm_neon.h at all while figuring out the > >>>>>> options. > >>>>>> > >>>>> > >>>>> Maybe. > >>>>> > >>>>> OTOH, actually including arm_neon.h is guaranteed to give the right > >>>>> answer to the question "which flags do we need to be able to include > >>>>> arm_neon.h?", rather than using several conditions that are supposed > >>>>> to match what arm_neon.h does? > >>>>> > >>>> > >>>> Working out whether arm_neon.h can be included can be done with > >>>> arm_neon_h_ok. That's currently Arm only, but it's trivial to extend it > >>>> to aarch64. Once we have that, we can write some rules that build on > >>>> that base to get other features that we might desire. But we must, at > >>>> some point check, rather than assume, that combining options from > >>>> different rules will result in something sane. > >>>> > >>> > >>> I suspect most tests which currently use arm*neon* effective targets > >>> actually include arm_neon.h, they should arm_neon_h_ok. > >>> > >>> But I'm not sure what we can do to make sure, rather than assume that > >>> A + C + B does what we want, given that: > >>> - A is defined by the user who runs the testsuite (via RUNTESTFLAGS / > >>> target_board) > >>> - C is defined by the test harness (advsimd-intrinsics.exp) > >>> - B is defined as needed by individual tests > >>> > >>> At least we have no control on A. > >> > >> We have no control of A, but any require_... tests will add A to the > >> option list they test; so we can override them if needed and check the > >> override works. > > > > Yes, that's already what currently happens. > > > >> C is more complex - we perhaps need some bespoke checks in > >> advsimd-intrinsics.exp that can be used to ensure that it's base flags are > >> added to any tests we run. I need to think about that one a bit; perhaps > >> we can arrange to add these flags to A before the individual tests run > >> their additional checks. > >> > > what is the advantage compared to replacing "" with -mfpu=auto, which > > seems simpler? > > Consider > > -mcpu=cortex-a53 -march=armv8-a+crc -mfpu=neon-fp-armv8 -mfloat-abi=softfp > where would that list come from? Is that user-supplied overrides, or a concatenation of effective targets? > which works just fine. > > If you add -mfpu=auto (overriding the -mfpu=noen-fp-armv8) to that you get > > cc1: warning: switch ‘-mcpu=cortex-a53’ conflicts with switch > ‘-march=armv8-a+crc’ > > because the underlying arch flags don't match any more. If -mtpu=auto would be added by an effective target (like in this patch for instance), it would also add -mcpu=unset before the relevant -march, so that particular sequence shouldn't be generated? -march=armv8-a+crc is added by arm_crc_ok, but it also adds -mcpu=unset, so that would cancel -mcpu=cortex-a53 -mfpu=neon-fp-armv8 can be added by several effective targets, all of which add -mcpu=unset already and if -mfpu=auto alone would cause an error, they would iterate and could select -mfpu=neon-fp-armv8 is it works (arm_v8_neon_ok for instance) > > It's even worse for -mfloat-abi=hard because that will error: > > cc1: error: ‘-mfloat-abi=hard’: selected architecture lacks an FPU > > > > > >>> > >>> We can create a new directory, where the .exp test harness would not > >>> use additional options, and move tests which need > >>> effective-target-arm-arch* there. > >>> (that is 189 files out of 541 under advsimd-intrinsics/) > >>> > >> > >> Those tests might benefit from a refactor anyway - the common arm/aarch64 > >> tests should probably live in dg.target/aarch, and only the > >> aarch64-specific tests should be in dg.target/aarch64. > >> > > Indeed, but isn't this an orthogonal problem? > > > > Thanks, > > > > Christophe > > > >> R. > >> > >>> Thanks, > >>> > >>> Christophe > >>> > >>> > >>>> R. > >>>> > >>>>> Thanks, > >>>>> > >>>>> Christophe > >>>>> > >>>>>> R. > >>>> > >> > > R.