On 21/03/2025 15:15, Christophe Lyon wrote:
> On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists)
> <richard.earns...@arm.com> wrote:
>>
>> On 21/03/2025 14:05, Christophe Lyon wrote:
>>> On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists)
>>> <richard.earns...@arm.com> wrote:
>>>>
>>>> On 20/03/2025 16:15, Christophe Lyon wrote:
>>>>> Depending on if/how the testing flags are overridden, the first value
>>>>> we try("") might not do what we want.
>>>>>
>>>>> For instance, if the whole testsuite is executed with
>>>>> (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp
>>>>>
>>>>> bf16_neon_ok is first compiled with
>>>>> (A) (B)
>>>>> where B = -mcpu=unset -march=armv8.2-a+bf16
>>>>>
>>>>> which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c
>>>>> is compiled with:
>>>>> (A) (C) (B)
>>>>> where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a 
>>>>> -mfpu=neon-fp16 -mfp16-format=ieee
>>>>>
>>>>> because advsimd-intrinsics.exp has set additional_flags to (C)
>>>>> via arm_neon_fp16_ok
>>>>>
>>>>> So the testcase is compiled with
>>>>> [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16
>>>>> (thus -mfpu=neon-fp16) and bf16 support is disabled.
>>>>>
>>>>> The patch replaces "" with -mfpu=auto which matches the intended
>>>>> effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the
>>>>> testcase is now compiled with
>>>>> (A) (C) -mfpu=auto (B)
>>>>>
>>>>> However, since this effective-target is also used on aarch64 (which
>>>>> does not support -mfpu=auto), we do this only on arm.
>>>>>
>>>>> This patch improves coverage, and makes
>>>>> v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are
>>>>> overridden as described above (e.g. for M-profile).
>>>>>
>>>>>       gcc/testsuite/
>>>>>       * lib/target-supports.exp
>>>>>       (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache):
>>>>>       Conditionally use -mfpu=auto.
>>>>> ---
>>>>>  gcc/testsuite/lib/target-supports.exp | 9 ++++++++-
>>>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/gcc/testsuite/lib/target-supports.exp 
>>>>> b/gcc/testsuite/lib/target-supports.exp
>>>>> index e2622a445c5..09b16a14024 100644
>>>>> --- a/gcc/testsuite/lib/target-supports.exp
>>>>> +++ b/gcc/testsuite/lib/target-supports.exp
>>>>> @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { flags } {
>>>>>  proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
>>>>>      global et_arm_v8_2a_bf16_neon_flags
>>>>>      set et_arm_v8_2a_bf16_neon_flags ""
>>>>> +    set fpu_auto ""
>>>>>
>>>>>      if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
>>>>>       return 0;
>>>>>      }
>>>>>
>>>>> -    foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" 
>>>>> "-mfloat-abi=hard -mfpu=neon-fp-armv8" } {
>>>>> +    if { [istarget arm*-*-*] } {
>>>>> +     set fpu_auto "-mfpu=auto"
>>>>> +    }
>>>>> +
>>>>> +    foreach flags [list "$fpu_auto" \
>>>>
>>>> Shouldn't we try first with "", even on Arm?  Thus
>>>>        foreach flags [list "" "$fpu_auto" \
>>>> ...
>>>>
>>> I don't think so, that's why I tried to explain above.
>>> "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok
>>> (this is (A) (B) above, where the important parts are:
>>> -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16
>>> (so -mfpu is set to the toolchain's default)
>>
>> That's never going to work reliably.  We need to check, somewhere, the full 
>> set of options we intend to pass to the compilation.  We can't assume that 
>> separately testing if A is ok and B is ok => A + B is ok.
>>
> 
> Hmmm I think I raised that problem years ago, because of the way the
> test system is designed...
> 
>>>
>>> but then the actual testcase is compiled with additional flags (C)
>>> defined by the test driver using arm_neon_fp16_ok
>>> C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a
>>> -mfpu=neon-fp16 -mfp16-format=ieee
>>>
>>> so the relevant parts of (A) (C) (B) are:
>>> -march=armv7-m  -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16
>>> -mcpu=unset -march=armv8.2-a+bf16
>>> which can be simplified into
>>> -mfpu=neon-fp16 -march=armv8.2-a+bf16
>>>
>>> I think we need to start with -mfpu=auto instead of "", so that when
>>> -march=armv8.2-a+bf16 takes effect, we have cancelled any other -mfpu.
>>>
>>
>> Ideally a test shouldn't add any options in some test runs; that way we add 
>> some 'base configuration' coverage.  We obviously can't do that everywhere 
>> and there may still be cases where the system can't test anything useful at 
>> all (hopefully only when linking, or more is needed).
>>
> 
> Not sure to follow? If a test wants to check that a given feature
> works as expected, it has to use the appropriate options, doesn't it?

Not if the base configuration for the run (from the compiler config or from the 
RUNTEST flags) already supports that feature.

> 
> 
>>>
>>>
>>>>
>>>>> +                    "-mfloat-abi=softfp -mfpu=neon-fp-armv8" \
>>>>> +                    "-mfloat-abi=hard -mfpu=neon-fp-armv8" ] {
>>>>>       if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
>>>>> object {
>>>>>           #include <arm_neon.h>
>>>>>           #if !defined (__ARM_FEATURE_BF16_VECTOR_ARITHMETIC)
>>>>
>>>> In fact, since aarch64 doesn't support -mfpu at all, only "" is ever going 
>>>> to work here, so perhaps we can recast all this code, perhaps along the 
>>>> lines of that in check_effective_target_arm_neon_h_ok_nocache so that we 
>>>> don't need to actually try to include arm_neon.h at all while figuring out 
>>>> the options.
>>>>
>>>
>>> Maybe.
>>>
>>> OTOH, actually including arm_neon.h is guaranteed to give the right
>>> answer to the question "which flags do we need to be able to include
>>> arm_neon.h?", rather than using several conditions that are supposed
>>> to match what arm_neon.h does?
>>>
>>
>> Working out whether arm_neon.h can be included can be done with 
>> arm_neon_h_ok.  That's currently Arm only, but it's trivial to extend it to 
>> aarch64.  Once we have that, we can write some rules that build on that base 
>> to get other features that we might desire.  But we must, at some point 
>> check, rather than assume, that combining options from different rules will 
>> result in something sane.
>>
> 
> I suspect most tests which currently use arm*neon* effective targets
> actually include arm_neon.h, they should arm_neon_h_ok.
> 
> But I'm not sure what we can do to make sure, rather than assume that
> A + C + B does what we want, given that:
> - A is defined by the user who runs the testsuite (via RUNTESTFLAGS /
> target_board)
> - C is defined by the test harness (advsimd-intrinsics.exp)
> - B is defined as needed by individual tests
> 
> At least we have no control on A.

We have no control of A, but any require_... tests will add A to the option 
list they test; so we can override them if needed and check the override works.

C is more complex - we perhaps need some bespoke checks in 
advsimd-intrinsics.exp that can be used to ensure that it's base flags are 
added to any tests we run.  I need to think about that one a bit; perhaps we 
can arrange to add these flags to A before the individual tests run their 
additional checks.

> 
> We can create a new directory, where the .exp test harness would not
> use additional options, and move tests which need
> effective-target-arm-arch* there.
> (that is 189 files out of 541 under advsimd-intrinsics/)
> 

Those tests might benefit from a refactor anyway - the common arm/aarch64 tests 
should probably live in dg.target/aarch, and only the aarch64-specific tests 
should be in dg.target/aarch64.

R.

> Thanks,
> 
> Christophe
> 
> 
>> R.
>>
>>> Thanks,
>>>
>>> Christophe
>>>
>>>> R.
>>

Reply via email to