Re: [PATCH v2] Document AArch64 changes for GCC 15

Kyrylo Tkachov Fri, 25 Apr 2025 04:28:22 -0700


> On 25 Apr 2025, at 12:06, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> Hi Richard,
>> 
>>> On 23 Apr 2025, at 13:47, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>> 
>>> Thanks for all the feedback.  I've tried to address it in the version
>>> below.  I'll push later today if there are no further comments.
>>> 
>>> Richard
>>> 
>>> 
>>> The list is structured as:
>>> 
>>> - new configurations
>>> - command-line changes
>>> - ACLE changes
>>> - everything else
>>> 
>>> As usual, the list of new architectures, CPUs, and features is from a
>>> purely mechanical trawl of the associated .def files.  I've identified
>>> features by their architectural name to try to improve searchability.
>>> Similarly, the list of ACLE changes includes the associated ACLE
>>> feature macros, again to try to improve searchability.
>>> 
>>> The list summarises some of the target-specific optimisations because
>>> it sounded like Tamar had received feedback that people found such
>>> information interesting.
>>> 
>>> I've used the passive voice for most entries, to try to follow the
>>> style used elsewhere.
>>> 
>>> We don't yet define __ARM_FEATURE_FAMINMAX, but I'll fix that
>>> separately.
>> 
>> Thanks again for doing this…
>> 
>>> 
>>> +  </li>
>>> +  <li>Support has been added for the following features of the Arm C
>>> +    Language Extensions
>>> +    (<a href="https://github.com/ARM-software/acle">ACLE</a>):
>>> +    <ul>
>>> +      <li>guarded control stacks</li>
>>> +      <li>lookup table instructions with 2-bit and 4-bit indices
>>> +        (predefined macro
>>> +        <code>__ARM_FEATURE_LUT</code>, enabled by <code>+lut</code>)
>>> +      </li>
>>> +      <li>floating-point absolute minimum and maximum instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FAMINMAX</code>,
>>> +        enabled by <code>+faminmax</code>)
>>> +      </li>
>>> +      <li>FP8 conversions (predefined macro
>>> +        <code>__ARM_FEATURE_FP8</code>, enabled by <code>+fp8</code>)
>>> +      </li>
>>> +      <li>FP8 2-way dot product to half precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT2</code>,
>>> +        enabled by <code>+fp8dot2</code>)
>>> +      </li>
>>> +      <li>FP8 4-way dot product to single precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT4</code>,
>>> +        enabled by <code>+fp8dot4</code>)
>>> +      </li>
>>> +      <li>FP8 multiply-accumulate to half precision and single precision
>>> +        instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>>> +        enabled by <code>+fp8fma</code>)
>>> +      </li>
>>> +      <li>SVE FP8 2-way dot product to half precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
>>> +        enabled by <code>+ssve-fp8dot2</code>)
>>> +      </li>
>>> +      <li>SVE FP8 4-way dot product to single precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
>>> +        enabled by <code>+ssve-fp8dot4</code>)
>>> +      </li>
>>> +      <li>SVE FP8 multiply-accumulate to half precision and single precision
>>> +        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
>>> +        enabled by <code>+ssve-fp8fma</code>)
>> 
>> 
>> … Should these FP8 entries say “SSVE FP8” rather than “SVE FP8”?
> 
> The official description is "SVE(2) ... instructions in Streaming
> SVE mode".  But yeah, I suppose dropping the "in Streaming SVE mode"
> was a mistake.  I've pushed the following incremental patch.


Thanks, that looks clearer.
Kyrill


> 
> Thanks,
> Richard
> 
> 
> diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
> index a71249ff..3cec4ff4 100644
> --- a/htdocs/gcc-15/changes.html
> +++ b/htdocs/gcc-15/changes.html
> @@ -847,17 +847,20 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;"
>         instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>         enabled by <code>+fp8fma</code>)
>       </li>
> -      <li>SVE FP8 2-way dot product to half precision instructions
> -        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
> -        enabled by <code>+ssve-fp8dot2</code>)
> +      <li>SVE FP8 2-way dot product to half precision instructions in
> +        Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8DOT2</code>, enabled by
> +        <code>+ssve-fp8dot2</code>)
>       </li>
> -      <li>SVE FP8 4-way dot product to single precision instructions
> -        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
> -        enabled by <code>+ssve-fp8dot4</code>)
> +      <li>SVE FP8 4-way dot product to single precision instructions in
> +        Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8DOT4</code>, enabled by
> +        <code>+ssve-fp8dot4</code>)
>       </li>
>       <li>SVE FP8 multiply-accumulate to half precision and single precision
> -        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
> -        enabled by <code>+ssve-fp8fma</code>)
> +        instructions in Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8FMA</code>, enabled by
> +        <code>+ssve-fp8fma</code>)
>       </li>
>       <li>SVE2.1 instructions (predefined macro
>         <code>__ARM_FEATURE_SVE2p1</code>, enabled by <code>+sve2p1</code>)
> -- 
> 2.43.0
>
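
For illustration only, here is a minimal sketch of how user code might test the Streaming SVE FP8 feature macros named in the entries above. The helper name ssve_fp8_feature_mask and the bit values are hypothetical, made up for this example; they are not part of ACLE or GCC. Only the three __ARM_FEATURE_SSVE_* macro names come from the patch.

```c
/* Hypothetical bit assignments, chosen only for this example.  */
enum
{
  SSVE_FP8DOT2_BIT = 1,
  SSVE_FP8DOT4_BIT = 2,
  SSVE_FP8FMA_BIT = 4
};

/* Hypothetical helper: return a bitmask recording which of the
   Streaming SVE FP8 feature macros were predefined by the compiler
   for this translation unit.  */
static unsigned int
ssve_fp8_feature_mask (void)
{
  unsigned int mask = 0;

#ifdef __ARM_FEATURE_SSVE_FP8DOT2
  mask |= SSVE_FP8DOT2_BIT;	/* 2-way dot product to half precision.  */
#endif
#ifdef __ARM_FEATURE_SSVE_FP8DOT4
  mask |= SSVE_FP8DOT4_BIT;	/* 4-way dot product to single precision.  */
#endif
#ifdef __ARM_FEATURE_SSVE_FP8FMA
  mask |= SSVE_FP8FMA_BIT;	/* Multiply-accumulate to half/single.  */
#endif

  return mask;
}
```

If none of the three macros is predefined (for example when compiling for a non-AArch64 target), the function returns 0, so the same source builds everywhere and the check costs nothing at run time.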
