Re: [PATCH v2] Document AArch64 changes for GCC 15

Richard Sandiford Fri, 25 Apr 2025 03:13:03 -0700

Kyrylo Tkachov <ktkac...@nvidia.com> writes:
> Hi Richard,
>
>> On 23 Apr 2025, at 13:47, Richard Sandiford <richard.sandif...@arm.com> wrote:
>> 
>> Thanks for all the feedback.  I've tried to address it in the version
>> below.  I'll push later today if there are no further comments.
>> 
>> Richard
>> 
>> 
>> The list is structured as:
>> 
>> - new configurations
>> - command-line changes
>> - ACLE changes
>> - everything else
>> 
>> As usual, the list of new architectures, CPUs, and features is from a
>> purely mechanical trawl of the associated .def files.  I've identified
>> features by their architectural name to try to improve searchability.
>> Similarly, the list of ACLE changes includes the associated ACLE
>> feature macros, again to try to improve searchability.
>> 
>> The list summarises some of the target-specific optimisations because
>> it sounded like Tamar had received feedback that people found such
>> information interesting.
>> 
>> I've used the passive voice for most entries, to try to follow the
>> style used elsewhere.
>> 
>> We don't yet define __ARM_FEATURE_FAMINMAX, but I'll fix that
>> separately.
>
> Thanks again for doing this…
>
>> 
>> +  </li>
>> +  <li>Support has been added for the following features of the Arm C
>> +    Language Extensions
>> +    (<a href="https://github.com/ARM-software/acle">ACLE</a>):
>> +    <ul>
>> +      <li>guarded control stacks</li>
>> +      <li>lookup table instructions with 2-bit and 4-bit indices
>> +        (predefined macro
>> +        <code>__ARM_FEATURE_LUT</code>, enabled by <code>+lut</code>)
>> +      </li>
>> +      <li>floating-point absolute minimum and maximum instructions
>> +        (predefined macro <code>__ARM_FEATURE_FAMINMAX</code>,
>> +        enabled by <code>+faminmax</code>)
>> +      </li>
>> +      <li>FP8 conversions (predefined macro
>> +        <code>__ARM_FEATURE_FP8</code>, enabled by <code>+fp8</code>)
>> +      </li>
>> +      <li>FP8 2-way dot product to half precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT2</code>,
>> +        enabled by <code>+fp8dot2</code>)
>> +      </li>
>> +      <li>FP8 4-way dot product to single precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT4</code>,
>> +        enabled by <code>+fp8dot4</code>)
>> +      </li>
>> +      <li>FP8 multiply-accumulate to half precision and single precision
>> +        instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>> +        enabled by <code>+fp8fma</code>)
>> +      </li>
>> +      <li>SVE FP8 2-way dot product to half precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
>> +        enabled by <code>+ssve-fp8dot2</code>)
>> +      </li>
>> +      <li>SVE FP8 4-way dot product to single precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
>> +        enabled by <code>+ssve-fp8dot4</code>)
>> +      </li>
>> +      <li>SVE FP8 multiply-accumulate to half precision and single precision
>> +        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
>> +        enabled by <code>+ssve-fp8fma</code>)
>
>
> … Should these FP8 entries say “SSVE FP8” rather than “SVE FP8”?


The official description is "SVE(2) ... instructions in Streaming
SVE mode".  But yeah, I suppose dropping the "in Streaming SVE mode"
was a mistake.  I've pushed the following incremental patch.

Thanks,
Richard


diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index a71249ff..3cec4ff4 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -847,17 +847,20 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;"
         instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
         enabled by <code>+fp8fma</code>)
       </li>
-      <li>SVE FP8 2-way dot product to half precision instructions
-        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
-        enabled by <code>+ssve-fp8dot2</code>)
+      <li>SVE FP8 2-way dot product to half precision instructions in
+        Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8DOT2</code>, enabled by
+        <code>+ssve-fp8dot2</code>)
       </li>
-      <li>SVE FP8 4-way dot product to single precision instructions
-        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
-        enabled by <code>+ssve-fp8dot4</code>)
+      <li>SVE FP8 4-way dot product to single precision instructions in
+        Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8DOT4</code>, enabled by
+        <code>+ssve-fp8dot4</code>)
       </li>
       <li>SVE FP8 multiply-accumulate to half precision and single precision
-        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
-        enabled by <code>+ssve-fp8fma</code>)
+        instructions in Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8FMA</code>, enabled by
+        <code>+ssve-fp8fma</code>)
       </li>
       <li>SVE2.1 instructions (predefined macro
         <code>__ARM_FEATURE_SVE2p1</code>, enabled by <code>+sve2p1</code>)
-- 
2.43.0
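
For readers who want to see how the feature macros in the quoted list
are meant to be consumed, here is a minimal sketch in C.  It is an
illustration only, not part of the patch: the names acle_feature,
acle_features, acle_feature_defined and the ACLE_* helper macros are
hypothetical.  The macro and option names are copied from the list
above.  It assumes that an undefined feature macro stringifies to its
own name, which holds for any macro the compiler does not predefine.
As noted earlier in the thread, __ARM_FEATURE_FAMINMAX is not yet
defined by GCC, so that row would be expected to report 0 even when
+faminmax is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stringify after macro expansion: gives the macro's replacement text
   when it is defined, and the macro's own name when it is not. */
#define ACLE_STR_(x) #x
#define ACLE_STR(x) ACLE_STR_(x)

/* Build one table row.  #macro is the unexpanded name, ACLE_STR(macro)
   is the expanded text, so the two differ only if the macro is defined. */
#define ACLE_ENTRY(macro, option) \
    { #macro, option, strcmp(#macro, ACLE_STR(macro)) != 0 }

#define ACLE_FEATURE_COUNT 9

struct acle_feature {
    const char *macro;   /* predefined macro name */
    const char *option;  /* architecture extension that enables it */
    int defined;         /* 1 if the macro is defined in this build */
};

/* Fill OUT with one row per feature and return how many are defined. */
static size_t acle_features(struct acle_feature out[ACLE_FEATURE_COUNT])
{
    const struct acle_feature table[ACLE_FEATURE_COUNT] = {
        ACLE_ENTRY(__ARM_FEATURE_LUT, "+lut"),
        ACLE_ENTRY(__ARM_FEATURE_FAMINMAX, "+faminmax"),
        ACLE_ENTRY(__ARM_FEATURE_FP8, "+fp8"),
        ACLE_ENTRY(__ARM_FEATURE_FP8DOT2, "+fp8dot2"),
        ACLE_ENTRY(__ARM_FEATURE_FP8DOT4, "+fp8dot4"),
        ACLE_ENTRY(__ARM_FEATURE_FP8FMA, "+fp8fma"),
        ACLE_ENTRY(__ARM_FEATURE_SSVE_FP8DOT2, "+ssve-fp8dot2"),
        ACLE_ENTRY(__ARM_FEATURE_SSVE_FP8DOT4, "+ssve-fp8dot4"),
        ACLE_ENTRY(__ARM_FEATURE_SSVE_FP8FMA, "+ssve-fp8fma"),
    };
    size_t enabled = 0;

    assert(out != NULL);
    for (size_t i = 0; i < ACLE_FEATURE_COUNT; i++) {
        out[i] = table[i];
        enabled += (size_t)table[i].defined;
    }
    return enabled;
}

/* Return 1 or 0 for a listed macro, or -1 if MACRO is not in the table. */
static int acle_feature_defined(const char *macro)
{
    struct acle_feature rows[ACLE_FEATURE_COUNT];

    assert(macro != NULL);
    acle_features(rows);
    for (size_t i = 0; i < ACLE_FEATURE_COUNT; i++)
        if (strcmp(rows[i].macro, macro) == 0)
            return rows[i].defined;
    return -1;
}
```

Calling acle_features from a small main and printing each row gives a
quick view of which extensions a given -march or -mcpu string turned on.
Real code would more likely guard each intrinsic use directly with
#ifdef on the relevant macro.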
