Kyrylo Tkachov <ktkac...@nvidia.com> writes:
> Hi Richard,
>
>> On 23 Apr 2025, at 13:47, Richard Sandiford <richard.sandif...@arm.com>
>> wrote:
>>
>> Thanks for all the feedback. I've tried to address it in the version
>> below. I'll push later today if there are no further comments.
>>
>> Richard
>>
>>
>> The list is structured as:
>>
>> - new configurations
>> - command-line changes
>> - ACLE changes
>> - everything else
>>
>> As usual, the list of new architectures, CPUs, and features is from a
>> purely mechanical trawl of the associated .def files. I've identified
>> features by their architectural name to try to improve searchability.
>> Similarly, the list of ACLE changes includes the associated ACLE
>> feature macros, again to try to improve searchability.
>>
>> The list summarises some of the target-specific optimisations because
>> it sounded like Tamar had received feedback that people found such
>> information interesting.
>>
>> I've used the passive tense for most entries, to try to follow the
>> style used elsewhere.
>>
>> We don't yet define __ARM_FEATURE_FAMINMAX, but I'll fix that
>> separately.
>
> Thanks again for doing this…
>
>>
>> +  </li>
>> +  <li>Support has been added for the following features of the Arm C
>> +    Language Extensions
>> +    (<a href="https://github.com/ARM-software/acle">ACLE</a>):
>> +    <ul>
>> +      <li>guarded control stacks</li>
>> +      <li>lookup table instructions with 2-bit and 4-bit indices
>> +        (predefined macro
>> +        <code>__ARM_FEATURE_LUT</code>, enabled by <code>+lut</code>)
>> +      </li>
>> +      <li>floating-point absolute minimum and maximum instructions
>> +        (predefined macro <code>__ARM_FEATURE_FAMINMAX</code>,
>> +        enabled by <code>+faminmax</code>)
>> +      </li>
>> +      <li>FP8 conversions (predefined macro
>> +        <code>__ARM_FEATURE_FP8</code>, enabled by <code>+fp8</code>)
>> +      </li>
>> +      <li>FP8 2-way dot product to half precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT2</code>,
>> +        enabled by <code>+fp8dot2</code>)
>> +      </li>
>> +      <li>FP8 4-way dot product to single precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT4</code>,
>> +        enabled by <code>+fp8dot4</code>)
>> +      </li>
>> +      <li>FP8 multiply-accumulate to half precision and single precision
>> +        instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>> +        enabled by <code>+fp8fma</code>)
>> +      </li>
>> +      <li>SVE FP8 2-way dot product to half precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
>> +        enabled by <code>+ssve-fp8dot2</code>)
>> +      </li>
>> +      <li>SVE FP8 4-way dot product to single precision instructions
>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
>> +        enabled by <code>+ssve-fp8dot4</code>)
>> +      </li>
>> +      <li>SVE FP8 multiply-accumulate to half precision and single precision
>> +        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
>> +        enabled by <code>+ssve-fp8fma</code>)
>
>
> … Should these FP8 entries say “SSVE FP8” rather than “SVE FP8”?

The official description is "SVE(2) ... instructions in Streaming SVE mode".
But yeah, I suppose dropping the "in Streaming SVE mode" was a mistake.
I've pushed the following incremental patch.

Thanks,
Richard


diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index a71249ff..3cec4ff4 100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs/gcc-15/changes.html
@@ -847,17 +847,20 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;"
         instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
         enabled by <code>+fp8fma</code>)
       </li>
-      <li>SVE FP8 2-way dot product to half precision instructions
-        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
-        enabled by <code>+ssve-fp8dot2</code>)
+      <li>SVE FP8 2-way dot product to half precision instructions in
+        Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8DOT2</code>, enabled by
+        <code>+ssve-fp8dot2</code>)
       </li>
-      <li>SVE FP8 4-way dot product to single precision instructions
-        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
-        enabled by <code>+ssve-fp8dot4</code>)
+      <li>SVE FP8 4-way dot product to single precision instructions in
+        Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8DOT4</code>, enabled by
+        <code>+ssve-fp8dot4</code>)
       </li>
       <li>SVE FP8 multiply-accumulate to half precision and single precision
-        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
-        enabled by <code>+ssve-fp8fma</code>)
+        instructions in Streaming SVE mode (predefined macro
+        <code>__ARM_FEATURE_SSVE_FP8FMA</code>, enabled by
+        <code>+ssve-fp8fma</code>)
       </li>
       <li>SVE2.1 instructions (predefined macro
         <code>__ARM_FEATURE_SVE2p1</code>, enabled by <code>+sve2p1</code>)
-- 
2.43.0
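
As an illustration of how the predefined macros named in the patch might be used, here is a minimal sketch in C that records at compile time which of the three Streaming SVE mode FP8 macros are defined. The macro names and the matching extension names come from the patch; the HAVE_* helper macros, the acle_feature struct, and the two functions are hypothetical names invented for this example and are not part of GCC or ACLE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers: turn "is the ACLE macro defined?" into 0 or 1. */
#ifdef __ARM_FEATURE_SSVE_FP8DOT2
#define HAVE_SSVE_FP8DOT2 1
#else
#define HAVE_SSVE_FP8DOT2 0
#endif

#ifdef __ARM_FEATURE_SSVE_FP8DOT4
#define HAVE_SSVE_FP8DOT4 1
#else
#define HAVE_SSVE_FP8DOT4 0
#endif

#ifdef __ARM_FEATURE_SSVE_FP8FMA
#define HAVE_SSVE_FP8FMA 1
#else
#define HAVE_SSVE_FP8FMA 0
#endif

/* One row per feature: macro name, enabling extension, build-time state. */
struct acle_feature {
  const char *macro;
  const char *option;
  int enabled;
};

static const struct acle_feature ssve_fp8_features[] = {
  { "__ARM_FEATURE_SSVE_FP8DOT2", "+ssve-fp8dot2", HAVE_SSVE_FP8DOT2 },
  { "__ARM_FEATURE_SSVE_FP8DOT4", "+ssve-fp8dot4", HAVE_SSVE_FP8DOT4 },
  { "__ARM_FEATURE_SSVE_FP8FMA",  "+ssve-fp8fma",  HAVE_SSVE_FP8FMA  },
};

#define NUM_FEATURES (sizeof ssve_fp8_features / sizeof ssve_fp8_features[0])

/* Count how many of the listed macros were defined for this build. */
int count_enabled_features(void)
{
  int count = 0;
  for (size_t i = 0; i < NUM_FEATURES; i++)
    count += ssve_fp8_features[i].enabled;
  return count;
}

/* Print one line per feature; returns the number of lines written. */
int print_feature_report(FILE *out)
{
  assert(out != NULL);
  for (size_t i = 0; i < NUM_FEATURES; i++)
    fprintf(out, "%-28s (%s): %s\n",
            ssve_fp8_features[i].macro,
            ssve_fp8_features[i].option,
            ssve_fp8_features[i].enabled ? "defined" : "not defined");
  return (int) NUM_FEATURES;
}
```

Calling print_feature_report (stdout) from a program built without the corresponding extensions should report all three macros as not defined; a build whose -march string includes the extensions listed in the patch would be expected to report them as defined.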