> On 25 Apr 2025, at 12:06, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> Hi Richard,
>>
>>> On 23 Apr 2025, at 13:47, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>>
>>> Thanks for all the feedback. I've tried to address it in the version
>>> below. I'll push later today if there are no further comments.
>>>
>>> Richard
>>>
>>>
>>> The list is structured as:
>>>
>>> - new configurations
>>> - command-line changes
>>> - ACLE changes
>>> - everything else
>>>
>>> As usual, the list of new architectures, CPUs, and features is from a
>>> purely mechanical trawl of the associated .def files. I've identified
>>> features by their architectural name to try to improve searchability.
>>> Similarly, the list of ACLE changes includes the associated ACLE
>>> feature macros, again to try to improve searchability.
>>>
>>> The list summarises some of the target-specific optimisations because
>>> it sounded like Tamar had received feedback that people found such
>>> information interesting.
>>>
>>> I've used the passive tense for most entries, to try to follow the
>>> style used elsewhere.
>>>
>>> We don't yet define __ARM_FEATURE_FAMINMAX, but I'll fix that
>>> separately.
>>
>> Thanks again for doing this…
>>
>>>
>>> +  </li>
>>> +  <li>Support has been added for the following features of the Arm C
>>> +    Language Extensions
>>> +    (<a href="https://github.com/ARM-software/acle">ACLE</a>):
>>> +    <ul>
>>> +      <li>guarded control stacks</li>
>>> +      <li>lookup table instructions with 2-bit and 4-bit indices
>>> +        (predefined macro
>>> +        <code>__ARM_FEATURE_LUT</code>, enabled by <code>+lut</code>)
>>> +      </li>
>>> +      <li>floating-point absolute minimum and maximum instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FAMINMAX</code>,
>>> +        enabled by <code>+faminmax</code>)
>>> +      </li>
>>> +      <li>FP8 conversions (predefined macro
>>> +        <code>__ARM_FEATURE_FP8</code>, enabled by <code>+fp8</code>)
>>> +      </li>
>>> +      <li>FP8 2-way dot product to half precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT2</code>,
>>> +        enabled by <code>+fp8dot2</code>)
>>> +      </li>
>>> +      <li>FP8 4-way dot product to single precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_FP8DOT4</code>,
>>> +        enabled by <code>+fp8dot4</code>)
>>> +      </li>
>>> +      <li>FP8 multiply-accumulate to half precision and single precision
>>> +        instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>>> +        enabled by <code>+fp8fma</code>)
>>> +      </li>
>>> +      <li>SVE FP8 2-way dot product to half precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
>>> +        enabled by <code>+ssve-fp8dot2</code>)
>>> +      </li>
>>> +      <li>SVE FP8 4-way dot product to single precision instructions
>>> +        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
>>> +        enabled by <code>+ssve-fp8dot4</code>)
>>> +      </li>
>>> +      <li>SVE FP8 multiply-accumulate to half precision and single precision
>>> +        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
>>> +        enabled by <code>+ssve-fp8fma</code>)
>>
>>
>> … Should these FP8 entries say “SSVE FP8” rather than “SVE FP8”?
>
> The official description is "SVE(2) ... instructions in Streaming
> SVE mode". But yeah, I suppose dropping the "in Streaming SVE mode"
> was a mistake. I've pushed the following incremental patch.

Thanks, that looks clearer.
Kyrill

>
> Thanks,
> Richard
>
>
> diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
> index a71249ff..3cec4ff4 100644
> --- a/htdocs/gcc-15/changes.html
> +++ b/htdocs/gcc-15/changes.html
> @@ -847,17 +847,20 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;"
>          instructions (predefined macro <code>__ARM_FEATURE_FP8FMA</code>,
>          enabled by <code>+fp8fma</code>)
>        </li>
> -      <li>SVE FP8 2-way dot product to half precision instructions
> -        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT2</code>,
> -        enabled by <code>+ssve-fp8dot2</code>)
> +      <li>SVE FP8 2-way dot product to half precision instructions in
> +        Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8DOT2</code>, enabled by
> +        <code>+ssve-fp8dot2</code>)
>        </li>
> -      <li>SVE FP8 4-way dot product to single precision instructions
> -        (predefined macro <code>__ARM_FEATURE_SSVE_FP8DOT4</code>,
> -        enabled by <code>+ssve-fp8dot4</code>)
> +      <li>SVE FP8 4-way dot product to single precision instructions in
> +        Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8DOT4</code>, enabled by
> +        <code>+ssve-fp8dot4</code>)
>        </li>
>        <li>SVE FP8 multiply-accumulate to half precision and single precision
> -        instructions (predefined macro <code>__ARM_FEATURE_SSVE_FP8FMA</code>,
> -        enabled by <code>+ssve-fp8fma</code>)
> +        instructions in Streaming SVE mode (predefined macro
> +        <code>__ARM_FEATURE_SSVE_FP8FMA</code>, enabled by
> +        <code>+ssve-fp8fma</code>)
>        </li>
>        <li>SVE2.1 instructions (predefined macro
>          <code>__ARM_FEATURE_SVE2p1</code>, enabled by <code>+sve2p1</code>)
> --
> 2.43.0
>
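
For anyone searching on the macro names: a rough sketch of how user code might test for the three Streaming SVE FP8 macros listed in the patch. The macro names come from the patch above; the helper functions (`have_ssve_fp8dot2` and friends, `report_ssve_fp8_features`) are hypothetical names made up for illustration. The sketch assumes only that the compiler predefines each macro when the matching `+ssve-fp8dot2`, `+ssve-fp8dot4` or `+ssve-fp8fma` extension is enabled, and it builds on any target because undefined macros simply report 0.

```c
#include <stdio.h>

/* Hypothetical helpers: each returns 1 when the compiler predefines the
   corresponding ACLE feature macro for the current target options, and
   0 otherwise.  */
static int have_ssve_fp8dot2(void)
{
#ifdef __ARM_FEATURE_SSVE_FP8DOT2
  return 1;
#else
  return 0;
#endif
}

static int have_ssve_fp8dot4(void)
{
#ifdef __ARM_FEATURE_SSVE_FP8DOT4
  return 1;
#else
  return 0;
#endif
}

static int have_ssve_fp8fma(void)
{
#ifdef __ARM_FEATURE_SSVE_FP8FMA
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: prints one line per macro, showing whether it
   was predefined for this compilation.  */
static void report_ssve_fp8_features(void)
{
  printf("__ARM_FEATURE_SSVE_FP8DOT2: %d\n", have_ssve_fp8dot2());
  printf("__ARM_FEATURE_SSVE_FP8DOT4: %d\n", have_ssve_fp8dot4());
  printf("__ARM_FEATURE_SSVE_FP8FMA:  %d\n", have_ssve_fp8fma());
}
```

Calling `report_ssve_fp8_features()` from `main` shows which of the three macros the current command line enabled. The checks are compile-time only; they say nothing about whether the running CPU implements the instructions.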