On Wed, 21 Jul 2021, liuhongt via Gcc-patches wrote:

> @@ -23254,13 +23337,15 @@ ix86_get_excess_precision (enum excess_precision_type type)
>          provide would be identical were it not for the unpredictable
>          cases.  */
>        if (!TARGET_80387)
> -        return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> +        return TARGET_SSE2
> +               ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> +               : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
>        else if (!TARGET_MIX_SSE_I387)
>          {
>            if (!(TARGET_SSE && TARGET_SSE_MATH))
>              return FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE;
>            else if (TARGET_SSE2)
> -            return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
> +            return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
>          }
>
>        /* If we are in standards compliant mode, but we know we will
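To make the effect of this hunk concrete, here is a rough standalone model of the patched branch. It is only a sketch, not GCC code: the struct fields and the enum are hypothetical stand-ins for the TARGET_* macros and FLT_EVAL_METHOD_* values, and OTHER_NOT_SHOWN stands for whatever the rest of the function (not quoted above) returns. The has_avx512fp16 field is there only to show that the result never depends on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the FLT_EVAL_METHOD_* values.  */
enum eval_method
{
  PROMOTE_TO_FLOAT16,
  PROMOTE_TO_FLOAT,
  PROMOTE_TO_LONG_DOUBLE,
  OTHER_NOT_SHOWN   /* Cases handled after the quoted hunk.  */
};

/* Hypothetical stand-ins for the TARGET_* macros.  */
struct target
{
  bool has_80387;       /* TARGET_80387 */
  bool has_sse;         /* TARGET_SSE */
  bool has_sse2;        /* TARGET_SSE2 */
  bool sse_math;        /* TARGET_SSE_MATH */
  bool mix_sse_i387;    /* TARGET_MIX_SSE_I387 */
  bool has_avx512fp16;  /* Hardware _Float16 arithmetic; never read below.  */
};

/* Model of the branch as it reads with the patch applied.  */
static enum eval_method
patched_hunk (const struct target *t)
{
  assert (t != NULL);

  if (!t->has_80387)
    return t->has_sse2 ? PROMOTE_TO_FLOAT16 : PROMOTE_TO_FLOAT;
  else if (!t->mix_sse_i387)
    {
      if (!(t->has_sse && t->sse_math))
        return PROMOTE_TO_LONG_DOUBLE;
      else if (t->has_sse2)
        return PROMOTE_TO_FLOAT16;
    }

  /* Everything else falls through to code outside the quoted hunk.  */
  return OTHER_NOT_SHOWN;
}
```

In this model an SSE2 target without x87 gets PROMOTE_TO_FLOAT16 whether or not has_avx512fp16 is set; that is the threshold choice the comments below are about.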
This patch is not changing the default "fast" mode at all; that's
promoting to float, unconditionally.  But you have a subsequent change
there in patch 4 to make the promotions in the default "fast" mode depend
on hardware support for the new instructions; it's unhelpful for the
documentation not to correspond exactly to the code changes in the same
patch.

Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
(i.e. whenever the type is available), it might make more sense to follow
AArch64 and use it only when the hardware instructions are available.  In
any case, it seems peculiar to use a different threshold in the "fast"
case from the "standard" case.  -fexcess-precision=standard is not "avoid
excess precision", it's "implement excess precision in the front end".
Whenever "fast" is implementing excess precision in the front end,
"standard" should be doing the same thing as "fast".

> +Soft-fp keeps the intermediate result of the operation at 32-bit precision by defaults,
> +which may lead to inconsistent behavior between soft-fp and avx512fp16 instructions,
> +using @option{-fexcess-precision=standard} will force round back after every operation.

"soft-fp" is, as the name of some code within GCC, an internal
implementation detail, which should not be referenced in the user manual.
What results in intermediate results being in a wider precision is not
soft-fp; it's promotions inserted by the front end as a result of how the
above hook is defined (promotions inserted by the optabs/expand code are
an implementation detail that should always be followed automatically by a
truncation of the result and so not be user-visible).

As far as I know, the official name of "avx512fp16" is "AVX512-FP16", and
text in the manual should use the official capitalization, hyphenation
etc. in such names unless literally referring to command-line options
inside @option or similar.

-- 
Joseph S. Myers
jos...@codesourcery.com