On Fri, Jul 30, 2021 at 5:30 AM Joseph Myers <jos...@codesourcery.com> wrote:
>
> On Thu, 29 Jul 2021, Hongtao Liu via Gcc-patches wrote:
>
> > > Rather than using FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 whenever TARGET_SSE2
> > > (i.e. whenever the type is available), it might make more sense to follow
> > > AArch64 and use it only when the hardware instructions are available.  In
> > > any case, it seems peculiar to use a different threshold in the "fast"
> > We want to provide some debuggability to the software emulation.
> > When there's inconsistency between software emulation and hardware
> > instructions, users can still debug on non-avx512fp16 processor w/
> > software emulation and extra option -fexcess-precision=standard,
>
> But that's not the purpose of -fexcess-precision=standard.  The purpose is
> only: when the default case is non-conforming, make it conforming instead.
> The default case is non-conforming only when the back end has insn
> patterns pretending to be able to do arithmetic on formats it can't
> actually do arithmetic on - that is, x87 arithmetic where the insn
> patterns pretend to support SFmode and DFmode arithmetic but actually use
> XFmode (and the similar issue for older m68k, but that back end doesn't
> actually have the required support for -fexcess-precision=standard).
>
> So -fexcess-precision=standard should not do anything different from
> -fexcess-precision=fast regarding _Float16.
>
It makes perfect sense.

> If you want to be able to enable or disable excess precision for _Float16
> separately from the underlying hardware support, that might provide a case
> for supporting extra options, say -fexcess-precision=16 that means follow
> the semantics of FLT_EVAL_METHOD == 16 (and with an error for that option
> on architectures where the given FLT_EVAL_METHOD value isn't supported).
> But that shouldn't be done by making -fexcess-precision=standard do
> something outside its scope.
>
> > Also since TARGET_C_EXCESS_PRECISION is not related to type, for
> > testcase w/o _Float16 and is supposed to be runned on x86 fpu, if gcc
> > is built w/ --with-arch=sapphirerapid, it will regress those
> > testcases. .i.e. gcc.target/i386/excess-precision-*.c, that's why we
> > can't follow AArch64.
>
> Those tests use -mfpmath=387.
>
> In the -mfpmath=387 case, it seems reasonable to keep the rule of
> promoting to long double, regardless of hardware _Float16 support (-msse2
> must also be in effect for the type to be supported at all by the back
> end).  It's the -mfpmath=sse case for which I think following AArch64 is
> appropriate.
So does this.

>
> --
> Joseph S. Myers
> jos...@codesourcery.com
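
To check that I read the two cases above the same way, here is a rough
standalone sketch of the selection, not the actual back-end code.  The
enum, the struct and the function name are hypothetical stand-ins for
the FLT_EVAL_METHOD_* values and the TARGET_AVX512FP16, TARGET_SSE_MATH
and TARGET_80387 tests.  The 387 branch only encodes the "keep promoting
to long double" rule quoted above, and anything the thread doesn't
spell out is returned as NOT_MODELLED rather than guessed.

```c
#include <assert.h>   /* only needed for checks like those in the note */
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's FLT_EVAL_METHOD_* values.  */
enum eval_method
{
  PROMOTE_TO_FLOAT,
  PROMOTE_TO_FLOAT16,
  PROMOTE_TO_LONG_DOUBLE,
  NOT_MODELLED            /* combinations this sketch does not cover */
};

/* Hypothetical flags standing in for the target tests.  */
struct target_flags
{
  bool avx512fp16;        /* hardware _Float16 insns available */
  bool sse_math;          /* -mfpmath=sse in effect */
  bool i387;              /* x87 available */
};

/* FAST is true for the "fast" excess-precision type, false for the
   standard/implicit types.  */
static enum eval_method
sketch_excess_precision (struct target_flags t, bool fast)
{
  if (fast)
    /* The fastest type to promote to is the native one.  */
    return t.avx512fp16 ? PROMOTE_TO_FLOAT16 : PROMOTE_TO_FLOAT;

  if (t.avx512fp16 && t.sse_math)
    /* -mfpmath=sse with hardware support: follow AArch64.  */
    return PROMOTE_TO_FLOAT16;

  if (!t.i387)
    return PROMOTE_TO_FLOAT;

  if (!t.sse_math)
    /* -mfpmath=387: keep promoting to long double, regardless of
       hardware _Float16 support.  */
    return PROMOTE_TO_LONG_DOUBLE;

  return NOT_MODELLED;
}
```

With this model, a target with the fp16 insns and -mfpmath=sse gives
PROMOTE_TO_FLOAT16 for both fast and standard, while the same target
with -mfpmath=387 still gives PROMOTE_TO_LONG_DOUBLE for standard, so
the existing excess-precision tests should see no change.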

I'll add an extra option, -fexcess-precision=16, to select
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when the back end supports _Float16,
and refine ix86_get_excess_precision as follows:

@@ -23327,14 +23382,18 @@ ix86_get_excess_precision (enum excess_precision_type type)
        /* The fastest type to promote to will always be the native type,
           whether that occurs with implicit excess precision or
           otherwise.  */
-       return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
+       return TARGET_AVX512FP16
+              ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
+              : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
       case EXCESS_PRECISION_TYPE_STANDARD:
       case EXCESS_PRECISION_TYPE_IMPLICIT:
        /* Otherwise, the excess precision we want when we are
           in a standards compliant mode, and the implicit precision we
           provide would be identical were it not for the unpredictable
           cases.  */
-       if (!TARGET_80387)
+       if (TARGET_AVX512FP16 && TARGET_SSE_MATH)
+         return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
+       else if (!TARGET_80387)
          return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
        else if (!TARGET_MIX_SSE_I387)
          {

Will update in my next version.

--
BR,
Hongtao