It seems Clang doesn't support -fexcess-precision=xxx:
https://github.com/llvm/llvm-project/blob/main/clang/test/Driver/clang_f_opts.c#L403

Thanks
Pengfei

-----Original Message-----
From: Hongtao Liu <crazy...@gmail.com> 
Sent: Thursday, July 15, 2021 2:35 PM
To: Wang, Pengfei <pengfei.w...@intel.com>
Cc: Craig Topper <craig.top...@gmail.com>; Jakub Jelinek <ja...@redhat.com>; 
Liu, Hongtao <hongtao....@intel.com>; gcc-patches@gcc.gnu.org; Joseph Myers 
<jos...@codesourcery.com>
Subject: Re: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16

On Thu, Jul 15, 2021 at 10:07 AM Wang, Pengfei <pengfei.w...@intel.com> wrote:
>
> Clang for AArch64 promotes each individual operation and rounds immediately 
> afterwards. https://godbolt.org/z/qzGfv6nvo note the fcvts between the two 
> fadd operations. It's implemented in the LLVM backend where we can't see what 
> was originally a single expression.
>
>
>
> Yes, but this is not consistent with the Clang documentation. I think we
> should ask the Clang FE to do the promotion and truncation.
>
>
>
> Thanks
>
> Pengfei
>
>
>
> From: llvm-dev <llvm-dev-boun...@lists.llvm.org> On Behalf Of Craig 
> Topper via llvm-dev
> Sent: Wednesday, July 14, 2021 11:32 PM
> To: Hongtao Liu <crazy...@gmail.com>
> Cc: Jakub Jelinek <ja...@redhat.com>; llvm-dev 
> <llvm-...@lists.llvm.org>; Liu, Hongtao <hongtao....@intel.com>; 
> gcc-patches@gcc.gnu.org; Joseph Myers <jos...@codesourcery.com>
> Subject: Re: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16
>
>
>
> On Wed, Jul 14, 2021 at 12:45 AM Hongtao Liu via llvm-dev 
> <llvm-...@lists.llvm.org> wrote:
>
> > >
> > Setting excess_precision_type to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16, so
> > that rounding happens after each operation, could keep the semantics right.
> > And I'll document the behavior difference between soft-fp and
> > AVX512FP16 instructions for exceptions.
> I got some feedback from my colleague who's working on supporting
> _Float16 for LLVM.
> The LLVM side wants to set FLT_EVAL_METHOD_PROMOTE_TO_FLOAT for
> soft-fp so that the generated code can be more efficient.
> For example,
> _Float16 a, b, c, d;
> d = a + b + c;
>
> would be transformed to
> float tmp, tmp1, a1, b1, c1;
> a1 = (float) a;
> b1 = (float) b;
> c1 = (float) c;
> tmp = a1 + b1;
> tmp1 = tmp + c1;
> d = (_Float16) tmp1;
>
> so there's only one truncation at the end.
>
> If users want to round back after every operation, the code should be
> explicitly written as
> _Float16 a, b, c, d, e;
> e = a + b;
> d = e + c;
>
> That's what Clang does; quoting from [1]:
>  _Float16 arithmetic will be performed using native half-precision 
> support when available on the target (e.g. on ARMv8.2a); otherwise it 
> will be performed at a higher precision (currently always float) and 
> then truncated down to _Float16. Note that C and C++ allow 
> intermediate floating-point operands of an expression to be computed 
> with greater precision than is expressible in their type, so Clang may 
> avoid intermediate truncations in certain cases; this may lead to 
> results that are inconsistent with native arithmetic.
>
>
>
> Clang for AArch64 promotes each individual operation and rounds immediately 
> afterwards. https://godbolt.org/z/qzGfv6nvo note the fcvts between the two 
> fadd operations. It's implemented in the LLVM backend where we can't see what 
> was originally a single expression.
>
>
I was reading the documentation for the excess-precision option at
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

-fexcess-precision=style

This option allows further control over excess precision on machines where 
floating-point operations occur in a format with more precision or range than 
the IEEE standard and interchange floating-point types.
By default, -fexcess-precision=fast is in effect; this means that operations 
may be carried out in a wider precision than the types specified in the source 
if that would result in faster code, and it is unpredictable when rounding to 
the types specified in the source code takes place. When compiling C, if 
-fexcess-precision=standard is specified then excess precision follows the 
rules specified in ISO C99; in particular, both casts and assignments cause 
values to be rounded to their semantic types (whereas -ffloat-store only 
affects assignments). This option is enabled by default for C if a strict 
conformance option such as -std=c99 is used. -ffast-math enables 
-fexcess-precision=fast by default regardless of whether a strict conformance 
option is used.
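
As a rough illustration of what the option influences: the user-visible side of the evaluation method is the C99 FLT_EVAL_METHOD macro from <float.h>. The sketch below only decodes the values defined by C99; the helper name describe_flt_eval_method is hypothetical, and how the GCC-internal FLT_EVAL_METHOD_PROMOTE_TO_* enumerators map onto the macro for _Float16 is not shown here.

```c
#include <float.h>

/* Hypothetical helper: describe a C99 FLT_EVAL_METHOD value.
   Only -1, 0, 1 and 2 have meanings fixed by C99; every other value
   is implementation-defined. */
const char *describe_flt_eval_method(int method)
{
    switch (method) {
    case -1: return "indeterminable";
    case 0:  return "evaluate to the range and precision of the type";
    case 1:  return "evaluate float and double as double";
    case 2:  return "evaluate all operations as long double";
    default: return "implementation-defined";
    }
}

/* Value in effect for this translation unit; it can change with
   options such as -fexcess-precision= or -std=. */
int current_flt_eval_method(void)
{
    return FLT_EVAL_METHOD;
}
```

Calling describe_flt_eval_method(current_flt_eval_method()) under different option combinations is one way to check which mode is actually in effect.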

For -fexcess-precision=fast, we should set flt_eval_method to
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT for soft-fp, and to
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 for AVX512FP16.

For -fexcess-precision=standard, should we set
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_SSE2, so that soft-fp rounds
back after every operation?
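
To make the difference between the two strategies concrete (one truncation at the end versus rounding back after every operation), here is a minimal sketch. As an assumption for portability, float stands in for _Float16 and double stands in for float, so it builds on compilers without _Float16 support; the function names are hypothetical.

```c
/* Stand-in types (an assumption of this sketch): float plays the role
   of _Float16 and double plays the role of float. */

/* Promote every operand once, add in the wider type, and truncate to
   the narrow type only at the end (the PROMOTE_TO_FLOAT-style scheme). */
float sum_round_once(float a, float b, float c)
{
    double wide = (double)a + (double)b + (double)c;
    volatile float narrow = (float)wide;  /* single final truncation */
    return narrow;
}

/* Round back to the narrow type after every operation (the
   PROMOTE_TO_FLOAT16-style scheme). The volatile stores force each
   intermediate to be rounded to float even if the compiler would
   otherwise keep excess precision. */
float sum_round_each(float a, float b, float c)
{
    volatile float e = a + b;
    volatile float d = e + c;
    return d;
}
```

With a = 16777216.0f (2^24) and b = c = 1.0f, sum_round_once returns 16777218.0f, while sum_round_each returns 16777216.0f because each intermediate sum is rounded back to float. The same kind of discrepancy is what the two _Float16 schemes can produce.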
>
>
> And so does GCC for Arm; quoting from arm.c:
>
> /* We can calculate either in 16-bit range and precision or
>    32-bit range and precision.  Make that decision based on whether
>    we have native support for the ARMv8.2-A 16-bit floating-point
>    instructions or not.  */
> return (TARGET_VFP_FP16INST
>         ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
>         : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
>
>
> [1]https://clang.llvm.org/docs/LanguageExtensions.html
> > > --
> > > Joseph S. Myers
> > > jos...@codesourcery.com
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao
> _______________________________________________
> LLVM Developers mailing list
> llvm-...@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



--
BR,
Hongtao
