[Bug c/119014] Extending _Float16 constant at compile and run time differs

vincent-gcc at vinc17 dot net via Gcc-bugs Fri, 11 Apr 2025 07:34:59 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119014


--- Comment #24 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
(In reply to Jakub Jelinek from comment #23)
> IMHO -fexcess-precision=16 (at least on x86_64 64-bit and with -mfpmath=sse
> -msse2 32-bit too) are completely conformant modes,

If this is the intent, it should be documented as such (or perhaps it is
-fexcess-precision=fast that is the only non-conformant option).

> it is like 0 (where all
> of float, _Float32, double, _Float64, long double, _Float128 evaluate to the
> precision of their semantic type) but unlike that, _Float16 also has no
> excess precision.  So the right FLT_EVAL_METHOD in C23 with
> __STDC_WANT_IEC_60559_TYPES_EXT__ is IMHO 16.

Note that this is only one of several possibilities allowed by the current
documentation of -fexcess-precision=16. -fexcess-precision=16 is not yet
possible with -mfpmath=387 (e.g. implied by -m32). If it is implemented there
in the future, an alternative could be that float and double are still
evaluated in extended precision as they are now (whether -fexcess-precision is
"fast" or "standard"), in which case FLT_EVAL_METHOD should be -1. There would
then be two possibilities concerning float and double:

1. Excess precision is removed by a cast or an assignment (i.e. like with
-fexcess-precision=standard). This is conformant (assuming FLT_EVAL_METHOD is
-1).

2. Excess precision is not removed by a cast or an assignment (i.e. like with
-fexcess-precision=fast). This is non-conformant.
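
The difference between cases 1 and 2 can be observed from C. The following is
only a minimal sketch: the function names are hypothetical (they come from
neither GCC nor the bug report), and it assumes IEEE binary32 float, so that
1 + 2^-30 is not representable in float and rounds to 1.

```c
#include <float.h>
#include <stdio.h>

/* Nonzero if the cast rounds x + y back to float, i.e. the cast removes
   any excess precision (case 1). Hypothetical helper for illustration. */
int cast_removes_excess_precision(void)
{
    /* volatile keeps the compiler from folding the sum at compile time */
    volatile float x = 1.0f;
    volatile float y = 0x1p-30f;   /* 2^-30, far below half an ulp of 1.0f */
    return (float)(x + y) == 1.0f;
}

/* Nonzero if the bare sum keeps bits beyond float precision, i.e. the
   expression is evaluated in a wider format than its semantic type. */
int sum_has_excess_precision(void)
{
    volatile float x = 1.0f;
    volatile float y = 0x1p-30f;
    return (x + y) != 1.0f;
}

/* Print what the implementation announces next to what is observed. */
void report_excess_precision(void)
{
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    printf("sum has excess precision:      %d\n", sum_has_excess_precision());
    printf("cast removes excess precision: %d\n", cast_removes_excess_precision());
}
```

Calling report_excess_precision() from main in a build with -m32 -mfpmath=387
should show the sum carrying excess precision. With
-fexcess-precision=standard the cast is expected to remove it; with
-fexcess-precision=fast it may not, depending on what the optimizer does. When
float is evaluated in its own format (e.g. x86_64 with SSE), there is no
excess precision to remove in the first place.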

Note that with -fexcess-precision=16, the user might want a strict
implementation of _Float16 while keeping a more flexible / faster arithmetic
for the larger types.
