Re: [PATCH 0/2] Initial support for AVX512FP16

Hongtao Liu via Gcc-patches Tue, 06 Jul 2021 01:48:38 -0700

On Fri, Jul 2, 2021 at 4:46 AM Joseph Myers <jos...@codesourcery.com> wrote:
>
> Some general comments, following what I said on libc-alpha:
>
>
> 1. Can you confirm that the ABI being used for 64-bit, for _Float16 and
> _Complex _Float16 argument passing and return, follows the current x86_64
> ABI document?
>
>
> 2. Can you confirm that if you build with this instruction set extension
> enabled by default, and run GCC tests for a corresponding (emulated?)
> processor, all the existing float16 tests in the testsuite are enabled and
> PASS (both compilation and execution) (both 64-bit and 32-bit testing)?
>
>
> 3. There's an active 32-bit ABI mailing list (ia32-...@googlegroups.com).
> If you want to support _Float16 in the 32-bit case, please work with it to
> get the corresponding ABI documented (using only memory and
> general-purpose registers seems like a good idea, so that the ABI can be
> supported for the base architecture without depending on SSE registers
> being present).  In the absence of 32-bit ABI support it might be better
> to disable the HFmode support for 32-bit.
>
>
> 4. Support for _Float16 really ought not to depend on whether a particular
> instruction set extension is present, just like with other floating-point
> types; it makes sense, as an API, for all x86 processors (and like many
> APIs, it will be faster on some processors than on others).  More specific
> points here are:
>
> (a) Basic arithmetic (+-*/) can be done by converting to SFmode, doing
> arithmetic there and converting back to HFmode; the results of doing so
> will be correctly rounded.  Indeed, I think optabs.c handles that
> automatically when operations are available on a wider mode but not on the
> desired mode (but you'd need to check carefully that all the expected
> conversions do occur).
So would different exception behavior between soft-fp and
avx512fp16 be acceptable?
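
For reference, the correct-rounding claim in (a) can be checked on a small
scale without any _Float16 support in the compiler. The sketch below is
only an illustration, not part of the patch series. It models HFmode
precision with a hypothetical helper, round_to_f16_precision, that keeps 11
significant bits of a double (it ignores the HFmode exponent range, so
overflow and subnormals are out of scope). All names are made up for this
example. It divides every pair of 11-bit significands in float, narrows the
result, and compares it with a quotient rounded once using integer
arithmetic. It says nothing about exception flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a finite, normal double to 11 significant
   bits (binary16 precision) with ties-to-even.  Only the precision of
   HFmode is modelled, not its exponent range. */
static double round_to_f16_precision(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    const int drop = 52 - 10;               /* fraction bits to discard */
    uint64_t lsb = (bits >> drop) & 1;      /* lowest bit that is kept */
    bits += (UINT64_C(1) << (drop - 1)) - 1 + lsb;
    bits &= ~((UINT64_C(1) << drop) - 1);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* ma / mb rounded once to 11 significant bits, using integers only.
   Both arguments are significands in [1024, 2047]. */
static double exact_quotient_f16(uint32_t ma, uint32_t mb)
{
    /* Quotient is in [1, 2) when ma >= mb, otherwise in (0.5, 1). */
    uint32_t scale = (ma >= mb) ? 1024u : 2048u;
    uint32_t num = ma * scale;
    uint32_t n = num / mb;
    uint32_t rem = num % mb;
    if (2 * rem > mb || (2 * rem == mb && (n & 1)))
        n++;                                /* round to nearest, ties to even */
    return (double)n / scale;
}

/* Count pairs where "divide in float, then narrow" differs from the
   singly rounded quotient. */
static unsigned count_division_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t ma = 1024; ma < 2048; ma++)
        for (uint32_t mb = 1024; mb < 2048; mb++) {
            volatile float q = (float)ma / (float)mb;
            if (round_to_f16_precision((double)q)
                != exact_quotient_f16(ma, mb))
                mismatches++;
        }
    return mismatches;
}
```

Calling count_division_mismatches() should report no mismatches, which is
consistent with float having enough extra precision for the intermediate
rounding to be harmless.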
>
> (b) Conversions to/from all other floating-point modes will always be
> needed, whether in hardware or in software.
>
> (c) In the F16C (Ivy Bridge and later) case, where you have hardware
> conversions to/from float (only), it's fine to convert to double (or long
> double) via float.  (On efficiency grounds, widening from HFmode to TFmode
> should be a pure software operation, which should be faster than having an
> intermediate conversion to SFmode when the SFmode-to-TFmode conversion is
> a software operation.)
>
> (d) In the F16C case (where there are hardware conversions only from
> SFmode, not from wider modes), conversion *from* DFmode (or XFmode or
> TFmode) to HFmode should be a software operation, to avoid double
> rounding; an intermediate conversion to SFmode would be incorrect.
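
A concrete case of the double rounding described in (d), as a minimal
sketch. It assumes the same kind of hypothetical helper as a stand-in for a
real DFmode-to-HFmode conversion: round_to_f16_precision keeps 11
significant bits of a double and ignores the HFmode exponent range. The
sample value 1 + 2^-11 + 2^-30 was picked by hand for this illustration; it
lies just above the midpoint between the adjacent HFmode values 1 and
1 + 2^-10.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a finite, normal double to 11 significant
   bits (binary16 precision) with ties-to-even.  Only the precision of
   HFmode is modelled, not its exponent range. */
static double round_to_f16_precision(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    const int drop = 52 - 10;               /* fraction bits to discard */
    uint64_t lsb = (bits >> drop) & 1;      /* lowest bit that is kept */
    bits += (UINT64_C(1) << (drop - 1)) - 1 + lsb;
    bits &= ~((UINT64_C(1) << drop) - 1);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* 1 + 2^-11 + 2^-30, exactly representable in double. */
static double sample_value(void)
{
    return 1.0 + 0x1p-11 + 0x1p-30;
}

/* Single rounding: double straight to binary16 precision. */
static double narrow_directly(double x)
{
    return round_to_f16_precision(x);
}

/* Two roundings: double to float first, then to binary16 precision.
   The first step drops the 2^-30 term and lands exactly on the midpoint,
   so the second step resolves a tie that was never there. */
static double narrow_via_float(double x)
{
    volatile float f = (float)x;
    return round_to_f16_precision((double)f);
}
```

For this input narrow_directly gives 1 + 2^-10 while narrow_via_float gives
1, so the path through SFmode is off by one HFmode ulp.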
>
> (e) It's OK for conversions to/from integer modes to go via SFmode
> (although I don't know if that's efficient or not).  Any case where a
> conversion from integer to SFmode is inexact would overflow HFmode, so
> there are no double rounding issues.
>
> (f) In the F16C case, it seems the hardware instructions only work on
> vectors, not scalars, so care would need to be taken to use them for
> scalar conversions only if the other elements of the vector register are
> known to be safe to convert without raising any exceptions (e.g. all zero
> bits, or -fno-trapping-math in effect).
>
> (g) If concerned about efficiency of intermediate truncations on
> processors without hardware _Float16 arithmetic, look at
> aarch64_excess_precision; you have the option of using excess precision
> for _Float16 by default, though that only really helps for C given the
> lack of excess precision support in the C++ front end.  (Enabling this can
> cause trouble for code that only expects C99/C11 values of
> FLT_EVAL_METHOD, however; see the -fpermitted-flt-eval-methods option for
> more details.)
>
>
> 5. Suppose that in some cases you do disable _Float16 support (whether
> that's just for 32-bit until the ABI has been defined, or also in the
> absence of instruction set support despite my comments above).  Then the
> way you do that in this patch series, enabling the type in
> ix86_scalar_mode_supported_p and ix86_libgcc_floating_mode_supported_p and
> giving an error later in ix86_expand_move, is a bad idea.
>
> Errors in expanders are generally problematic (they don't have good
> location information available).  But apart from that, ordinary user code
> should be able to tell whether _Float16 is supported by testing whether
> e.g. __FLT16_MANT_DIG__ is defined (like float.h does), or by including
> float.h (with __STDC_WANT_IEC_60559_TYPES_EXT__ defined) and then testing
> whether one of the FLT16_* macros is defined, or in a configure test by
> just declaring something using the _Float16 type.  Patch 1 changes
> check_effective_target_float16 to work around your technique for disabling
> _Float16 in ix86_expand_move, but it should be considered a stable user
> API that any of the above methods can be used in user code to check for
> _Float16 support - user code shouldn't need to know implementation details
> that you need to do something that will go through ix86_expand_move to see
> whether _Float16 is supported or not (and user code shouldn't need to use
> a configure test at all for this, testing FLT16_* after including float.h
> should work as a fully portable way of testing it - that's using only ISO
> C facilities).
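
The float.h-based check described above might look like the following in
user code. This is a sketch only: half_storage_t, HAVE_FLOAT16 and
half_storage_size are hypothetical names chosen for the example, and the
fallback to float is just one possible policy for targets without
_Float16.

```c
/* Must be defined before the first inclusion of <float.h>. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <float.h>
#include <stddef.h>

#ifdef FLT16_MANT_DIG
/* The implementation advertises _Float16, so user code relies on it
   being usable everywhere from here on. */
typedef _Float16 half_storage_t;
#define HAVE_FLOAT16 1
#else
/* No _Float16 advertised: fall back to float (an arbitrary choice
   made for this example). */
typedef float half_storage_t;
#define HAVE_FLOAT16 0
#endif

/* Size of the type selected above. */
static size_t half_storage_size(void)
{
    return sizeof(half_storage_t);
}
```

If the macros are defined but a later use of the type is rejected in the
expander, code like this stops compiling, which is the breakage described
in point 5.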
>
> So enable HFmode in ix86_scalar_mode_supported_p and
> ix86_libgcc_floating_mode_supported_p exactly when all operations are
> supported in the rest of the compiler - don't enable it there and then
> disable it elsewhere, because that will break user code testing for
> whether _Float16 is available using FLT16_* macros.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com




-- 
BR,
Hongtao
