On Fri, Jul 2, 2021 at 4:46 AM Joseph Myers <jos...@codesourcery.com> wrote:
>
> Some general comments, following what I said on libc-alpha:
>
>
> 1. Can you confirm that the ABI being used for 64-bit, for _Float16 and
> _Complex _Float16 argument passing and return, follows the current x86_64
> ABI document?
>
>
> 2. Can you confirm that if you build with this instruction set extension
> enabled by default, and run GCC tests for a corresponding (emulated?)
> processor, all the existing float16 tests in the testsuite are enabled and
> PASS (both compilation and execution) (both 64-bit and 32-bit testing)?
>
>
> 3. There's an active 32-bit ABI mailing list (ia32-...@googlegroups.com).
> If you want to support _Float16 in the 32-bit case, please work with it to
> get the corresponding ABI documented (using only memory and
> general-purpose registers seems like a good idea, so that the ABI can be
> supported for the base architecture without depending on SSE registers
> being present). In the absence of 32-bit ABI support it might be better
> to disable the HFmode support for 32-bit.
>
>
> 4. Support for _Float16 really ought not to depend on whether a particular
> instruction set extension is present, just like with other floating-point
> types; it makes sense, as an API, for all x86 processors (and like many
> APIs, it will be faster on some processors than on others). More specific
> points here are:
>
> (a) Basic arithmetic (+-*/) can be done by converting to SFmode, doing
> arithmetic there and converting back to HFmode; the results of doing so
> will be correctly rounded. Indeed, I think optabs.c handles that
> automatically when operations are available on a wider mode but not on the
> desired mode (but you'd need to check carefully that all the expected
> conversions do occur).

So would different exception behavior between soft-fp and avx512fp16 be
acceptable?
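
For reference, a minimal sketch of the rounding argument in (a), contrasted
with the double-rounding hazard that point (d) of the quoted review warns
about. It models each format only by its significand width (11 bits for
HFmode, 24 for SFmode), handles positive finite values only, and ignores
exponent range. The helper names are hypothetical and this is not how GCC or
soft-fp implement the conversions. It also says nothing about which exception
flags are raised, which is the open question here.

```c
#include <assert.h>
#include <math.h>

#define HF_BITS 11   /* significand bits of IEEE binary16 (HFmode) */
#define SF_BITS 24   /* significand bits of IEEE binary32 (SFmode) */

/* Hypothetical helper: round a positive finite double to `bits`
   significand bits, round-to-nearest with ties to even.
   Exponent range (overflow, subnormals) is deliberately not modelled. */
static double round_to_bits(double x, int bits)
{
    assert(x > 0.0 && bits > 0 && bits < 53);
    int e;
    double m = frexp(x, &e);            /* x = m * 2^e, 0.5 <= m < 1 */
    double scaled = ldexp(m, bits);     /* in [2^(bits-1), 2^bits) */
    long long q = (long long)scaled;    /* floor, since scaled > 0 */
    double frac = scaled - (double)q;   /* exact remainder in [0, 1) */
    if (frac > 0.5 || (frac == 0.5 && (q & 1)))
        q += 1;                         /* round up; ties go to even */
    return ldexp((double)q, e - bits);
}

/* Point (a): both operands are already HFmode values; multiply in
   SFmode precision, then narrow the result to HFmode. */
static double hf_mul_via_sf(double a, double b)
{
    double sf = round_to_bits(a * b, SF_BITS);
    return round_to_bits(sf, HF_BITS);
}

/* Point (d): narrowing a DFmode value through SFmode rounds twice. */
static double df_to_hf_via_sf(double x)
{
    return round_to_bits(round_to_bits(x, SF_BITS), HF_BITS);
}

/* Single rounding straight from DFmode to HFmode. */
static double df_to_hf_direct(double x)
{
    return round_to_bits(x, HF_BITS);
}
```

With a = 1 + 2^-10, hf_mul_via_sf(a, a) agrees with rounding the exact
product once to 11 bits. With x = 1 + 2^-11 + 2^-30, df_to_hf_direct(x) gives
1 + 2^-10 while df_to_hf_via_sf(x) gives 1.0, because the first rounding
lands exactly on a tie.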
>
> (b) Conversions to/from all other floating-point modes will always be
> needed, whether in hardware or in software.
>
> (c) In the F16C (Ivy Bridge and later) case, where you have hardware
> conversions to/from float (only), it's fine to convert to double (or long
> double) via float. (On efficiency grounds, widening from HFmode to TFmode
> should be a pure software operation, that should be faster than having an
> intermediate conversion to SFmode when the SFmode-to-TFmode conversion is
> a software operation.)
>
> (d) In the F16C case (where there are hardware conversions only from
> SFmode, not from wider modes), conversion *from* DFmode (or XFmode or
> TFmode) to HFmode should be a software operation, to avoid double
> rounding; an intermediate conversion to SFmode would be incorrect.
>
> (e) It's OK for conversions to/from integer modes to go via SFmode
> (although I don't know if that's efficient or not). Any case where a
> conversion from integer to SFmode is inexact would overflow HFmode, so
> there are no double rounding issues.
>
> (f) In the F16C case, it seems the hardware instructions only work on
> vectors, not scalars, so care would need to be taken to use them for
> scalar conversions only if the other elements of the vector register are
> known to be safe to convert without raising any exceptions (e.g. all zero
> bits, or -fno-trapping-math in effect).
>
> (g) If concerned about efficiency of intermediate truncations on
> processors without hardware _Float16 arithmetic, look at
> aarch64_excess_precision; you have the option of using excess precision
> for _Float16 by default, though that only really helps for C given the
> lack of excess precision support in the C++ front end. (Enabling this can
> cause trouble for code that only expects C99/C11 values of
> FLT_EVAL_METHOD, however; see the -fpermitted-flt-eval-methods option for
> more details.)
>
>
> 5.
> Suppose that in some cases you do disable _Float16 support (whether
> that's just for 32-bit until the ABI has been defined, or also in the
> absence of instruction set support despite my comments above). Then the
> way you do that in this patch series, enabling the type in
> ix86_scalar_mode_supported_p and ix86_libgcc_floating_mode_supported_p and
> giving an error later in ix86_expand_move, is a bad idea.
>
> Errors in expanders are generally problematic (they don't have good
> location information available). But apart from that, ordinary user code
> should be able to tell whether _Float16 is supported by testing whether
> e.g. __FLT16_MANT_DIG__ is defined (like float.h does), or by including
> float.h (with __STDC_WANT_IEC_60559_TYPES_EXT__ defined) and then testing
> whether one of the FLT16_* macros is defined, or in a configure test by
> just declaring something using the _Float16 type. Patch 1 changes
> check_effective_target_float16 to work around your technique for disabling
> _Float16 in ix86_expand_move, but it should be considered a stable user
> API that any of the above methods can be used in user code to check for
> _Float16 support - user code shouldn't need to know implementation details
> that you need to do something that will go through ix86_expand_move to see
> whether _Float16 is supported or not (and user code shouldn't need to use
> a configure test at all for this, testing FLT16_* after including float.h
> should work as a fully portable way of testing it - that's using only ISO
> C facilities).
>
> So enable HFmode in ix86_scalar_mode_supported_p and
> ix86_libgcc_floating_mode_supported_p exactly when all operations are
> supported in the rest of the compiler - don't enable it there and then
> disable it elsewhere, because that will break user code testing for
> whether _Float16 is available using FLT16_* macros.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
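
To make the user-visible checks in point 5 concrete, here is a minimal sketch
of the two macro-based tests described there: the predefined
__FLT16_MANT_DIG__ macro, and FLT16_MANT_DIG from float.h with
__STDC_WANT_IEC_60559_TYPES_EXT__ defined first. The function names are
hypothetical and only illustrate what user code might do. The block
deliberately never declares a _Float16 object, so it compiles whether or not
the type is supported.

```c
/* Must be defined before <float.h> is first included to request the
   FLT16_* macros. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__
#include <float.h>
#include <assert.h>

/* Hypothetical helper: 1 if the compiler predefines __FLT16_MANT_DIG__,
   the same macro float.h itself keys on, else 0. */
static int float16_predefined(void)
{
#ifdef __FLT16_MANT_DIG__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: FLT16_MANT_DIG as published by <float.h>,
   or 0 when the header does not advertise _Float16. */
static int float16_mant_dig(void)
{
#ifdef FLT16_MANT_DIG
    return FLT16_MANT_DIG;
#else
    return 0;
#endif
}

/* Example caller: abort early in a program that requires _Float16. */
static void require_float16(void)
{
    assert(float16_mant_dig() != 0 && "_Float16 not advertised by <float.h>");
}
```

If these macros are defined while a later use of the type is rejected in
ix86_expand_move, both helpers would report support that the compiler does
not actually deliver, which is the breakage described above.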

--
BR,
Hongtao