On Thu, Jul 01, 2021 at 02:58:01PM +0200, Richard Biener wrote:
> > The main issue is complex _Float16 functions in libgcc.  If _Float16 doesn't
> > require -mavx512fp16, we need to compile complex _Float16 functions in
> > libgcc without -mavx512fp16.  Complex _Float16 performance is very
> > important for our _Float16 usage.   _Float16 performance has to be
> > very fast.  There should be no emulation anywhere when -mavx512fp16
> > is used.   That is why _Float16 is available only with -mavx512fp16.
> 
> It should be possible to emulate scalar _Float16 using _Float32 with a
> reasonable performance trade-off.  I think users caring for _Float16
> performance will use vector intrinsics anyway since for scalar code _Float32
> code will likely perform the same (at double storage cost)

Only if it is allowed to have excess precision for _Float16.  If not, then
one would need to (expensively?) round after every operation at least.
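
To illustrate the difference, here is a rough sketch rather than anything
taken from the patch.  It uses float as the narrow type and double as the
wide type, as a portable stand-in for _Float16 emulated via _Float32, so it
builds without -mavx512fp16.  The function names are made up for this example.

```c
/* Sketch only: float stands in for _Float16, double for _Float32.
   sum_strict and sum_excess are hypothetical names, not GCC functions. */

/* No excess precision: every intermediate result is rounded back to the
   narrow type, which costs one conversion per operation. */
static float sum_strict(const float *x, int n)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc = (float)((double)acc + (double)x[i]);  /* round after each add */
    return acc;
}

/* Excess precision allowed: intermediates stay in the wide type and the
   result is rounded to the narrow type only once, at the end. */
static float sum_excess(const float *x, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += (double)x[i];
    return (float)acc;                              /* single final rounding */
}
```

With the inputs 16777216.0f, 1.0f, 1.0f the two functions disagree: the
per-operation rounding drops each added 1.0f, while the wide accumulator
keeps both.  So the cheap variant is only valid if such differences are
acceptable.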

        Jakub
