On Thu, Jul 01, 2021 at 02:58:01PM +0200, Richard Biener wrote:
> > The main issue is complex _Float16 functions in libgcc.  If _Float16 doesn't
> > require -mavx512fp16, we need to compile complex _Float16 functions in
> > libgcc without -mavx512fp16.  Complex _Float16 performance is very
> > important for our _Float16 usage.  _Float16 performance has to be
> > very fast.  There should be no emulation anywhere when -mavx512fp16
> > is used.  That is why _Float16 is available only with -mavx512fp16.
>
> It should be possible to emulate scalar _Float16 using _Float32 with a
> reasonable performance trade-off.  I think users caring for _Float16
> performance will use vector intrinsics anyway since for scalar code
> _Float32 code will likely perform the same (at double storage cost)

Only if it is allowed to have excess precision for _Float16.  If not, then
one would need to (expensively?) round after every operation at least.

	Jakub