On Thu, Jul 1, 2021 at 7:10 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> [Sorry for double post, gcc-patches address was wrong in original post]
>
> On Thu, Jul 1, 2021 at 7:48 AM liuhongt <hongtao....@intel.com> wrote:
> >
> > Hi:
> >   AVX512FP16 has been disclosed; see [1].
> >   There are 100+ instructions in AVX512FP16 and 67 GCC patches. For
> > convenience of review, we divide the 67 patches into 2 major parts.
> >   The first part is 2 patches containing basic support for AVX512FP16
> > (options, cpuid, _Float16 type, libgcc, etc.), and the second part is 65
> > patches covering all instructions of AVX512FP16 (including intrinsic
> > support and some optimizations).
> >   There is a problem with the first part: _Float16 is not part of the C++
> > standard, so the front end supports neither the type nor its mangling.
> > We therefore "make up" a _Float16 type in the back end and use _DF16 as
> > its mangling. The purpose is to align with the LLVM side, because the
> > LLVM C++ FE already supports _Float16 [2].
> >
> > [1] 
> > https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html
> > [2] https://reviews.llvm.org/D33719
>
> Looking through the implementation of _Float16 support, I think there is
> no need for _Float16 support to depend on AVX512FP16.
>
> The compiler is smart enough to use either a named pattern that
> describes the instruction when available or diverts to a library call
> to a soft-fp implementation. So, I think that general _Float16 support
> should be implemented first (similar to __float128) and then upgraded
> with AVX512FP16 specific instructions.
>
> MOVW loads/stores to XMM reg can be emulated with MOVD and a SImode
> secondary_reload register.
>
MOVD is under SSE2, and so is PINSRW, which means that if we want XMM
loads/stores for HFmode, SSE2 is the minimum requirement.
SSE4_1 additionally provides PEXTRW reg/m16, xmm, imm8, which gives us
direct 16-bit loads/stores for HFmode with no need for a secondary
reload.
So for simplicity, can we just restrict _Float16 to SSE4_1 and above?

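As a rough illustration of the 16-bit moves discussed here, the sketch
below uses the SSE2 intrinsics _mm_insert_epi16 and _mm_extract_epi16 to
move raw half-precision bits in and out of an XMM register. The helper
names load_hf_bits and store_hf_bits are made up for this mail, and this
is not what the back end would emit. It assumes an x86 target with SSE2
enabled. With SSE2 only, the store has to go through a general-purpose
register, whereas the SSE4_1 form of PEXTRW can write to m16 directly.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: place 16 raw bits in the low word of an XMM
   register, zeroing the rest.  Needs SSE2 (PINSRW or MOVD).  */
static __m128i load_hf_bits(const uint16_t *p)
{
    return _mm_insert_epi16(_mm_setzero_si128(), *p, 0);
}

/* Hypothetical helper: store the low word of an XMM register.  The
   intrinsic returns the word in an integer register, which is the
   SSE2 form of PEXTRW; the direct store to memory needs SSE4_1.  */
static void store_hf_bits(uint16_t *p, __m128i v)
{
    *p = (uint16_t)_mm_extract_epi16(v, 0);
}
```

Calling store_hf_bits(&out, load_hf_bits(&in)) should round-trip the 16
bits unchanged, whichever instructions the compiler picks.
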
> soft-fp library already includes all the infrastructure to implement
> _Float16 (see half.h), so HFmode basic operations should be trivial to
> implement (I have gone through this exercise personally years ago when
> implementing __float128 soft-fp support).
>
> Looking through patch 1/2, it looks like a new ABI is introduced,
> where FP16 values are passed through XMM registers, but I don't think
> there is updated psABI documentation available (for x86_64 as well as
> i386, where FP16 values will probably be passed through memory).
>
> So, the net effect of the above proposal(s) is that x86 will support
> _Float16 out of the box, emulate it via soft-fp without AVX512FP16 and
> use AVX512FP16 instructions with -mavx512fp16.
>
> Uros.
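
To make the soft-fp fallback concrete, here is a minimal sketch of what
a software widening from IEEE binary16 to float has to do when no
hardware instruction is available. It is written from scratch for
illustration and is not the libgcc soft-fp code; the function name
half_bits_to_float is hypothetical. It assumes float is IEEE binary32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen raw binary16 bits to a float.  */
static float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;   /* 5-bit biased exponent */
    uint32_t man  = h & 0x3FFu;          /* 10-bit fraction */
    uint32_t bits;
    float f;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, keep the payload.  */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127.  */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {
        /* Signed zero.  */
        bits = sign;
    } else {
        /* Subnormal: shift until the implicit bit appears, then
           drop it and adjust the exponent accordingly.  */
        uint32_t shift = 0;
        while ((man & 0x400u) == 0) {
            man <<= 1;
            shift++;
        }
        man &= 0x3FFu;
        bits = sign | ((113u - shift) << 23) | (man << 13);
    }

    memcpy(&f, &bits, sizeof f);  /* reinterpret the bits as a float */
    return f;
}
```

Widening is exact because every binary16 value is representable in
binary32. The narrowing direction also has to handle rounding and
overflow, which is left out here.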



-- 
BR,
Hongtao
