pengfei added inline comments.

================
Comment at: clang/lib/CodeGen/CGExprScalar.cpp:1315
+  if ((SrcType->isHalfType() || iSFloat16Allowed) &&
+      !CGF.getContext().getLangOpts().NativeHalfType) {
     // Cast to FP using the intrinsic if the half type itself isn't supported.
----------------
rjmccall wrote:
> pengfei wrote:
> > rjmccall wrote:
> > > pengfei wrote:
> > > > rjmccall wrote:
> > > > > Okay, this condition is pretty ridiculous to be repeating in three 
> > > > > different places across the compiler.  Especially since you're going 
> > > > > to change it when you implement the new option, right?
> > > > > 
> > > > > Can we state this condition more generally?  I'm not sure why this is 
> > > > > so narrowly restricted, and the variable name isn't telling me 
> > > > > anything, since `_Float16` must by definition be "allowed" if we have 
> > > > > an expression of `_Float16` type.
> > > > > since _Float16 must by definition be "allowed" if we have an 
> > > > > expression of _Float16 type.
> > > > 
> > > > _Float16 is allowed only on a few targets. 
> > > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point
> > > > By the way, we should update for X86 since it's not limited to 
> > > > avx512fp16 now.
> > > > _Float16 is allowed only on a few targets.
> > > 
> > > Yes, I know that.  But if `SrcType->isFloat16Type()` is true, we must be 
> > > on one of those targets, because the type doesn't otherwise exist.
> > I see your point now. The problem here is we want to allow the `_Float16` 
> > to be used more broadly. But the target doesn't really support it sometime. 
> > Currently full arithmatic operations are supported only on target with 
> > AVX512FP16.
> > We should cast for those targets without AVX512FP16 while avoid to do on 
> > AVX512FP16.
> I agree that many targets don't natively support arithmetic on this format, 
> but x86 is not the first one that does.  Unless I'm misunderstanding, we 
> already track this property in Clang's TargetInfo as `hasLegalHalfType()`.  
> `+avx512fp16` presumably ought to set this.
> 
> I'm not sure what the interaction with the `NativeHalfType` LangOpt is 
> supposed to be here.  My understanding is that that option is just supposed 
> to affect `__fp16`, basically turning it into a proper arithmetic type, i.e. 
> essentially `_Float16`.  Whatever effect you want to apply to `_Float16` 
> should presumably happen even if that option not set.
> 
> More broadly, I don't think your approach in this patch is correct.  The type 
> of operations on `_Float16` should not change based on whether the target 
> natively supports `_Float16`.  If we need to emulate those operations on 
> targets that don't provide them natively, we should do that at a lower level 
> than the type system.
> 
> The most appropriate place to do that is going to depend on the exact 
> semantics we want.
> 
> If we want to preserve `half` semantics exactly regardless of target, we 
> should have Clang's IR generation actually emit `half` operations.  Targets 
> that don't support those operations natively will have to lower at least some 
> of those operations into compiler-rt calls, but that's not at all 
> unprecedented.
> 
> If we're okay with playing loose for performance reasons, we can promote to 
> `float` immediately around individual arithmetic operations.  IR generation 
> is probably the most appropriate place to do that.  But I'm quite concerned 
> about that making `_Float16` feel like an unpredictable/unportable type; it 
> seems to me that software emulation is much better.
> 
> If you're proposing the latter, I think you need to raise that more widely 
> than a code review; please make a post on llvm-dev.
> we already track this property in Clang's TargetInfo as `hasLegalHalfType()`

That sounds a good approch. Thank you.

> The type of operations on `_Float16` should not change based on whether the 
> target natively supports `_Float16`. If we need to emulate those operations 
> on targets that don't provide them natively, we should do that at a lower 
> level than the type system.

Unfortunately, we can't do it at low level. The reason is (I'm not expert in 
frontend, just recalled from last disscussion with GCC folks) we have to do 
expresssion emulation to respect C/C++ semantics. GCC has option 
`-fexcess-precision=16` to match the same result with native instructions, but 
the default is `-fexcess-precision=fast` according to language semantics.

> The most appropriate place to do that is going to depend on the exact 
> semantics we want...

Note, we are not simply doing emulation in the frontend. It's backend's 
responsibility to emulate a single `half` operation. But it's frontend's 
responsibility to choose whether to emit several `half` operations or emit 
promote + several `float` operations + truncate. As described in the title, 
this patch is doing for the latter.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to