pengfei added inline comments.

================
Comment at: clang/test/CodeGen/X86/Float16-arithmetic.c:207
+// CHECK-NEXT:    [[EXT:%.*]] = fpext half [[TMP0]] to float
+// CHECK-NEXT:    store float [[EXT]], ptr [[RETVAL]], align 2
+// CHECK-NEXT:    [[TMP1:%.*]] = load half, ptr [[RETVAL]], align 2
----------------
rjmccall wrote:
> zahiraam wrote:
> > zahiraam wrote:
> > > pengfei wrote:
> > > > Not sure if we need a fptrunc and store the half value. The following 
> > > > tests have the same problem.
> > > I think that's what we want?
> > > // CHECK-LABEL: @RealOp(
> > > // CHECK-NEXT:  entry:
> > > // CHECK-NEXT:    [[A_ADDR:%.*]] = alloca half, align 2
> > > // CHECK-NEXT:    store half [[A:%.*]], ptr [[A_ADDR]], align 2
> > > // CHECK-NEXT:    [[TMP0:%.*]] = load half, ptr [[A_ADDR]], align 2
> > > // CHECK-NEXT:    [[EXT:%.*]] = fpext half [[TMP0]] to float
> > > // CHECK-NEXT:    [[UNPROMOTION:%.*]] = fptrunc float [[EXT]] to half
> > > // CHECK-NEXT:    ret half [[UNPROMOTION]]
> > > 
> > > Do you agree? If this is correct, I will make the change the other 
> > > operators.
> > But I feel like we should be returning a float no?  In which case it will 
> > be more tricky (need to calculate the Address with the promoted 
> > elementype)? @rmjccall?
> The function is declared as returning `_Float16`, not `float`.  This is 
> therefore a question about when we're allowed to return a value in greater 
> precision than the declared return type, which raises three sub-questions: 
> one about ABI, one about language semantics, and one about our handling in 
> the implementation.
> 
> The first is about ABI.  This mode is not supposed to be ABI-breaking, so 
> whenever the ABI is in doubt, and the target makes a `_Float16` return 
> incompatible with the ABI of a `float` return, we must use the former.  That 
> means, at the very least, returning from a function with unknown call sites 
> or calling a function with an unknown implementation.  We could potentially 
> eliminate extra truncations here when we fully understand a call; for 
> example, we could change the return type to `float` when the function is 
> internal to a TU and not address-taken, or we could eliminate a trunc+extend 
> pair after inlining.  It is fair to ask whether that's a good idea, however.
> 
> Anyway, concretely we're talking about two ABIs here:
> - On x86_64, `_Float16` and `float` are not returned compatibly: they're both 
> returned in `xmm0`, but the bit patterns are different, and the caller and 
> callee must agree in order to preserve the value.
> - On i386, `_Float16` and `float` *are* returned compatibly: they're both 
> returned in `%st0`, promoted to the 80-bit format.
> 
> Let's assume for a second that we're interested in avoiding truncations in 
> situations where the ABI doesn't limit us.  Then we have a question of 
> language semantics, which is principally about this: does C's authorization 
> of excess precision in intermediate results allows return values to propagate 
> the excess precision?  The answer that appears to be yes, it does, per the 
> explicit footnote at the end of the standard's description of the `return` 
> statement:
> 
> > The return statement is not an assignment. The overlap restriction of 
> > 6.5.16.1 does not apply to the case of function return. The representation 
> > of floating-point values can have wider range or precision than implied by 
> > the type; a cast can be used to remove this extra range and precision.
> 
> Okay.  So the third question is about implementation: how should we take 
> advantage of this flexibility, assuming we actually want to?  A major part of 
> the reason we're doing explicit promoted emission in the frontend in this 
> patch is that only the frontend has the required knowledge of when to force 
> truncation; and a big part of *that* is that explicit casts and assignments 
> both force truncation, and neither has a persistent semantic representation 
> in LLVM IR.  We cannot distinguish between a truncation that was only done to 
> satisfy the ABI and a truncation that was required by the language semantics. 
>  Once we have a value as an `fp16` in IR, it's entirely possible that it 
> underwent an operation that required excess precision to be discarded, one 
> which no longer has any representation in IR.  I think the only reasonable 
> way to avoid this would be to allow functions to directly return `float` (and 
> receive `float`s as returns) with some sort of annotation that it actually 
> has to be returned as an `fp16` to satisfy the ABI.  And within Clang, we 
> would handle that by making the call/return emission interact with the 
> promoted emitters we're adding in this patch.
> 
> We can put off all of that until later, though.  For now, we should just 
> continue to return `_Float16`.
> Do you agree? If this is correct, I will make the change the other operators.

I think it is correct, though it is a bit silly to do fpext + fptrunc for a 
simple move.

> On i386, `_Float16` and `float` *are* returned compatibly

They are not. According to i386 [[ 
https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI | psABI ]] both 
`_Float16` and `_Complex _Float16` are passed/returned from XMM0, while 
`float`/`double` etc. are passed/returned from ST0.

Other target may have similar problem, e.g., passing/returning `float` from FPR 
but `_Float16` from GPR.

IIUC, the current way, i.e., always promote + unpromote each expression, is 
just for easy implementation. I expected we can distinguish unary and binary 
operations from multi operations. Only multi operations needs promote + 
unpromote.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to