rjmccall added a comment.

For that example, yes, approach #3 would result in that exact same IR on targets that lack direct hardware support for `_Float16` operations. But getting that behavior right in general requires a different implementation than is provided by this patch, which is implementing approach #4 and inappropriately changing the formal types of expressions.
In contrast, approach #1 would produce IR like this:

  define dso_local arm_aapcscc half @foo(half %a, half %b, half %c) #0 {
  entry:
    %a.addr = alloca half, align 2
    %b.addr = alloca half, align 2
    %c.addr = alloca half, align 2
    store half %a, half* %a.addr, align 2
    store half %b, half* %b.addr, align 2
    store half %c, half* %c.addr, align 2
    %0 = load half, half* %a.addr, align 2
    %conv = fpext half %0 to float
    %1 = load half, half* %b.addr, align 2
    %conv1 = fpext half %1 to float
    %add = fadd float %conv, %conv1
    %trunc = fptrunc float %add to half
    %ext = fpext half %trunc to float
    %2 = load half, half* %c.addr, align 2
    %conv2 = fpext half %2 to float
    %add3 = fadd float %ext, %conv2
    %3 = fptrunc float %add3 to half
    ret half %3
  }

I was under the impression that `-fexcess-precision` had some sort of strict mode that forces this pattern, but apparently not, and the choices are just between `standard` (truncation is only forced at casts and assignments) and `fast` (optimizer has free rein to remove truncations).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107
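To illustrate why the placement of the truncations matters, here is a minimal C sketch of the two evaluation patterns for `a + b + c`. It is only an analogue, not the patch's behavior: it uses `float` as the narrow type and `double` as the wide type (standing in for `_Float16` and `float`) so that it compiles on hosts without `_Float16` support. The function names are hypothetical.

```c
/* Analogue of the IR above: extend, add, and truncate back to the
 * narrow type after every single operation. */
float add3_truncate_each_op(float a, float b, float c) {
    float ab = (float)((double)a + (double)b);  /* truncate after 1st add */
    return (float)((double)ab + (double)c);     /* truncate after 2nd add */
}

/* Analogue of keeping excess precision inside the expression:
 * the intermediate sum stays wide and is truncated only once. */
float add3_truncate_at_end(float a, float b, float c) {
    return (float)((double)a + (double)b + (double)c);
}
```

With `a = 1.0f` and `b = c = 0x1p-24f`, the first function rounds each intermediate sum back to `1.0f` (round-to-nearest-even on a tie), while the second keeps `1 + 2^-23` exactly in the wide type and returns the next `float` above `1.0f`. This assumes the default rounding mode and an implementation where `float` and `double` are IEEE binary32 and binary64.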