mibintc created this revision.
mibintc added a reviewer: kpn.
Herald added a subscriber: jfb.
mibintc requested review of this revision.
Herald added projects: clang, LLVM.
Herald added a subscriber: llvm-commits.

This is a proposal to add a new clang builtin, __arithmetic_fence.  The purpose 
of the builtin is to provide the user fine control, at the expression level, 
over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. 
 We are also proposing a new clang builtin that provides access to this 
intrinsic, as well as a new clang command line option `-fprotect-parens` that 
will be implemented using this intrinsic.

This is the clang patch corresponding to https://reviews.llvm.org/D99675 which 
proposes a new llvm intrinsic llvm.arith.fence

Note: Preliminary patch, not ready for code review, doesn't yet run end-to-end 
with the llvm patch, and the logic for the new option is not yet written.

Rationale
---------

Some expression transformations that are mathematically correct, such as 
reassociation and distribution, may be incorrect when dealing with finite 
precision floating point.  For example, these two expressions,

  (a + b) + c
  a + (b + c)

are equivalent mathematically in integer arithmetic, but not in floating point. 
 In some floating point (FP) models, the compiler is allowed to make these 
value-unsafe transformations for performance reasons, even when the programmer 
uses parentheses explicitly.  But the compiler must always honor the 
parentheses implied by __arithmetic_fence, regardless of the FP model settings.

Under `–ffp-model=fast`, __arithmetic_fence provides a way to partially enforce 
ordering in an FP expression.

| Original expression           | Transformed expression | Permitted? |
| ----------------------------- | ---------------------- | ---------- |
| (a + b) + c                   | a + (b + c)            | Yes!       |
| __arithmetic_fence(a + b) + c | a + (b + c)            | No!        |
|



NOTE: The __arithmetic_fence serves no purpose in value-safe FP modes like 
`–ffp-model=precise`:  FP expressions are already strictly ordered.



Motivation:
-----------

- The new clang builtin provides clang compatibility with the Intel C++ 
compiler builtin `__fence` which has similar semantics, and likewise enables 
implementation of the option `-fprotect-parens`.
- The new builtin provides the clang programmer control over floating point 
optimizations at the expression level.
- The gnu fortran compiler, gfortran, likewise supports `-fprotect-parens`

Requirements for __arithmetic_fence:
------------------------------------

- There is one operand. The operand type must be scalar floating point, complex 
floating point or vector thereof.
- The return type is the same as the operand type.
- The return value is equivalent to the operand.
- The invocation of __arithmetic_fence is not a C/C++ constant expression, even 
if the operands are constant.

New option `-fprotect-parens`
-----------------------------

- Determines whether the optimizer honors parentheses when floating-point 
expressions are evaluated
- Option defaults to `fno-protect-parens`


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D100118

Files:
  clang/include/clang/Basic/Builtins.def
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/include/clang/Sema/Sema.h
  clang/lib/AST/ExprConstant.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/arithmetic-fence-builtin.c
  clang/test/Sema/arithmetic-fence-builtin.c
  llvm/include/llvm/IR/IRBuilder.h

Index: llvm/include/llvm/IR/IRBuilder.h
===================================================================
--- llvm/include/llvm/IR/IRBuilder.h
+++ llvm/include/llvm/IR/IRBuilder.h
@@ -897,6 +897,13 @@
     return CreateBinaryIntrinsic(Intrinsic::maximum, LHS, RHS, nullptr, Name);
   }
 
+  /// Create a call to the arithmetic_fence intrinsic.
+  CallInst *CreateArithmeticFence(Value *Val, Type *DstType,
+                                  const Twine &Name = "") {
+    return CreateIntrinsic(Intrinsic::arithmetic_fence, {DstType}, {Val}, nullptr,
+                           Name);
+  }
+
   /// Create a call to the experimental.vector.extract intrinsic.
   CallInst *CreateExtractVector(Type *DstType, Value *SrcVec, Value *Idx,
                                 const Twine &Name = "") {
Index: clang/test/Sema/arithmetic-fence-builtin.c
===================================================================
--- /dev/null
+++ clang/test/Sema/arithmetic-fence-builtin.c
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm -o - -verify -x c++ %s
+int v;
+template <typename T> T addT(T a, T b) {
+  T *q = __arithmetic_fence(&a);
+  // expected-error@-1 {{operand of type 'float *' where floating, complex or a vector of such types is required ('float *' invalid)}}
+  // expected-error@-2 {{operand of type 'int *' where floating, complex or a vector of such types is required ('int *' invalid)}}
+  return __arithmetic_fence(a + b);
+  // expected-error@-1 {{operand of type 'int' where floating, complex or a vector of such types is required ('int' invalid)}}
+}
+int addit(int a, int b) {
+  float x, y;
+  typedef struct {
+    int a, b;
+  } stype;
+  stype s;
+  s = __arithmetic_fence(s);    // expected-error {{operand of type 'stype' where floating, complex or a vector of such types is required ('stype' invalid)}}
+  x = __arithmetic_fence(x, y); // expected-error {{too many arguments to function call, expected 1, have 2}}
+  // Complex is supported.
+  _Complex double cd, cd1;
+  cd = __arithmetic_fence(cd1);
+  // Vector is supported.
+  typedef float __v4hi __attribute__((__vector_size__(8)));
+  __v4hi vec1, vec2;
+  vec1 = __arithmetic_fence(vec2);
+
+  v = __arithmetic_fence(a + b); // expected-error {{operand of type 'int' where floating, complex or a vector of such types is required ('int' invalid)}}
+  float f = addT<float>(a, b);   // expected-note {{in instantiation of function template specialization 'addT<float>' requested here}}
+  int i = addT<int>(1, 2);       // expected-note {{in instantiation of function template specialization 'addT<int>' requested here}}
+  constexpr float d = 1.0 + 2.0;
+  constexpr float c = __arithmetic_fence(1.0 + 2.0);
+  // expected-error@-1 {{constexpr variable 'c' must be initialized by a constant expression}}
+  return 0;
+}
Index: clang/test/CodeGen/arithmetic-fence-builtin.c
===================================================================
--- /dev/null
+++ clang/test/CodeGen/arithmetic-fence-builtin.c
@@ -0,0 +1,35 @@
+// Test with fast math
+// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm \
+// RUN: -menable-no-infs -menable-no-nans -menable-unsafe-fp-math \
+// RUN: -fno-signed-zeros -mreassociate -freciprocal-math \
+// RUN: -ffp-contract=fast -fno-rounding-math -ffast-math -ffinite-math-only \
+// RUN: -o - %s | FileCheck %s
+//
+// Test with fast math, showing incomplete implementaton for Complex
+// this test fails.
+// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm \
+// RUN: -menable-no-infs -menable-no-nans -menable-unsafe-fp-math \
+// RUN: -fno-signed-zeros -mreassociate -freciprocal-math \
+// RUN: -ffp-contract=fast -fno-rounding-math -ffast-math -ffinite-math-only \
+// RUN: -o - -DSHOWBUG %s | FileCheck %s
+//
+// TBD: Add test without fast flags showing llvm intrinsic not created
+int v;
+int addit(float a, float b) {
+//CHECK: define {{.*}} @addit(float %a, float %b) #0 {
+#ifdef SHOWBUG
+  // Assertion fail in clang when try to Emit complex expression
+  // Complex should be supported.
+  _Complex double cd, cd1;
+  cd = __arithmetic_fence(cd1);
+#endif
+  // Vector should be supported.
+  typedef float __v2f32 __attribute__((__vector_size__(8)));
+  __v2f32 vec1, vec2;
+  vec1 = __arithmetic_fence(vec2);
+  // CHECK: call fast <2 x float> @llvm.arithmetic.fence.v2f32
+
+  v = __arithmetic_fence(a + b);
+  // CHECK: call fast float @llvm.arithmetic.fence.f32(float %add)
+  return 0;
+}
Index: clang/lib/Sema/SemaChecking.cpp
===================================================================
--- clang/lib/Sema/SemaChecking.cpp
+++ clang/lib/Sema/SemaChecking.cpp
@@ -1555,6 +1555,10 @@
     Diag(TheCall->getBeginLoc(), diag::warn_alloca)
         << TheCall->getDirectCallee();
     break;
+  case Builtin::BI__arithmetic_fence:
+    if (SemaBuiltinArithmeticFence(TheCall))
+      return ExprError();
+    break;
   case Builtin::BI__assume:
   case Builtin::BI__builtin_assume:
     if (SemaBuiltinAssume(TheCall))
@@ -6348,6 +6352,38 @@
   return false;
 }
 
+bool Sema::SemaBuiltinArithmeticFence(CallExpr *TheCall) {
+  Expr *Arg = TheCall->getArg(0);
+  if (Arg->isInstantiationDependent())
+    return false;
+
+#if 0
+  //QualType SrcTy = Arg->getType();
+  const auto getComponentTy = [] (const QualType Ty) -> QualType {
+    if (const VectorType *VT = dyn_cast<VectorType>(Ty))
+      return VT->getElementType().getTypePtr();
+    if (const ComplexType *CT = dyn_cast<ComplexType>(Ty))
+      return CT->getElementType().getTypePtr();
+    return Ty;
+  };
+#endif
+  const QualType ArgTy = TheCall->getArg(0)->getType();
+  bool IsFloating = [&]() {
+    if (const VectorType *VT = dyn_cast<VectorType>(ArgTy.getCanonicalType()))
+      return VT->getElementType().getTypePtr()->isFloatingType();
+    if (const ComplexType *CT = dyn_cast<ComplexType>(ArgTy.getCanonicalType()))
+      return CT->getElementType().getTypePtr()->isFloatingType();
+    return ArgTy->isFloatingType();
+  }();
+  if (!IsFloating)
+    return Diag(TheCall->getEndLoc(), diag::err_typecheck_expect_flt_or_vector)
+           << ArgTy;
+  if (checkArgCount(*this, TheCall, 1))
+    return true;
+  TheCall->setType(ArgTy);
+  return false;
+}
+
 /// SemaBuiltinAssume - Handle __assume (MS Extension).
 // __assume does not evaluate its arguments, and should warn if its argument
 // has side effects.
Index: clang/lib/CodeGen/CGBuiltin.cpp
===================================================================
--- clang/lib/CodeGen/CGBuiltin.cpp
+++ clang/lib/CodeGen/CGBuiltin.cpp
@@ -2833,6 +2833,17 @@
     Function *FnAssume = CGM.getIntrinsic(Intrinsic::assume);
     return RValue::get(Builder.CreateCall(FnAssume, ArgValue));
   }
+  case Builtin::BI__arithmetic_fence: {
+    Value *ArgValue = EmitScalarExpr(E->getArg(0));
+    QualType ArgType = E->getArg(0)->getType();
+    auto FMF = Builder.getFastMathFlags();
+    if (FMF.allowReassoc() && FMF.noNaNs() && FMF.noInfs() &&
+        FMF.noSignedZeros() && FMF.allowReciprocal() && FMF.allowContract() &&
+        FMF.approxFunc())
+      return RValue::get(
+          Builder.CreateArithmeticFence(ArgValue, ConvertType(ArgType)));
+    return RValue::get(ArgValue);
+  }
   case Builtin::BI__builtin_bswap16:
   case Builtin::BI__builtin_bswap32:
   case Builtin::BI__builtin_bswap64: {
Index: clang/lib/AST/ExprConstant.cpp
===================================================================
--- clang/lib/AST/ExprConstant.cpp
+++ clang/lib/AST/ExprConstant.cpp
@@ -9303,6 +9303,9 @@
     }
   }
 
+  case Builtin::BI__arithmetic_fence:
+    return false;
+
   default:
     break;
   }
Index: clang/include/clang/Sema/Sema.h
===================================================================
--- clang/include/clang/Sema/Sema.h
+++ clang/include/clang/Sema/Sema.h
@@ -12485,6 +12485,7 @@
 private:
   bool SemaBuiltinPrefetch(CallExpr *TheCall);
   bool SemaBuiltinAllocaWithAlign(CallExpr *TheCall);
+  bool SemaBuiltinArithmeticFence(CallExpr *TheCall);
   bool SemaBuiltinAssume(CallExpr *TheCall);
   bool SemaBuiltinAssumeAligned(CallExpr *TheCall);
   bool SemaBuiltinLongjmp(CallExpr *TheCall);
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===================================================================
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -8400,6 +8400,9 @@
   "operand of type %0 where arithmetic or pointer type is required">;
 def err_typecheck_cond_incompatible_operands : Error<
   "incompatible operand types%diff{ ($ and $)|}0,1">;
+def err_typecheck_expect_flt_or_vector : Error<
+  "operand of type %0 where floating, complex or "
+  "a vector of such types is required (%0 invalid)">;
 def err_cast_selector_expr : Error<
   "cannot type cast @selector expression">;
 def ext_typecheck_cond_incompatible_pointers : ExtWarn<
Index: clang/include/clang/Basic/Builtins.def
===================================================================
--- clang/include/clang/Basic/Builtins.def
+++ clang/include/clang/Basic/Builtins.def
@@ -1651,6 +1651,9 @@
 BUILTIN(__builtin_ms_va_end, "vc*&", "n")
 BUILTIN(__builtin_ms_va_copy, "vc*&c*&", "n")
 
+// Arithmetic Fence: to prevent FP reordering and reassociation optimizations
+LANGBUILTIN(__arithmetic_fence, "v.", "t", ALL_LANGUAGES)
+
 #undef BUILTIN
 #undef LIBBUILTIN
 #undef LANGBUILTIN
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to