mibintc created this revision. mibintc added a reviewer: kpn. Herald added a subscriber: jfb. mibintc requested review of this revision. Herald added projects: clang, LLVM. Herald added a subscriber: llvm-commits.
This is a proposal to add a new clang builtin, __arithmetic_fence. The purpose of the builtin is to provide the user fine control, at the expression level, over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. We are also proposing a new clang builtin that provides access to this intrinsic, as well as a new clang command line option `-fprotect-parens` that will be implemented using this intrinsic. This is the clang patch corresponding to https://reviews.llvm.org/D99675 which proposes a new llvm intrinsic llvm.arith.fence Note: Preliminary patch, not ready for code review, doesn't yet run end-to-end with the llvm patch, and the logic for the new option is not yet written. Rationale --------- Some expression transformations that are mathematically correct, such as reassociation and distribution, may be incorrect when dealing with finite precision floating point. For example, these two expressions, (a + b) + c a + (b + c) are equivalent mathematically in integer arithmetic, but not in floating point. In some floating point (FP) models, the compiler is allowed to make these value-unsafe transformations for performance reasons, even when the programmer uses parentheses explicitly. But the compiler must always honor the parentheses implied by __arithmetic_fence, regardless of the FP model settings. Under `–ffp-model=fast`, __arithmetic_fence provides a way to partially enforce ordering in an FP expression. | Original expression | Transformed expression | Permitted? | | ----------------------------- | ---------------------- | ---------- | | (a + b) + c | a + (b + c) | Yes! | | __arithmetic_fence(a + b) + c | a + (b + c) | No! | | NOTE: The __arithmetic_fence serves no purpose in value-safe FP modes like `–ffp-model=precise`: FP expressions are already strictly ordered. Motivation: ----------- - The new clang builtin provides clang compatibility with the Intel C++ compiler builtin `__fence` which has similar semantics, and likewise enables implementation of the option `-fprotect-parens`. - The new builtin provides the clang programmer control over floating point optimizations at the expression level. - The gnu fortran compiler, gfortran, likewise supports `-fprotect-parens` Requirements for __arithmetic_fence: ------------------------------------ - There is one operand. The operand type must be scalar floating point, complex floating point or vector thereof. - The return type is the same as the operand type. - The return value is equivalent to the operand. - The invocation of __arithmetic_fence is not a C/C++ constant expression, even if the operands are constant. New option `-fprotect-parens` ----------------------------- - Determines whether the optimizer honors parentheses when floating-point expressions are evaluated - Option defaults to `fno-protect-parens` Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D100118 Files: clang/include/clang/Basic/Builtins.def clang/include/clang/Basic/DiagnosticSemaKinds.td clang/include/clang/Sema/Sema.h clang/lib/AST/ExprConstant.cpp clang/lib/CodeGen/CGBuiltin.cpp clang/lib/Sema/SemaChecking.cpp clang/test/CodeGen/arithmetic-fence-builtin.c clang/test/Sema/arithmetic-fence-builtin.c llvm/include/llvm/IR/IRBuilder.h
Index: llvm/include/llvm/IR/IRBuilder.h =================================================================== --- llvm/include/llvm/IR/IRBuilder.h +++ llvm/include/llvm/IR/IRBuilder.h @@ -897,6 +897,13 @@ return CreateBinaryIntrinsic(Intrinsic::maximum, LHS, RHS, nullptr, Name); } + /// Create a call to the arithmetic_fence intrinsic. + CallInst *CreateArithmeticFence(Value *Val, Type *DstType, + const Twine &Name = "") { + return CreateIntrinsic(Intrinsic::arithmetic_fence, {DstType}, {Val}, nullptr, + Name); + } + /// Create a call to the experimental.vector.extract intrinsic. CallInst *CreateExtractVector(Type *DstType, Value *SrcVec, Value *Idx, const Twine &Name = "") { Index: clang/test/Sema/arithmetic-fence-builtin.c =================================================================== --- /dev/null +++ clang/test/Sema/arithmetic-fence-builtin.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm -o - -verify -x c++ %s +int v; +template <typename T> T addT(T a, T b) { + T *q = __arithmetic_fence(&a); + // expected-error@-1 {{operand of type 'float *' where floating, complex or a vector of such types is required ('float *' invalid)}} + // expected-error@-2 {{operand of type 'int *' where floating, complex or a vector of such types is required ('int *' invalid)}} + return __arithmetic_fence(a + b); + // expected-error@-1 {{operand of type 'int' where floating, complex or a vector of such types is required ('int' invalid)}} +} +int addit(int a, int b) { + float x, y; + typedef struct { + int a, b; + } stype; + stype s; + s = __arithmetic_fence(s); // expected-error {{operand of type 'stype' where floating, complex or a vector of such types is required ('stype' invalid)}} + x = __arithmetic_fence(x, y); // expected-error {{too many arguments to function call, expected 1, have 2}} + // Complex is supported. + _Complex double cd, cd1; + cd = __arithmetic_fence(cd1); + // Vector is supported. + typedef float __v4hi __attribute__((__vector_size__(8))); + __v4hi vec1, vec2; + vec1 = __arithmetic_fence(vec2); + + v = __arithmetic_fence(a + b); // expected-error {{operand of type 'int' where floating, complex or a vector of such types is required ('int' invalid)}} + float f = addT<float>(a, b); // expected-note {{in instantiation of function template specialization 'addT<float>' requested here}} + int i = addT<int>(1, 2); // expected-note {{in instantiation of function template specialization 'addT<int>' requested here}} + constexpr float d = 1.0 + 2.0; + constexpr float c = __arithmetic_fence(1.0 + 2.0); + // expected-error@-1 {{constexpr variable 'c' must be initialized by a constant expression}} + return 0; +} Index: clang/test/CodeGen/arithmetic-fence-builtin.c =================================================================== --- /dev/null +++ clang/test/CodeGen/arithmetic-fence-builtin.c @@ -0,0 +1,35 @@ +// Test with fast math +// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm \ +// RUN: -menable-no-infs -menable-no-nans -menable-unsafe-fp-math \ +// RUN: -fno-signed-zeros -mreassociate -freciprocal-math \ +// RUN: -ffp-contract=fast -fno-rounding-math -ffast-math -ffinite-math-only \ +// RUN: -o - %s | FileCheck %s +// +// Test with fast math, showing incomplete implementaton for Complex +// this test fails. +// RUN: %clang_cc1 -triple i386-pc-linux-gnu -emit-llvm \ +// RUN: -menable-no-infs -menable-no-nans -menable-unsafe-fp-math \ +// RUN: -fno-signed-zeros -mreassociate -freciprocal-math \ +// RUN: -ffp-contract=fast -fno-rounding-math -ffast-math -ffinite-math-only \ +// RUN: -o - -DSHOWBUG %s | FileCheck %s +// +// TBD: Add test without fast flags showing llvm intrinsic not created +int v; +int addit(float a, float b) { +//CHECK: define {{.*}} @addit(float %a, float %b) #0 { +#ifdef SHOWBUG + // Assertion fail in clang when try to Emit complex expression + // Complex should be supported. + _Complex double cd, cd1; + cd = __arithmetic_fence(cd1); +#endif + // Vector should be supported. + typedef float __v2f32 __attribute__((__vector_size__(8))); + __v2f32 vec1, vec2; + vec1 = __arithmetic_fence(vec2); + // CHECK: call fast <2 x float> @llvm.arithmetic.fence.v2f32 + + v = __arithmetic_fence(a + b); + // CHECK: call fast float @llvm.arithmetic.fence.f32(float %add) + return 0; +} Index: clang/lib/Sema/SemaChecking.cpp =================================================================== --- clang/lib/Sema/SemaChecking.cpp +++ clang/lib/Sema/SemaChecking.cpp @@ -1555,6 +1555,10 @@ Diag(TheCall->getBeginLoc(), diag::warn_alloca) << TheCall->getDirectCallee(); break; + case Builtin::BI__arithmetic_fence: + if (SemaBuiltinArithmeticFence(TheCall)) + return ExprError(); + break; case Builtin::BI__assume: case Builtin::BI__builtin_assume: if (SemaBuiltinAssume(TheCall)) @@ -6348,6 +6352,38 @@ return false; } +bool Sema::SemaBuiltinArithmeticFence(CallExpr *TheCall) { + Expr *Arg = TheCall->getArg(0); + if (Arg->isInstantiationDependent()) + return false; + +#if 0 + //QualType SrcTy = Arg->getType(); + const auto getComponentTy = [] (const QualType Ty) -> QualType { + if (const VectorType *VT = dyn_cast<VectorType>(Ty)) + return VT->getElementType().getTypePtr(); + if (const ComplexType *CT = dyn_cast<ComplexType>(Ty)) + return CT->getElementType().getTypePtr(); + return Ty; + }; +#endif + const QualType ArgTy = TheCall->getArg(0)->getType(); + bool IsFloating = [&]() { + if (const VectorType *VT = dyn_cast<VectorType>(ArgTy.getCanonicalType())) + return VT->getElementType().getTypePtr()->isFloatingType(); + if (const ComplexType *CT = dyn_cast<ComplexType>(ArgTy.getCanonicalType())) + return CT->getElementType().getTypePtr()->isFloatingType(); + return ArgTy->isFloatingType(); + }(); + if (!IsFloating) + return Diag(TheCall->getEndLoc(), diag::err_typecheck_expect_flt_or_vector) + << ArgTy; + if (checkArgCount(*this, TheCall, 1)) + return true; + TheCall->setType(ArgTy); + return false; +} + /// SemaBuiltinAssume - Handle __assume (MS Extension). // __assume does not evaluate its arguments, and should warn if its argument // has side effects. Index: clang/lib/CodeGen/CGBuiltin.cpp =================================================================== --- clang/lib/CodeGen/CGBuiltin.cpp +++ clang/lib/CodeGen/CGBuiltin.cpp @@ -2833,6 +2833,17 @@ Function *FnAssume = CGM.getIntrinsic(Intrinsic::assume); return RValue::get(Builder.CreateCall(FnAssume, ArgValue)); } + case Builtin::BI__arithmetic_fence: { + Value *ArgValue = EmitScalarExpr(E->getArg(0)); + QualType ArgType = E->getArg(0)->getType(); + auto FMF = Builder.getFastMathFlags(); + if (FMF.allowReassoc() && FMF.noNaNs() && FMF.noInfs() && + FMF.noSignedZeros() && FMF.allowReciprocal() && FMF.allowContract() && + FMF.approxFunc()) + return RValue::get( + Builder.CreateArithmeticFence(ArgValue, ConvertType(ArgType))); + return RValue::get(ArgValue); + } case Builtin::BI__builtin_bswap16: case Builtin::BI__builtin_bswap32: case Builtin::BI__builtin_bswap64: { Index: clang/lib/AST/ExprConstant.cpp =================================================================== --- clang/lib/AST/ExprConstant.cpp +++ clang/lib/AST/ExprConstant.cpp @@ -9303,6 +9303,9 @@ } } + case Builtin::BI__arithmetic_fence: + return false; + default: break; } Index: clang/include/clang/Sema/Sema.h =================================================================== --- clang/include/clang/Sema/Sema.h +++ clang/include/clang/Sema/Sema.h @@ -12485,6 +12485,7 @@ private: bool SemaBuiltinPrefetch(CallExpr *TheCall); bool SemaBuiltinAllocaWithAlign(CallExpr *TheCall); + bool SemaBuiltinArithmeticFence(CallExpr *TheCall); bool SemaBuiltinAssume(CallExpr *TheCall); bool SemaBuiltinAssumeAligned(CallExpr *TheCall); bool SemaBuiltinLongjmp(CallExpr *TheCall); Index: clang/include/clang/Basic/DiagnosticSemaKinds.td =================================================================== --- clang/include/clang/Basic/DiagnosticSemaKinds.td +++ clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -8400,6 +8400,9 @@ "operand of type %0 where arithmetic or pointer type is required">; def err_typecheck_cond_incompatible_operands : Error< "incompatible operand types%diff{ ($ and $)|}0,1">; +def err_typecheck_expect_flt_or_vector : Error< + "operand of type %0 where floating, complex or " + "a vector of such types is required (%0 invalid)">; def err_cast_selector_expr : Error< "cannot type cast @selector expression">; def ext_typecheck_cond_incompatible_pointers : ExtWarn< Index: clang/include/clang/Basic/Builtins.def =================================================================== --- clang/include/clang/Basic/Builtins.def +++ clang/include/clang/Basic/Builtins.def @@ -1651,6 +1651,9 @@ BUILTIN(__builtin_ms_va_end, "vc*&", "n") BUILTIN(__builtin_ms_va_copy, "vc*&c*&", "n") +// Arithmetic Fence: to prevent FP reordering and reassociation optimizations +LANGBUILTIN(__arithmetic_fence, "v.", "t", ALL_LANGUAGES) + #undef BUILTIN #undef LIBBUILTIN #undef LANGBUILTIN
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits