Félix Cloutier <[email protected]> Message-ID: In-Reply-To: <llvm.org/llvm/llvm-project/pull/[email protected]>
https://github.com/apple-fcloutier updated https://github.com/llvm/llvm-project/pull/160790

>From e6d5807c8431163ed10dd739d6f105e3c22c756d Mon Sep 17 00:00:00 2001
From: Félix Cloutier <[email protected]>
Date: Mon, 18 Nov 2024 18:23:48 -0800
Subject: [PATCH 1/4] [clang] Implement -fstrict-bool

``bool`` values are stored as i8 in memory, and it is undefined behavior for
a ``bool`` value to be any value other than 0 or 1. Clang exploits this with
range metadata: ``bool`` load instructions at any optimization level above -O0
are assumed to only have their lowest bit set. This can create memory safety
problems when other bits are set, for instance through ``memcpy``.

This change allows users to configure this behavior in three ways:

* ``-fstrict-bool`` represents the status quo; range metadata is added at
  levels above -O0 and allows the compiler to assume in-memory ``bool`` values
  are always either 0 or 1.
* ``-fno-strict-bool[={truncate|nonzero}]`` disables range metadata on ``bool``
  loaded values and offers two ways to interpret the loaded values.
  ``truncate`` means the value is true if the least significant bit is 1 and
  false otherwise; ``nonzero`` means the value is true if any bit is 1 and
  false otherwise.

The default is ``-fstrict-bool`` to not change the current behavior of Clang.
The default behavior of ``-fno-strict-bool`` is ``truncate``.
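For illustration only, the two non-strict interpretations described above can be modeled in portable C. This is a minimal sketch, not code from the patch: the helper names `bool_from_byte_truncate`, `bool_from_byte_nonzero`, `storage_byte` and `check_examples` are hypothetical, and the storage byte is read through ``unsigned char`` so that no invalid ``bool`` object is ever loaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helpers (not part of the patch) that model how each
 * -fno-strict-bool mode interprets the byte backing a bool. */

/* Models -fno-strict-bool=truncate: only the least significant bit counts. */
static bool bool_from_byte_truncate(unsigned char byte) {
    return (byte & 1u) != 0;
}

/* Models -fno-strict-bool=nonzero: any set bit makes the value true. */
static bool bool_from_byte_nonzero(unsigned char byte) {
    return byte != 0;
}

/* Copies out one storage byte without ever loading it as a bool. */
static unsigned char storage_byte(const void *slot) {
    unsigned char byte;
    memcpy(&byte, slot, sizeof byte);
    return byte;
}

/* Small self-check covering the cases where the two modes agree and differ. */
static void check_examples(void) {
    /* 2 stands in for a bit pattern that a stray memcpy could leave behind. */
    unsigned char slot = 2;
    unsigned char byte = storage_byte(&slot);

    assert(bool_from_byte_truncate(byte) == false); /* bit 0 is clear */
    assert(bool_from_byte_nonzero(byte) == true);   /* some bit is set */

    /* Both modes agree on the only two valid patterns, 0 and 1. */
    assert(bool_from_byte_truncate(0) == bool_from_byte_nonzero(0));
    assert(bool_from_byte_truncate(1) == bool_from_byte_nonzero(1));
}
```

The two modes only disagree on even, nonzero bytes such as 2, which is exactly the class of values that strict mode treats as undefined behavior.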
Radar-ID: 139397212 --- clang/docs/ReleaseNotes.rst | 1 + clang/docs/UsersManual.rst | 18 +++++++++++++++ clang/include/clang/Basic/CodeGenOptions.def | 1 + clang/include/clang/Basic/CodeGenOptions.h | 10 ++++++++ clang/include/clang/Options/Options.td | 21 +++++++++++++++++ clang/lib/CodeGen/CGExpr.cpp | 24 +++++++++++++++----- clang/lib/Driver/ToolChains/Clang.cpp | 22 ++++++++++++++++++ clang/test/CodeGen/strict-bool.c | 23 +++++++++++++++++++ clang/test/Driver/strict-bool.c | 22 ++++++++++++++++++ 9 files changed, 136 insertions(+), 6 deletions(-) create mode 100644 clang/test/CodeGen/strict-bool.c create mode 100644 clang/test/Driver/strict-bool.c diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index c80e060f6b7d2..221adf8d86f55 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -328,6 +328,7 @@ New Compiler Flags - New option ``-fsanitize-debug-trap-reasons=`` added to control emitting trap reasons into the debug info when compiling with trapping UBSan (e.g. ``-fsanitize-trap=undefined``). - New options for enabling allocation token instrumentation: ``-fsanitize=alloc-token``, ``-falloc-token-max=``, ``-fsanitize-alloc-token-fast-abi``, ``-fsanitize-alloc-token-extended``. - The ``-resource-dir`` option is now displayed in the list of options shown by ``--help``. +- New option ``-f[no-]strict-bool`` added to control whether Clang can assume that ``bool`` values loaded from memory cannot have a bit pattern other than 0 or 1. Lanai Support ^^^^^^^^^^^^^^ diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index d267eec9425b3..cc389bf283f71 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -2312,6 +2312,24 @@ are listed below. additional function arity information (for supported targets). See :doc:`ControlFlowIntegrity` for more details. +.. option:: -fstrict-bool + + ``bool`` values are stored to memory as 8-bit values on most targets. 
Under + ``-fstrict-bool``, it is undefined behavior for a ``bool`` value stored in + memory to have any other bit pattern than 0 or 1. This creates some + optimization opportunities for the compiler, but it enables memory + corruption if that assumption is violated, for instance if any other value + is ``memcpy``ed over a ``bool``. This is enabled by default. + +.. option:: -fno-strict-bool[={truncate|nonzero}] + + Disable optimizations based on the assumption that all ``bool`` values, + which are typically represented as 8-bit integers in memory, only ever + contain bit patterns 0 or 1. When =truncate is specified, a ``bool`` is + true if its least significant bit is set, and false otherwise. When =nonzero + is specified, a ``bool`` is true when any bit is set, and false otherwise. + The default is =truncate. + .. option:: -fstrict-vtable-pointers Enable optimizations based on the strict rules for overwriting polymorphic diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index 52360b67b306c..9f13a44782a1a 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -317,6 +317,7 @@ CODEGENOPT(SoftFloat , 1, 0, Benign) ///< -soft-float. CODEGENOPT(SpeculativeLoadHardening, 1, 0, Benign) ///< Enable speculative load hardening. CODEGENOPT(FineGrainedBitfieldAccesses, 1, 0, Benign) ///< Enable fine-grained bitfield accesses. CODEGENOPT(StrictEnums , 1, 0, Benign) ///< Optimize based on strict enum definition. +ENUM_CODEGENOPT(LoadBoolFromMem, BoolFromMem, 2, BoolFromMem::Strict, Benign) ///> Optimize based on in-memory bool values being 0 or 1. CODEGENOPT(StrictVTablePointers, 1, 0, Benign) ///< Optimize based on the strict vtable pointers CODEGENOPT(TimePasses , 1, 0, Benign) ///< Set when -ftime-report, -ftime-report=, -ftime-report-json, or -stats-file-timers is enabled. 
CODEGENOPT(TimePassesPerRun , 1, 0, Benign) ///< Set when -ftime-report=per-pass-run is enabled. diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index 6c445253d518b..8d2937c2ed4cc 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -213,6 +213,16 @@ class CodeGenOptions : public CodeGenOptionsBase { ///< larger debug info than `Basic`. }; + enum BoolFromMem { + Strict, ///< In-memory bool values are assumed to be 0 or 1, and any other + ///< value is UB. + Truncate, ///< Convert in-memory bools to i1 by checking if the least + ///< significant bit is 1. + NonZero, ///< Convert in-memory bools to i1 by checking if any bit is set + ///< to 1. + NonStrictDefault = Truncate + }; + /// The code model to use (-mcmodel). std::string CodeModel; diff --git a/clang/include/clang/Options/Options.td b/clang/include/clang/Options/Options.td index 786acd6abbd21..ffb675265121a 100644 --- a/clang/include/clang/Options/Options.td +++ b/clang/include/clang/Options/Options.td @@ -4188,6 +4188,27 @@ def fno_debug_macro : Flag<["-"], "fno-debug-macro">, Group<f_Group>, def fstrict_aliasing : Flag<["-"], "fstrict-aliasing">, Group<f_Group>, Visibility<[ClangOption, CLOption, DXCOption]>, HelpText<"Enable optimizations based on strict aliasing rules">; +def fstrict_bool : Flag<["-"], "fstrict-bool">, Group<f_Group>, + Visibility<[ClangOption]>, + HelpText<"Enable optimizations based on bool bit patterns never being " + "anything other than 0 or 1">; +def fno_strict_bool : Flag<["-"], "fno-strict-bool">, Group<f_Group>, + Visibility<[ClangOption]>, + HelpText<"Disable optimizations based on bool bit patterns never being " + "anything other than 0 or 1">; +def fno_strict_bool_EQ : Joined<["-"], "fno-strict-bool=">, Group<f_Group>, + Visibility<[ClangOption]>, + HelpText<"Disable optimizations based on bool bit patterns never being " + "anything other than 0 or 1, specifying a 
conversion behavior.">, + Values<"truncate,nonzero">; +def load_bool_from_mem : Joined<["-"], "load-bool-from-mem=">, Group<f_Group>, + Visibility<[CC1Option]>, + HelpText<"Specify how to convert a multi-bit bool loaded from memory to a " + "1-bit value">, + NormalizedValuesScope<"CodeGenOptions::BoolFromMem">, + Values<"strict,nonstrict,truncate,nonzero">, + NormalizedValues<["Strict", "NonStrictDefault", "Truncate", "NonZero"]>, + MarshallingInfoEnum<CodeGenOpts<"LoadBoolFromMem">, "Strict">; def fstrict_enums : Flag<["-"], "fstrict-enums">, Group<f_Group>, Visibility<[ClangOption, CC1Option]>, HelpText<"Enable optimizations based on the strict definition of an enum's " diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 712bec62f0a68..166daef0ae3ec 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -2018,9 +2018,9 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(LValue lvalue, lvalue.getTBAAInfo(), lvalue.isNontemporal()); } -static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, - llvm::APInt &Min, llvm::APInt &End, - bool StrictEnums, bool IsBool) { +static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, llvm::APInt &Min, + llvm::APInt &End, bool StrictEnums, bool StrictBool, + bool IsBool) { const auto *ED = Ty->getAsEnumDecl(); bool IsRegularCPlusPlusEnum = CGF.getLangOpts().CPlusPlus && StrictEnums && ED && !ED->isFixed(); @@ -2028,6 +2028,8 @@ static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, return false; if (IsBool) { + if (!StrictBool) + return false; Min = llvm::APInt(CGF.getContext().getTypeSize(Ty), 0); End = llvm::APInt(CGF.getContext().getTypeSize(Ty), 2); } else { @@ -2038,7 +2040,10 @@ static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, llvm::MDNode *CodeGenFunction::getRangeForLoadFromType(QualType Ty) { llvm::APInt Min, End; + bool IsStrictBool = + CGM.getCodeGenOpts().getLoadBoolFromMem() == CodeGenOptions::Strict; if (!getRangeForType(*this, Ty, 
Min, End, CGM.getCodeGenOpts().StrictEnums, + IsStrictBool, Ty->hasBooleanRepresentation() && !Ty->isVectorType())) return nullptr; @@ -2086,7 +2091,8 @@ bool CodeGenFunction::EmitScalarRangeCheck(llvm::Value *Value, QualType Ty, return false; llvm::APInt Min, End; - if (!getRangeForType(*this, Ty, Min, End, /*StrictEnums=*/true, IsBool)) + if (!getRangeForType(*this, Ty, Min, End, /*StrictEnums=*/true, + /*StrictBool=*/true, IsBool)) return true; SanitizerKind::SanitizerOrdinal Kind = @@ -2236,8 +2242,14 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) { llvm::Type *ResTy = ConvertType(Ty); if (Ty->hasBooleanRepresentation() || Ty->isBitIntType() || - Ty->isExtVectorBoolType()) - return Builder.CreateTrunc(Value, ResTy, "loadedv"); + Ty->isExtVectorBoolType()) { + if (CGM.getCodeGenOpts().getLoadBoolFromMem() == CodeGenOptions::NonZero) { + auto *NonZero = Builder.CreateICmpNE( + Value, llvm::Constant::getNullValue(Value->getType()), "loadedv.nz"); + return Builder.CreateIntCast(NonZero, ResTy, false, "loadedv"); + } else + return Builder.CreateTrunc(Value, ResTy, "loadedv"); + } return Value; } diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index c5d40c9825fab..ab521f7ccde7a 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -5739,6 +5739,28 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, if (!Args.hasFlag(options::OPT_fstruct_path_tbaa, options::OPT_fno_struct_path_tbaa, true)) CmdArgs.push_back("-no-struct-path-tbaa"); + + if (Arg *A = Args.getLastArg(options::OPT_fstrict_bool, + options::OPT_fno_strict_bool, + options::OPT_fno_strict_bool_EQ)) { + StringRef BFM = ""; + if (A->getOption().matches(options::OPT_fstrict_bool)) + BFM = "strict"; + else if (A->getOption().matches(options::OPT_fno_strict_bool)) + BFM = "nonstrict"; + else if (A->getValue() == StringRef("truncate")) + BFM = "truncate"; + else if (A->getValue() == 
StringRef("nonzero")) + BFM = "nonzero"; + else + D.Diag(diag::err_drv_invalid_value) + << A->getAsString(Args) << A->getValue(); + CmdArgs.push_back(Args.MakeArgString("-load-bool-from-mem=" + BFM)); + } else if (KernelOrKext) { + // If unspecified, assume -fno-strict-bool=truncate in the Darwin kernel. + CmdArgs.push_back("-load-bool-from-mem=truncate"); + } + Args.addOptInFlag(CmdArgs, options::OPT_fstrict_enums, options::OPT_fno_strict_enums); Args.addOptOutFlag(CmdArgs, options::OPT_fstrict_return, diff --git a/clang/test/CodeGen/strict-bool.c b/clang/test/CodeGen/strict-bool.c new file mode 100644 index 0000000000000..f95cade48d7aa --- /dev/null +++ b/clang/test/CodeGen/strict-bool.c @@ -0,0 +1,23 @@ +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonstrict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonzero -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-NONZERO + +struct has_bool { + _Bool b; +}; + +int foo(struct has_bool *b) { + // CHECK-STRICT: load i8, {{.*}}, !range ![[RANGE_BOOL:[0-9]+]] + // CHECK-STRICT-NOT: and i8 + + // CHECK-TRUNCATE: [[BOOL:%.+]] = load i8 + // CHECK-TRUNCATE: and i8 [[BOOL]], 1 + + // CHECK-NONZERO: [[BOOL:%.+]] = load i8 + // CHECK-NONZERO: cmp ne i8 [[BOOL]], 0 + return b->b; +} + +// CHECK_STRICT: ![[RANGE_BOOL]] = !{i8 0, i8 2} diff --git a/clang/test/Driver/strict-bool.c b/clang/test/Driver/strict-bool.c new file mode 100644 index 0000000000000..dc1a25872324b --- /dev/null +++ b/clang/test/Driver/strict-bool.c @@ -0,0 
+1,22 @@ +// RUN: %clang -### %s 2>&1 | FileCheck %s --check-prefix=CHECK-NONE +// RUN: %clang -### -fstrict-bool %s 2>&1 | FileCheck %s --check-prefix=CHECK-STRICT +// RUN: %clang -### -fno-strict-bool %s 2>&1 | FileCheck %s --check-prefix=CHECK-NONSTRICT +// RUN: %clang -### -fno-strict-bool=truncate %s 2>&1 | FileCheck %s --check-prefix=CHECK-TRUNCATE +// RUN: %clang -### -fno-strict-bool=nonzero %s 2>&1 | FileCheck %s --check-prefix=CHECK-NONZERO +// RUN: %clang -### -fstrict-bool -fno-strict-bool %s 2>&1 | FileCheck %s --check-prefix=CHECK-NONSTRICT +// RUN: %clang -### -fno-strict-bool -fno-strict-bool=nonzero %s 2>&1 | FileCheck %s --check-prefix=CHECK-NONZERO +// RUN: %clang -### -fno-strict-bool=nonzero -fstrict-bool %s 2>&1 | FileCheck %s --check-prefix=CHECK-STRICT + +// RUN: %clang -### -mkernel %s 2>&1 | FileCheck %s --check-prefix=CHECK-TRUNCATE +// RUN: %clang -### -fapple-kext %s 2>&1 | FileCheck %s --check-prefix=CHECK-TRUNCATE +// RUN: %clang -### -mkernel -fstrict-bool %s 2>&1 | FileCheck %s --check-prefix=CHECK-STRICT +// RUN: %clang -### -fstrict-bool -mkernel %s 2>&1 | FileCheck %s --check-prefix=CHECK-STRICT + +// RUN: not %clang -### -fno-strict-bool=ow-ouch %s 2>&1 | FileCheck %s --check-prefix=CHECK-INVALID + +// CHECK-NONE-NOT: -load-bool-from-mem +// CHECK-STRICT: -load-bool-from-mem=strict +// CHECK-NONSTRICT: -load-bool-from-mem=nonstrict +// CHECK-TRUNCATE: -load-bool-from-mem=truncate +// CHECK-NONZERO: -load-bool-from-mem=nonzero +// CHECK-INVALID: invalid value 'ow-ouch' in '-fno-strict-bool=ow-ouch' >From 80abcc8ece69692e85a5ffb1b53afba05b274f03 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Cloutier?= <[email protected]> Date: Fri, 26 Sep 2025 12:50:10 -0700 Subject: [PATCH 2/4] * CodeGenOptions::BoolFromMem is now an ``enum class``. * Fix ``CodeGenFunction::EmitFromMemory`` breaking BitInt with -fno-strict-bool=nonzero. Add a BitInt test invocation with -fno-strict-bool=nonzero. 
* Add a note above ``getRangeForType`` warning against extending its functionality without introducing -fstrict flags and sanitizer options. * Add UBSan variants of the strict-bool tests showing that UBSan wins. --- clang/include/clang/Basic/CodeGenOptions.h | 2 +- clang/lib/CodeGen/CGExpr.cpp | 31 ++++++++++++++-------- clang/test/CodeGen/ext-int.c | 1 + clang/test/CodeGen/strict-bool.c | 19 ++++++++++--- 4 files changed, 38 insertions(+), 15 deletions(-) diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index 8d2937c2ed4cc..1ad80e537d334 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -213,7 +213,7 @@ class CodeGenOptions : public CodeGenOptionsBase { ///< larger debug info than `Basic`. }; - enum BoolFromMem { + enum class BoolFromMem { Strict, ///< In-memory bool values are assumed to be 0 or 1, and any other ///< value is UB. Truncate, ///< Convert in-memory bools to i1 by checking if the least diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 166daef0ae3ec..31996d3d9feaf 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -2018,6 +2018,13 @@ llvm::Value *CodeGenFunction::EmitLoadOfScalar(LValue lvalue, lvalue.getTBAAInfo(), lvalue.isNontemporal()); } +// XXX: safety first! This method SHOULD NOT be extended to support additional +// types, like BitInt types, without an opt-in bool controlled by a +// CodeGenOptions setting (like -fstrict-bool) and a new UBSan check (like +// SanitizerKind::Bool) as breaking that assumption would lead to memory +// corruption. See link for examples of how having a bool that has a value +// different from 0 or 1 in memory can lead to memory corruption. 
+// https://discourse.llvm.org/t/defining-what-happens-when-a-bool-isn-t-0-or-1/86778 static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, llvm::APInt &Min, llvm::APInt &End, bool StrictEnums, bool StrictBool, bool IsBool) { @@ -2040,8 +2047,8 @@ static bool getRangeForType(CodeGenFunction &CGF, QualType Ty, llvm::APInt &Min, llvm::MDNode *CodeGenFunction::getRangeForLoadFromType(QualType Ty) { llvm::APInt Min, End; - bool IsStrictBool = - CGM.getCodeGenOpts().getLoadBoolFromMem() == CodeGenOptions::Strict; + bool IsStrictBool = CGM.getCodeGenOpts().getLoadBoolFromMem() == + CodeGenOptions::BoolFromMem::Strict; if (!getRangeForType(*this, Ty, Min, End, CGM.getCodeGenOpts().StrictEnums, IsStrictBool, Ty->hasBooleanRepresentation() && !Ty->isVectorType())) @@ -2241,15 +2248,17 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) { } llvm::Type *ResTy = ConvertType(Ty); - if (Ty->hasBooleanRepresentation() || Ty->isBitIntType() || - Ty->isExtVectorBoolType()) { - if (CGM.getCodeGenOpts().getLoadBoolFromMem() == CodeGenOptions::NonZero) { - auto *NonZero = Builder.CreateICmpNE( - Value, llvm::Constant::getNullValue(Value->getType()), "loadedv.nz"); - return Builder.CreateIntCast(NonZero, ResTy, false, "loadedv"); - } else - return Builder.CreateTrunc(Value, ResTy, "loadedv"); - } + bool IsBitInt = Ty->isBitIntType(); + bool HasBoolRep = Ty->hasBooleanRepresentation(); + if (HasBoolRep && !IsBitInt && + CGM.getCodeGenOpts().getLoadBoolFromMem() == + CodeGenOptions::BoolFromMem::NonZero) { + auto *NonZero = Builder.CreateICmpNE( + Value, llvm::Constant::getNullValue(Value->getType()), "loadedv.nz"); + return Builder.CreateIntCast(NonZero, ResTy, false, "loadedv"); + } + if (HasBoolRep || IsBitInt || Ty->isExtVectorBoolType()) + return Builder.CreateTrunc(Value, ResTy, "loadedv"); return Value; } diff --git a/clang/test/CodeGen/ext-int.c b/clang/test/CodeGen/ext-int.c index aebacd6f22ffc..157d1990c060e 100644 --- 
a/clang/test/CodeGen/ext-int.c +++ b/clang/test/CodeGen/ext-int.c @@ -1,4 +1,5 @@ // RUN: %clang_cc1 -std=c23 -triple x86_64-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,LIN64 +// RUN: %clang_cc1 -std=c23 -triple x86_64-gnu-linux -O3 -load-bool-from-mem=nonzero -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,LIN64 // RUN: %clang_cc1 -std=c23 -triple x86_64-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,WIN64 // RUN: %clang_cc1 -std=c23 -triple i386-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,LIN32 // RUN: %clang_cc1 -std=c23 -triple i386-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WIN32 diff --git a/clang/test/CodeGen/strict-bool.c b/clang/test/CodeGen/strict-bool.c index f95cade48d7aa..c5dbe2dd0d7df 100644 --- a/clang/test/CodeGen/strict-bool.c +++ b/clang/test/CodeGen/strict-bool.c @@ -3,20 +3,33 @@ // RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonstrict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE // RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE // RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonzero -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-NONZERO +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-UBSAN-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-UBSAN-TRUNCATE struct has_bool { _Bool b; }; int foo(struct has_bool *b) { - // CHECK-STRICT: load i8, {{.*}}, !range ![[RANGE_BOOL:[0-9]+]] - // CHECK-STRICT-NOT: and i8 + // CHECK-STRICT: [[BOOL:%.+]] = 
load i8, ptr {{.+}}, !range ![[RANGE_BOOL:[0-9]+]] + // CHECK-STRICT-NOT: and i8 [[BOOL]], 1 + // CHECK-STRICT-NOT: icmp ne i8 [[BOOL]], 0 + // CHECK-TRUNCATE-NOT: !range // CHECK-TRUNCATE: [[BOOL:%.+]] = load i8 // CHECK-TRUNCATE: and i8 [[BOOL]], 1 + // CHECK-NONZERO-NOT: !range // CHECK-NONZERO: [[BOOL:%.+]] = load i8 - // CHECK-NONZERO: cmp ne i8 [[BOOL]], 0 + // CHECK-NONZERO: icmp ne i8 [[BOOL]], 0 + + // CHECK-UBSAN-STRICT-NOT: !range + // CHECK-UBSAN-STRICT: [[BOOL:%.+]] = load i8, ptr {{.+}} + // CHECK-UBSAN-STRICT: icmp ult i8 [[BOOL]], 2 + + // CHECK-UBSAN-TRUNCATE-NOT: !range + // CHECK-UBSAN-TRUNCATE: [[BOOL:%.+]] = load i8, ptr {{.+}} + // CHECK-UBSAN-TRUNCATE: icmp ult i8 [[BOOL]], 2 return b->b; } >From d6ac5b6b062ef9cfeec4e5f9076f78d313ab2107 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Cloutier?= <[email protected]> Date: Mon, 29 Sep 2025 18:38:30 -0700 Subject: [PATCH 3/4] Review feedback --- clang/docs/UsersManual.rst | 26 +++++++++++++-------- clang/lib/CodeGen/CGExpr.cpp | 11 ++++----- clang/test/CodeGen/ext-int.c | 1 - clang/test/CodeGen/strict-bool.c | 40 ++++++++++++++++++++++++++------ 4 files changed, 53 insertions(+), 25 deletions(-) diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index cc389bf283f71..5a0f8e65a67c4 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -2314,21 +2314,27 @@ are listed below. .. option:: -fstrict-bool - ``bool`` values are stored to memory as 8-bit values on most targets. Under - ``-fstrict-bool``, it is undefined behavior for a ``bool`` value stored in - memory to have any other bit pattern than 0 or 1. This creates some - optimization opportunities for the compiler, but it enables memory - corruption if that assumption is violated, for instance if any other value - is ``memcpy``ed over a ``bool``. This is enabled by default. + ``bool`` values are stored to memory as 8-bit values on most targets. 
C and + C++ specify that it is undefined behavior to put a value other than 0 or 1 + in the storage of a ``bool`` value, and with ``-fstrict-bool``, Clang + leverages this knowledge for optimization opportunities. When this + assumption is violated, for instance if invalid data is ``memcpy``ed over a + ``bool``, the optimized code can lead to memory corruption. + ``-fstrict-bool`` is enabled by default. .. option:: -fno-strict-bool[={truncate|nonzero}] Disable optimizations based on the assumption that all ``bool`` values, which are typically represented as 8-bit integers in memory, only ever - contain bit patterns 0 or 1. When =truncate is specified, a ``bool`` is - true if its least significant bit is set, and false otherwise. When =nonzero - is specified, a ``bool`` is true when any bit is set, and false otherwise. - The default is =truncate. + contain bit patterns 0 or 1. When ``=truncate`` is specified, a ``bool`` is + true if its least significant bit is set, and false otherwise. When + ``=nonzero`` is specified, a ``bool`` is true when any bit is set, and + false otherwise. The default is ``=truncate``, but this could change in + future releases. + + ``-fno-strict-bool`` does not permit Clang to store a value other than 0 or + 1 in a ``bool``: it is a safety net against programmer mistakes, such as + ``memcpy``ing invalid data over a ``bool``. .. 
option:: -fstrict-vtable-pointers diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 31996d3d9feaf..e5fa3268cecd3 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -2248,16 +2248,13 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) { } llvm::Type *ResTy = ConvertType(Ty); - bool IsBitInt = Ty->isBitIntType(); bool HasBoolRep = Ty->hasBooleanRepresentation(); - if (HasBoolRep && !IsBitInt && - CGM.getCodeGenOpts().getLoadBoolFromMem() == + if (HasBoolRep && CGM.getCodeGenOpts().getLoadBoolFromMem() == CodeGenOptions::BoolFromMem::NonZero) { - auto *NonZero = Builder.CreateICmpNE( - Value, llvm::Constant::getNullValue(Value->getType()), "loadedv.nz"); - return Builder.CreateIntCast(NonZero, ResTy, false, "loadedv"); + return Builder.CreateICmpNE( + Value, llvm::Constant::getNullValue(Value->getType()), "loadedv"); } - if (HasBoolRep || IsBitInt || Ty->isExtVectorBoolType()) + if (HasBoolRep || Ty->isBitIntType() || Ty->isExtVectorBoolType()) return Builder.CreateTrunc(Value, ResTy, "loadedv"); return Value; diff --git a/clang/test/CodeGen/ext-int.c b/clang/test/CodeGen/ext-int.c index 157d1990c060e..aebacd6f22ffc 100644 --- a/clang/test/CodeGen/ext-int.c +++ b/clang/test/CodeGen/ext-int.c @@ -1,5 +1,4 @@ // RUN: %clang_cc1 -std=c23 -triple x86_64-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,LIN64 -// RUN: %clang_cc1 -std=c23 -triple x86_64-gnu-linux -O3 -load-bool-from-mem=nonzero -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,LIN64 // RUN: %clang_cc1 -std=c23 -triple x86_64-windows-pc -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CHECK64,WIN64 // RUN: %clang_cc1 -std=c23 -triple i386-gnu-linux -O3 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,LIN32 // RUN: %clang_cc1 -std=c23 -triple i386-windows-pc -O3 
-disable-llvm-passes -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WIN32 diff --git a/clang/test/CodeGen/strict-bool.c b/clang/test/CodeGen/strict-bool.c index c5dbe2dd0d7df..f8894345f46db 100644 --- a/clang/test/CodeGen/strict-bool.c +++ b/clang/test/CodeGen/strict-bool.c @@ -1,15 +1,17 @@ -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-STRICT -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-STRICT -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonstrict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-TRUNCATE -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonzero -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-NONZERO -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-UBSAN-STRICT -// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefix=CHECK-UBSAN-TRUNCATE +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonstrict -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-TRUNCATE +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-TRUNCATE +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -load-bool-from-mem=nonzero -emit-llvm -o - %s | FileCheck %s 
-check-prefixes=CHECK,CHECK-NONZERO +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=strict -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-UBSAN-STRICT +// RUN: %clang_cc1 -triple armv7-apple-darwin -O1 -fsanitize=bool -load-bool-from-mem=truncate -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK,CHECK-UBSAN-TRUNCATE struct has_bool { _Bool b; + unsigned _BitInt(1) c; }; +// CHECK: @foo int foo(struct has_bool *b) { // CHECK-STRICT: [[BOOL:%.+]] = load i8, ptr {{.+}}, !range ![[RANGE_BOOL:[0-9]+]] // CHECK-STRICT-NOT: and i8 [[BOOL]], 1 @@ -33,4 +35,28 @@ int foo(struct has_bool *b) { return b->b; } +// CHECK: @bar +int bar(struct has_bool *c) { + // CHECK-STRICT: [[BITINT:%.+]] = load i8, ptr {{.+}}, !range ![[RANGE_BOOL:[0-9]+]] + // CHECK-STRICT-NOT: and i8 [[BITINT]], 1 + // CHECK-STRICT-NOT: icmp ne i8 [[BITINT]], 0 + + // CHECK-TRUNCATE-NOT: !range + // CHECK-TRUNCATE: [[BITINT:%.+]] = load i8 + // CHECK-TRUNCATE: and i8 [[BITINT]], 1 + + // CHECK-NONZERO-NOT: !range + // CHECK-NONZERO: [[BITINT:%.+]] = load i8 + // CHECK-NONZERO: icmp ne i8 [[BITINT]], 0 + + // CHECK-UBSAN-STRICT-NOT: !range + // CHECK-UBSAN-STRICT: [[BITINT:%.+]] = load i8, ptr {{.+}} + // CHECK-UBSAN-STRICT: icmp ult i8 [[BITINT]], 2 + + // CHECK-UBSAN-TRUNCATE-NOT: !range + // CHECK-UBSAN-TRUNCATE: [[BITINT:%.+]] = load i8, ptr {{.+}} + // CHECK-UBSAN-TRUNCATE: icmp ult i8 [[BITINT]], 2 + return c->c; +} + // CHECK_STRICT: ![[RANGE_BOOL]] = !{i8 0, i8 2} >From aef54c7171ac67bf15e5f8487d2a2ef4afd720e1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?F=C3=A9lix=20Cloutier?= <[email protected]> Date: Mon, 29 Sep 2025 18:40:17 -0700 Subject: [PATCH 4/4] (clang-format) --- clang/lib/CodeGen/CGExpr.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index e5fa3268cecd3..4703cae9e6e03 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ 
-2250,7 +2250,7 @@ llvm::Value *CodeGenFunction::EmitFromMemory(llvm::Value *Value, QualType Ty) { llvm::Type *ResTy = ConvertType(Ty); bool HasBoolRep = Ty->hasBooleanRepresentation(); if (HasBoolRep && CGM.getCodeGenOpts().getLoadBoolFromMem() == - CodeGenOptions::BoolFromMem::NonZero) { + CodeGenOptions::BoolFromMem::NonZero) { return Builder.CreateICmpNE( Value, llvm::Constant::getNullValue(Value->getType()), "loadedv"); } _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
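As a reading aid for the driver change in PATCH 1/4, the flag resolution exercised by clang/test/Driver/strict-bool.c can be sketched as a stand-alone C function. This is a simplified model under stated assumptions, not the driver code: `resolve_load_bool_from_mem`, `NO_STRICT_BOOL_EQ` and `check_driver_examples` are hypothetical names, the `kernel_or_kext` parameter stands in for the driver's `KernelOrKext` state, and an empty string stands in for the invalid-value diagnostic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NO_STRICT_BOOL_EQ "-fno-strict-bool="

/* Hypothetical model of the driver logic: the last of -fstrict-bool,
 * -fno-strict-bool and -fno-strict-bool=<value> wins. Returns the value
 * forwarded as -load-bool-from-mem=<value>, NULL when nothing is forwarded,
 * or "" when the value is invalid (the real driver also emits a diagnostic). */
static const char *resolve_load_bool_from_mem(const char *const *args,
                                              size_t count,
                                              int kernel_or_kext) {
    const size_t prefix_len = sizeof(NO_STRICT_BOOL_EQ) - 1;
    const char *last = NULL;

    for (size_t i = 0; i < count; ++i) {
        if (strcmp(args[i], "-fstrict-bool") == 0 ||
            strcmp(args[i], "-fno-strict-bool") == 0 ||
            strncmp(args[i], NO_STRICT_BOOL_EQ, prefix_len) == 0)
            last = args[i];
    }

    if (last == NULL) /* unspecified: only kernel/kext builds get a default */
        return kernel_or_kext ? "truncate" : NULL;
    if (strcmp(last, "-fstrict-bool") == 0)
        return "strict";
    if (strcmp(last, "-fno-strict-bool") == 0)
        return "nonstrict";

    const char *value = last + prefix_len;
    if (strcmp(value, "truncate") == 0)
        return "truncate";
    if (strcmp(value, "nonzero") == 0)
        return "nonzero";
    return "";
}

/* Mirrors a few RUN lines from the driver test. */
static void check_driver_examples(void) {
    const char *strict_then_no[] = {"-fstrict-bool", "-fno-strict-bool"};
    const char *nonzero_then_strict[] = {"-fno-strict-bool=nonzero",
                                         "-fstrict-bool"};

    assert(strcmp(resolve_load_bool_from_mem(strict_then_no, 2, 0),
                  "nonstrict") == 0);
    assert(strcmp(resolve_load_bool_from_mem(nonzero_then_strict, 2, 0),
                  "strict") == 0);
    assert(resolve_load_bool_from_mem(NULL, 0, 0) == NULL);
    assert(strcmp(resolve_load_bool_from_mem(NULL, 0, 1), "truncate") == 0);
}
```

An explicit flag always overrides the kernel default in this model, which matches the `-mkernel -fstrict-bool` and `-fstrict-bool -mkernel` test invocations both expecting `-load-bool-from-mem=strict`.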
