| Field | Value |
|---|---|
| Issue | 123835 |
| Summary | [BOLT] `llvm-bolt` sporadic assertion failure in `BinaryBasicBlock::isCold()` during shrink wrapping with `-O3` |
| Labels | BOLT |
| Assignees | |
| Reporter | ms178 |

**Description:**

While trying to pinpoint another bug, I hit an assertion failure in `llvm-bolt` from a Clang-20git build with assertions enabled. The crash occurs during the shrink-wrapping optimization pass when optimizing a binary (dbus-broker in this case) that was compiled with `-O3`. It did not reproduce when the binary was compiled with `-O2`.

**Environment:**

*   **LLVM Version:** 20.0.0git (e4f03b158c97098e1835cc1f00d0175398974f98) plus [these BOLT patches](https://github.com/ms178/archpkgbuilds/blob/main/toolchain-experimental/llvm-git/fixes.patch) on top.
*   **Operating System:** CachyOS
*   **Target Architecture:** Intel Raptor Lake
*   **Build Type:** Assertion build (compiled with assertions enabled)
*   **Reproducibility:** Intermittent. The crash occurred on the first run, but I cannot reproduce it reliably.

```bash
export CC=clang
export CXX=clang++
export CC_LD=lld
export CXX_LD=lld
export AR=llvm-ar
export NM=llvm-nm
export STRIP=llvm-strip
export OBJCOPY=llvm-objcopy
export OBJDUMP=llvm-objdump
export READELF=llvm-readelf
export RANLIB=llvm-ranlib
export HOSTCC=clang
export HOSTCXX=clang++
export HOSTAR=llvm-ar
export CPPFLAGS="-D_FORTIFY_SOURCE=0"
export CFLAGS="-O3 -march=native -mtune=native -falign-functions=32 -fno-semantic-interposition -fcf-protection=none -mharden-sls=none -w"
export CXXFLAGS="${CFLAGS} -Wp,-U_GLIBCXX_ASSERTIONS"
export LDFLAGS="-Wl,-O3,-Bsymbolic-functions,--as-needed -fcf-protection=none -mharden-sls=none -Wl,--hash-style=gnu -Wl,--undefined-version"
export CCLDFLAGS="$LDFLAGS"
export CXXLDFLAGS="$LDFLAGS"
export ASFLAGS="-D__AVX__=1 -D__AVX2__=1 -D__FMA__=1"
```

**Steps to Reproduce:**

1.  Compile `dbus-broker` (with my optimized [PKGBUILD](https://github.com/ms178/archpkgbuilds/blob/main/packages/dbus-broker-git/PKGBUILD)) with `-O3` and the flags listed above, using an assertion build of Clang.
2.  Generate PGO data (e.g., `perf.fdata`). The necessary steps and flags are integrated into the PKGBUILD linked in step 1.
3.  Run `llvm-bolt` on the linked binary with the following command line (adjust paths as needed):

```bash
llvm-bolt "$srcdir/build/src/dbus-broker" --data "$srcdir/bolt_profile/perf.fdata" --dyno-stats --lite=false --cu-processing-batch-size=64 --eliminate-unreachable --frame-opt=all --icf=all --jump-tables=aggressive --min-branch-clusters --stoke --sctc-mode=always --plt=all --hot-data --hugify --frame-opt-rm-stores --peepholes=all --infer-stale-profile=1 --x86-strip-redundant-address-size --indirect-call-promotion=all --reg-reassign --use-aggr-reg-reassign --reorder-blocks=ext-tsp --reorder-functions=cdsort --split-all-cold --split-eh --split-functions --split-strategy=cdsplit -o "$srcdir/build/bolt/dbus-broker.bolt"
```
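
Because the failure is intermittent, a single run of the command above may finish cleanly. The following is a minimal sketch of a retry helper for measuring how often the crash occurs. The function name `count_failures` is hypothetical and not part of BOLT; it treats any non-zero exit status (including an abort from a failed assertion) as a failure.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of BOLT): run a command N times and
# print how many of those runs exited with a non-zero status.
count_failures() {
    local runs=$1
    shift
    local failures=0
    local i
    for ((i = 1; i <= runs; i++)); do
        # Discard the command's own output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
    done
    echo "$failures"
}
```

Calling `count_failures 20` followed by the full `llvm-bolt` command line from step 3 would print how many of the 20 runs failed. Since the helper discards all output, a failing run should be repeated without it to capture the assertion message and stack trace.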

**Expected Result:**

`llvm-bolt` should successfully optimize the binary without crashing.

**Actual Result:**

`llvm-bolt` crashes with the following assertion failure:

```
llvm-bolt: /home/marcus/toolchain/llvm/llvm-project/bolt/include/bolt/Core/BinaryBasicBlock.h:677: bool llvm::bolt::BinaryBasicBlock::isCold() const: Assertion `Fragment.get() < 2 && "Function is split into more than two (hot/cold)-fragments"' failed.
```

**Stack Trace:**

```
#0 0x00005d009882d785 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) Signals.cpp:0:0
#1 0x00005d009882db7c SignalHandler(int) Signals.cpp:0:0
#2 0x000078c0ae445f50 (/usr/lib/libc.so.6+0x45f50)
#3 0x000078c0ae4b57ed pthread_kill (/usr/lib/libc.so.6+0xb57ed)
#4 0x000078c0ae445e92 raise (/usr/lib/libc.so.6+0x45e92)
#5 0x000078c0ae4244a3 abort (/usr/lib/libc.so.6+0x244a3)
#6 0x000078c0ae4243d8 (/usr/lib/libc.so.6+0x243d8)
#7 0x000078c0ae43bde2 (/usr/lib/libc.so.6+0x3bde2)
#8 0x00005d0099199b65 llvm::bolt::ShrinkWrapping::perform(bool) (/home/marcus/toolchain/llvm/stage2-prof-use-lto/bin/llvm-bolt+0x4599b65)
#9 0x00005d00990d0dd2 std::_Function_handler<void (llvm::bolt::BinaryFunction&, unsigned short), llvm::bolt::FrameOptimizerPass::performShrinkWrapping(llvm::bolt::RegAnalysis const&, llvm::bolt::FrameAnalysis const&, llvm::bolt::BinaryContext&)::$_2>::_M_invoke(std::_Any_data const&, llvm::bolt::BinaryFunction&, unsigned short&&) FrameOptimizer.cpp:0:0
#10 0x00005d009928c3d5 llvm::bolt::ParallelUtilities::runOnEachFunctionWithUniqueAllocId(llvm::bolt::BinaryContext&, llvm::bolt::ParallelUtilities::SchedulingPolicy, std::function<void (llvm::bolt::BinaryFunction&, unsigned short)>, std::function<bool (llvm::bolt::BinaryFunction const&)>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, unsigned int)::$_0::operator()(std::_Rb_tree_iterator<std::pair<unsigned long const, llvm::bolt::BinaryFunction>>, std::_Rb_tree_iterator<std::pair<unsigned long const, llvm::bolt::BinaryFunction>>, unsigned short) const ParallelUtilities.cpp:0:0
#11 0x00005d0098984258 std::_Function_handler<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> (), std::__future_base::_Task_setter<std::unique_ptr<std::__future_base::_Result<void>, std::__future_base::_Result_base::_Deleter>, std::thread::_Invoker<std::tuple<std::function<void ()>>>, void>>::_M_invoke(std::_Any_data const&) DWARFRewriter.cpp:0:0
#12 0x00005d00988bee01 std::__future_base::_State_baseV2::_M_do_set(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*) JITLinkLinker.cpp:0:0
#13 0x000078c0ae4b8ed7 (/usr/lib/libc.so.6+0xb8ed7)
#14 0x000078c0ae4b8f59 __pthread_once (/usr/lib/libc.so.6+0xb8f59)
#15 0x00005d0097798260 void std::call_once<void (std::__future_base::_State_baseV2::*)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*>(std::once_flag&, void (std::__future_base::_State_baseV2::*&&)(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*, bool*), std::__future_base::_State_baseV2*&&, std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>*&&, bool*&&) DWARFRewriter.cpp:0:0
#16 0x00005d00977981b1 std::__future_base::_State_baseV2::_M_set_result(std::function<std::unique_ptr<std::__future_base::_Result_base, std::__future_base::_Result_base::_Deleter> ()>, bool) DWARFRewriter.cpp:0:0
#17 0x00005d009779813a std::__future_base::_Deferred_state<std::thread::_Invoker<std::tuple<std::function<void ()>>>, void>::_M_complete_async() DWARFRewriter.cpp:0:0
#18 0x00005d009778fb0d std::__future_base::_State_baseV2::wait() DWARFRewriter.cpp:0:0
#19 0x00005d009928d4d7 llvm::StdThreadPool::processTasks(llvm::ThreadPoolTaskGroup*) ThreadPool.cpp:0:0
#20 0x00005d009928f5f4 void* llvm::thread::ThreadProxy<std::tuple<llvm::StdThreadPool::grow(int)::$_0>>(void*) ThreadPool.cpp:0:0
#21 0x000078c0ae4b37dd (/usr/lib/libc.so.6+0xb37dd)
#22 0x000078c0ae55f018 (/usr/lib/libc.so.6+0x15f018)
```

**Additional Information:**

*   The crash does not occur with `-O2` in an assertion build.
*   The crash does not occur every time with `-O3`.
*   The issue seems to be related to function splitting and shrink wrapping.
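
One way to test the suspected link to function splitting and shrink wrapping is to repeat the run with a single option removed, for example `--frame-opt=all` or `--split-strategy=cdsplit` from the command line above. The following is a minimal sketch of a helper for building such reduced argument lists. The name `without_flag` is hypothetical and not part of BOLT, and the sketch assumes no argument contains a newline.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of BOLT): print every argument that is not
# exactly equal to the first one, one per line.
# Assumes no argument contains a newline character.
without_flag() {
    local drop=$1
    shift
    local arg
    for arg in "$@"; do
        if [ "$arg" != "$drop" ]; then
            printf '%s\n' "$arg"
        fi
    done
}
```

The match is exact, so dropping `--frame-opt=all` leaves `--frame-opt-rm-stores` in place. Given the intermittent nature of the crash, each reduced command line would need several runs before concluding that the removed option matters.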