[llvm-bugs] [Bug 129688] [Flang] position of `-L$LLVM_DIR/lib`
Issue: 129688
Summary: [Flang] position of `-L$LLVM_DIR/lib`
Labels: flang:driver
Assignees:
Reporter: kawashima-fj

When linking, the Flang driver puts `-L$LLVM_DIR/lib -lflang_rt.runtime` after the OS library directory `-L` options.

```console
$ flang -### test.f90 |& tail -1 | tr ' ' '\n' | grep '^"-[Ll]'
"-L/home/foo/llvm/bin/../lib/aarch64-unknown-linux-gnu"
"-L/usr/lib/gcc/aarch64-linux-gnu/13"
"-L/lib/aarch64-linux-gnu"
"-L/usr/lib/aarch64-linux-gnu"
"-L/lib"
"-L/usr/lib"
"-L/home/foo/llvm/lib"
"-lflang_rt.runtime"
"-lm"
"-lgcc"
"-lgcc_s"
"-lc"
"-lgcc"
"-lgcc_s"
```

This causes a problem if you install Flang (and Flang-RT) manually in an environment where the OS-provided Flang package is already installed. When the user-installed Flang is invoked, the `libflang_rt.runtime` in the user-installed directory should be linked. However, the OS-provided `libflang_rt.runtime` in `/lib` (or `/lib/TRIPLE` or `/lib64`, depending on the OS) is linked instead.

https://github.com/llvm/llvm-project/issues/100403 and https://github.com/open-mpi/ompi/issues/13116 suffer from this problem.

Should we prioritize `-L$LLVM_DIR/lib` over `-L/lib`?

___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129723] Using std::hash through `<typeindex>` does not compile for builtin arithmetic types
Issue 129723 Summary Using std::hash through does not compile for builtin arithmetic types Labels new issue Assignees Reporter johnfranklinrickard As far as I am aware (hope I am not mistaken), this should be valid C++ code. ``` #include std::size_t test(int i) { return std::hash()(i); } ``` >From the C++ draft http://wg21.link/N4950: In `[unord.hash]` 2: > Each header that declares the template hash provides enabled specializations of hash for nullptr_t and all cv-unqualified arithmetic, enumeration, and pointer types And in `[type.index.synopsis]` it is mentioned that the header `` also declares the template `std::hash`. In my opinion the msvc STL has it correctly in this case and the code compiles correctly. Both libc++ and libstdc++ currently do not work for this code snippet and reject it with `implicit instantiation of undefined template 'std::hash'`. [Godbolt link](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIAKwAbKSuADJ4DJgAcj4ARpjEIGakAA6oCoRODB7evnppGY4C4ZExLPGJXLaY9kUMQgRMxAQ5Pn5B1bVZDU0EJdFxCUm2jc2teVUKo30RA%2BVDXACUtqhexMjsHOYAzBHI3lgA1CbbbgQAnimYEViqJ9gmGgCCU%2BggIBkAXpgA%2BgSHBEwUwgEX%2BeEWxwA7BZDsRMAQ1gxDq93ggmAoECc3KD7hBFiDFicYSZIQARDjLWicfy8PwcLSkVCcNzWazI1brTDHMzbHikAiaCnLADWSQAHAA6SEATkkgWl20kGjFGn82zF%2Bk4klpgsZnF4ChAGn5guWcFgMEQKFQLBSdAS5EoaFt9sSADdkCkUj83VxpT8DICpj9VIFpDRaIDiIaILFdbEIk1zpw%2BQnmMRzgB5WLaTAOFO8Z1sQSZhi0ZP03hYWJeYBuMS0Q3cKuYFiGYDiSukfBwhx4N1A3WYVR5ryAgvkQQ1XW0PCxYhJjxYXUEYh4FgTgfEWLpTCk1vt2dGU18AzABQANTwmAA7pmrnS%2BfxBCIxOwpDJBIoVOou7oqgYx6mJY1j6HOhqQMsqApHUTa8KgW5rlgEF4p0eZ1C4DDuJ4bRJMkYSzGUFQgBqBSZAI4x%2BGYyRkXU/REUMGp2Oh3TTJReFoX2Ag9M09GDIkTFsTheTUSMvR8fMAnLAoHIbBIlLUjqXZMhwhyhpIhwsAoHqHL60oSoGQL/BAuCECQ3K8osvACpWiwiiAkiQhKYpcGYkjqoEYqymY/gflSHDaqQG7%2BMadIMipBpGiatmkOaVqrAQKRjo6EDOna9DEFErCbKoYqBAAtGGhzAMgyCHBAq5eAwwpWSE%2BBEEhejPsIojiB%2BzXfmour/qQN4LikBYKRwNKkGF8GcJmY5Jf8qBUGpRVaTpekGUwQbGR4LoZRZSzWaaywIJgTB
YIkqH%2BYFwWhbqEW2FFNlaHZpCijyErbNs0rSpCPLStRkgqlw/iahw2xKeF%2BrRfdsWWnFEBIGlropXDGUgOezApBiqAEHwdBRjGcZdmmSYTgTGbZrm%2BbNqQRaMAQpblrq1a1vWtCNhOWBtkYnYMj2LEDk2DLDqO44U6C05drO86LhgmwMqu66bgkO5KPu7MdhEoAxVQZ6Xted4PhOzWvm14ayJ1v4MroySAcYrKWGBsQoVBMFZHBjKIXgyHwNJNQsc4ECuOxVQEaU/H5Ok5HZMJfhVLRWQScRkze1x9RCbkUecXUPEzMHkl6FMvQB2JvGESHSwrGscml/5I1jXqqm5QVRUlWVFXEFVNXlaZDXbbVd1Co9EiSBKQSqvKXBipCKqSG5wRnbwF2jVdYOGsavdmlDSAJdNCM2ulCRZWwnD14VGlN%2BVlXVbVmD1SQ7tNbIhvvsbX5KF1f7DH1TADc2Q3V4vHCTYlMchxZpqTysfYqpUz6twvuVDau9iAWTMD3PapADpHSGKdLUc8AiXWUkvW6KCnpmAlNKLgGhSFcD9JCN6bkNT%2BWBgvPBHBdoxWhuva0m0HQUFSjvV0KArZ/GgcKLGkYEi43jImDMRNJFZhzOhCcVMSxlgrFzTANY6wNibHyNmh5pZVjwL2RwvMhwjmQGOTYfIRb%2BQZOLBcGYlx6P5GuDcFMtyKz3AeDmasTyayYOeK8t57yMH1vfVqj9PzyBfmbHQIBtj6HbCgG2NhxYO0ZE7AQTZ8qvHFicUkKSCDoGAlYSw8E3Ye0gunLImFsKpz0EHOY8dUhhzqAXGOxRi45wTl0biKdcJdJ9sncSHTGl5zGJHXO0w44LGkrJd8P8QbjTrmAoq%2Bx2xQLbhCEy18EE7B2uDPuopJD%2BAlGPMw0px6BDcpQsUNzAbnRwYw0GzCborxPNDDeU1krcMRnvbKh9lkaVWUYdZF9eBXzMo1KoBswkDwiabbqsTer9UGoDX%2BTCAHTWAXNI%2BKyrYgvbhAOBroLLbGQbZfah1jqUCGvckKjzFmRVeeS/ubkSH%2BEkFPbYGgNA8lVJCDQcT6ELNriw%2B6Q0zDCuuqvZYW4MjOEkEAA%3D%3D%3D) I only checked it for ``, but this could affect more headers where `std::hash` is supposed to be defined as mentioned in [cppreference](https://en.cppreference.com/w/cpp/utility/hash). Under the assumption that this is correct C++, would it be possible to fix these includes for the LLVM standard library? ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
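Until the headers are adjusted, one way to sidestep the error is to include `<functional>` as well, since that header provides the `std::hash` specializations for arithmetic types. The sketch below is a reconstruction, not the reporter's exact code: the archived snippet lost its angle-bracket contents, so the header name `<typeindex>` (inferred from the `[type.index.synopsis]` reference) and the template argument `<int>` (inferred from the parameter type) are assumptions.

```cpp
#include <cstddef>
#include <functional>  // workaround: declares std::hash specializations for arithmetic types
#include <typeindex>   // the header the report is about (assumed from [type.index.synopsis])

// Reconstructed from the report; the template argument <int> is an assumption.
std::size_t test(int i) {
    return std::hash<int>()(i);
}
```

With the extra include, the function should compile with either standard library; whether `<typeindex>` alone must be sufficient is the question the report raises.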
[llvm-bugs] [Bug 129726] [Flang] Error due to order of specific procedures in generic interface
Issue 129726 Summary [Flang] Error due to order of specific procedures in generic interface Labels flang Assignees Reporter ivan-pi Given the following module, simply changing the order of the specific procedure in a type-bound method results in an error: ```fortran module collisions implicit none type :: Spaceship contains #ifdef WORKING procedure, private, pass(x) :: collide_x => collide_ss, collide_sa ! Works #else procedure, private, pass(x) :: collide_x => collide_sa, collide_ss ! Breaks #endif procedure, private, pass(y) :: collide_y => collide_as generic :: collide => collide_x, collide_y end type type :: Asteroid end type contains subroutine collide_as(x,y) class(Asteroid) :: x class(Spaceship) :: y print *, "a/s" end subroutine subroutine collide_sa(x,y) class(Spaceship) :: x class(Asteroid) :: y print *, "s/a" end subroutine subroutine collide_ss(x,y) class(Spaceship) :: x class(Spaceship) :: y print *, "s/s" end subroutine end module ``` ``` $ flang-new -c c4.F90 error: Semantic errors in c4.F90 ./c4.F90:12:16: error: Generic 'collide' may not have specific procedures 'collide_x' and 'collide_y' as their interfaces are not distinguishable generic :: collide => collide_x, collide_y ^^^ ./c4.F90:25:12: Procedure 'collide_x' of type 'spaceship' is bound to 'collide_sa' subroutine collide_sa(x,y) ^^ ./c4.F90:20:12: Procedure 'collide_y' of type 'spaceship' is bound to 'collide_as' subroutine collide_as(x,y) ^^ $ flang-new -c -DWORKING c4.F90 $ flang-new --version Homebrew flang-new version 19.1.4 Target: x86_64-apple-darwin23.6.0 Thread model: posix InstalledDir: /usr/local/Cellar/flang/19.1.4/libexec Configuration file: /usr/local/Cellar/flang/19.1.4/libexec/flang.cfg Configuration file: /usr/local/etc/clang/x86_64-apple-darwin23.cfg ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129698] [LLVM] Kaleidoscope compilation fails for chapter 4
Issue: 129698
Summary: [LLVM] Kaleidoscope compilation fails for chapter 4
Labels: new issue
Assignees:
Reporter: nots1dd

Compiling Chapter 4 of Kaleidoscope produces this error:

```bash
untoy.cpp:646:35: error: no member named 'toPtr' in 'llvm::orc::ExecutorSymbolDef'
  646 |       double (*FP)() = ExprSymbol.toPtr<double (*)()>();
      |                        ~~~~~~~~~~ ^
untoy.cpp:646:50: error: expected expression
  646 |       double (*FP)() = ExprSymbol.toPtr<double (*)()>();
      |                                                  ^
untoy.cpp:646:55: error: expected expression
  646 |       double (*FP)() = ExprSymbol.toPtr<double (*)()>();
      |                                                       ^
3 errors generated.
```

The main issue lies in this code snippet:

```cpp
// Search the JIT for the __anon_expr symbol.
auto ExprSymbol = ExitOnErr(TheJIT->lookup("__anon_expr"));

/* ExecutorSymbolDef does not have this method (API change)! */
double (*FP)() = ExprSymbol.toPtr<double (*)()>();
fprintf(stderr, "Evaluated to %f\n", FP());
```

The error persists in later chapters until Chapter 8, where the focus shifts to compiling to object code. This seems to be caused by the tutorial using an outdated ORC API, and I already have a possible fix.
[llvm-bugs] [Bug 129701] [ASAN] `new-delete-type-mismatch` with allocation bigger than the object
Issue: 129701
Summary: [ASAN] `new-delete-type-mismatch` with allocation bigger than the object
Labels: compiler-rt:asan, false-positive
Assignees:
Reporter: firewave

This has been reduced from code in https://github.com/mamedev/mame/blob/master/src/osd/modules/file/posixfile.cpp.

```cpp
#include <memory>

struct entry
{
    const char * name;
};

static std::unique_ptr<entry> osd_stat()
{
    entry *result = reinterpret_cast<entry *>(::operator new(sizeof(*result) + 1));
    return std::unique_ptr<entry>(result);
}

int main()
{
    auto f = osd_stat();
}
```

https://godbolt.org/z/G8Kfz945c

```
==1==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x50200010 in thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   9 bytes;
  size of the deallocated type: 8 bytes.
    #0 0x5b65dbf6a542 in operator delete(void*, unsigned long) /root/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:155:3
    #1 0x5b65dbf6c19b in std::default_delete<entry>::operator()(entry*) const /opt/compiler-explorer/gcc-14.2.0/lib/gcc/x86_64-linux-gnu/14.2.0/../../../../include/c++/14.2.0/bits/unique_ptr.h:93:2
    #2 0x5b65dbf6bebf in std::unique_ptr<entry, std::default_delete<entry>>::~unique_ptr() /opt/compiler-explorer/gcc-14.2.0/lib/gcc/x86_64-linux-gnu/14.2.0/../../../../include/c++/14.2.0/bits/unique_ptr.h:398:4
    #3 0x5b65dbf6bda3 in main /app/example.cpp:18:1
    #4 0x7750ada29d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #5 0x7750ada29e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)
    #6 0x5b65dbe8b354 in _start (/app/output.s+0x2c354)

0x50200010 is located 0 bytes inside of 9-byte region [0x50200010,0x50200019)
allocated by thread T0 here:
    #0 0x5b65dbf698dd in operator new(unsigned long) /root/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0x5b65dbf6be20 in osd_stat() /app/example.cpp:10:47
    #2 0x5b65dbf6bd9a in main /app/example.cpp:17:14
    #3 0x7750ada29d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId: 490fef8403240c91833978d494d39e537409b92e)

SUMMARY: AddressSanitizer: new-delete-type-mismatch /app/example.cpp:18:1 in main
```
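Independent of whether ASan should report here, the size mismatch comes from `std::default_delete` ending up in the sized `operator delete(void*, unsigned long)` with `sizeof(entry)`, while the storage was requested as `sizeof(entry) + 1`. A minimal sketch of one way to avoid that, assuming the over-allocation is intentional: give the `unique_ptr` a deleter that releases the storage with the unsized `::operator delete`. The names `raw_delete` and `entry_ptr` are hypothetical and do not come from the MAME source.

```cpp
#include <memory>
#include <new>

struct entry {
    const char *name;
};

// Hypothetical deleter: destroys the object, then frees the raw storage with
// the unsized ::operator delete, so no object size is passed along.
struct raw_delete {
    void operator()(entry *p) const noexcept {
        p->~entry();
        ::operator delete(p);
    }
};

using entry_ptr = std::unique_ptr<entry, raw_delete>;

entry_ptr osd_stat() {
    // Over-allocate by one byte, as in the reduced test case.
    void *storage = ::operator new(sizeof(entry) + 1);
    // Construct the object in the raw storage instead of casting the pointer.
    entry *result = ::new (storage) entry{nullptr};
    return entry_ptr(result);
}
```

The allocation and deallocation functions now match (`::operator new(size_t)` paired with `::operator delete(void*)`), which is the pairing the sketch relies on.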
[llvm-bugs] [Bug 129705] s390x: widening multiplication does not optimize
Issue: 129705
Summary: s390x: widening multiplication does not optimize
Labels: new issue
Assignees:
Reporter: folkertdev

This LLVM IR (https://godbolt.org/z/cx8adPc9f)

```llvm
define range(i32 0, -131070) <4 x i32> @manual_mule(<8 x i16> %a, <8 x i16> %b) unnamed_addr {
start:
  %0 = shufflevector <8 x i16> %a, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %1 = zext <4 x i16> %0 to <4 x i32>
  %2 = shufflevector <8 x i16> %b, <8 x i16> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %3 = zext <4 x i16> %2 to <4 x i32>
  %4 = mul nuw <4 x i32> %3, %1
  ret <4 x i32> %4
}
```

does not optimize to the expected output of `vec_mule`, a single `vmleh` instruction. The same is true for the other multiplication flavors (low, high, odd).
[llvm-bugs] [Bug 129693] [Clang] Build fails when forward declared static template function used in std::visit
Issue: 129693
Summary: [Clang] Build fails when forward declared static template function used in std::visit
Labels: clang
Assignees:
Reporter: deaklajos

Code:

```cpp
#include <variant>

template <typename T>
static T funcT(const T t);

int main()
{
    std::variant<int, double> v{1};

    return std::visit([&](auto& a) -> int
    {
        return funcT<>(a);
    }, v);
}

template <typename T>
static T funcT(const T t)
{
    return t;
}
```

Output:

```
<source>:4:10: warning: function 'funcT<int>' has internal linkage but is not defined [-Wundefined-internal]
    4 | static T funcT(const T t);
      |          ^
<source>:12:16: note: used here
   12 |         return funcT<>(a);
      |                ^
<source>:4:10: warning: function 'funcT<double>' has internal linkage but is not defined [-Wundefined-internal]
    4 | static T funcT(const T t);
      |          ^
<source>:12:16: note: used here
   12 |         return funcT<>(a);
      |                ^
2 warnings generated.
ASM generation compiler returned: 0
<source>:4:10: warning: function 'funcT<int>' has internal linkage but is not defined [-Wundefined-internal]
    4 | static T funcT(const T t);
      |          ^
<source>:12:16: note: used here
   12 |         return funcT<>(a);
      |                ^
<source>:4:10: warning: function 'funcT<double>' has internal linkage but is not defined [-Wundefined-internal]
    4 | static T funcT(const T t);
      |          ^
<source>:12:16: note: used here
   12 |         return funcT<>(a);
      |                ^
2 warnings generated.
/opt/compiler-explorer/gcc-14.2.0/lib/gcc/x86_64-linux-gnu/14.2.0/../../../../x86_64-linux-gnu/bin/ld: /tmp/example-8f2a9c.o: in function `int main::$_0::operator()<int>(int&) const':
<source>:12:(.text+0x257): undefined reference to `int funcT<int>(int)'
/opt/compiler-explorer/gcc-14.2.0/lib/gcc/x86_64-linux-gnu/14.2.0/../../../../x86_64-linux-gnu/bin/ld: /tmp/example-8f2a9c.o: in function `int main::$_0::operator()<double>(double&) const':
<source>:12:(.text+0x309): undefined reference to `double funcT<double>(double)'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Execution build compiler returned: 1
```

Demo: https://godbolt.org/z/d18dvc56E
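For comparison, a minimal sketch of the same program with the template defined before its first use, so the definition is already visible when the generic lambda is instantiated. This only illustrates a way to avoid the undefined references; it is not a fix for the reported behavior. The template parameter list and the `std::variant<int, double>` arguments are reconstructed from the linker messages, and the function name `visit_first` is hypothetical.

```cpp
#include <variant>

// Definition placed before the first use, instead of a forward declaration
// followed by a later definition as in the report.
template <typename T>
static T funcT(const T t) {
    return t;
}

// Hypothetical wrapper standing in for the report's main().
int visit_first() {
    std::variant<int, double> v{1};
    // The lambda is instantiated for both int& and double&; each instantiation
    // calls a funcT specialization whose definition is visible here.
    return std::visit([](auto &a) -> int { return static_cast<int>(funcT<>(a)); }, v);
}
```

Calling `visit_first()` visits the `int` alternative and returns the stored value.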
[llvm-bugs] [Bug 129707] error: 'vector.insert' op expected position attribute rank + source rank to match dest vector rank
Issue: 129707
Summary: error: 'vector.insert' op expected position attribute rank + source rank to match dest vector rank
Labels: new issue
Assignees:
Reporter: CXiaorong

My goal is to insert two `vector<32xf32>` values into a `vector<64xf32>`, but the verifier keeps reporting the error above, and I don't know the specific reason. Any help would be appreciated.
[llvm-bugs] [Bug 129730] Value of `__cpp_constexpr` is incorrect for `-std=c++20`
Issue: 129730
Summary: Value of `__cpp_constexpr` is incorrect for `-std=c++20`
Labels: new issue
Assignees:
Reporter: elbeno

When compiling with `-std=c++20`, clang defines `__cpp_constexpr` to be `201907L`, but it should be `202002L`.

https://godbolt.org/z/G6v7EvEfe

Clearly clang has support for https://wg21.link/P1330, so the value of `__cpp_constexpr` in C++20 should be `202002L`, as given by https://wg21.link/p2493 and https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations#__cpp_constexpr
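A small sketch of how code might observe the value in question. The helper names `constexpr_feature_level` and `reports_cxx20_constexpr_value` are hypothetical; the threshold `202002L` is the value the report says C++20 mode should advertise.

```cpp
// Returns the value the compiler reports for __cpp_constexpr,
// or 0 if the macro is not defined at all.
constexpr long constexpr_feature_level() {
#ifdef __cpp_constexpr
    return __cpp_constexpr;
#else
    return 0L;
#endif
}

// Hypothetical helper: true once the compiler advertises at least the value
// the report expects for -std=c++20.
constexpr bool reports_cxx20_constexpr_value() {
    return constexpr_feature_level() >= 202002L;
}
```

Code that gates a feature on `__cpp_constexpr >= 202002L` would take the fallback path on a compiler that supports the feature but still reports `201907L`, which is the practical consequence of the mismatch.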
[llvm-bugs] [Bug 129764] `-fzero-call-used-regs` should not trigger before tail-calls
Issue: 129764
Summary: `-fzero-call-used-regs` should not trigger before tail-calls
Labels: new issue
Assignees:
Reporter: nelhage

I believe that `-fzero-call-used-regs` should be modified to not clear registers prior to a tail call. Here's my reasoning:

With the landing of `clang::musttail`, there's been a bit of a trend towards using indirect tail calls to implement efficient interpreters and parsers; see [the original post about protobuf][proto-tc], and [CPython's recent new interpreter][python-tc].

This pattern is, in part, an alternative to using computed gotos to implement dispatch within a single large interpreter function. In both cases (computed gotos, and indirect tail calls), the opcode/parser definition generates fairly similar code, ending with an indirect call through a dispatch table. Depending on compiler choices, this turns into (on x86) something like `jmpq *%REG` or `jmpq *(%REG1, %REG2, 8)`.

With `-fzero-call-used-regs` enabled, clang/LLVM currently emit call-used-clearing `xor`s prior to the indirect tail-call, but not prior to a computed goto, even one that produces near-identical machine code ([example on godbolt](https://godbolt.org/z/dxh754E49), showing the stylized core of an interpreter loop).

Such interpreter loops tend to be extreme hot spots. On CPython, I've measured the cost of `-fzero-call-used-regs=used-gpr` on **only** the opcode functions at about 2% on [the pyperformance suite](https://github.com/python/pyperformance/), when using the tail-call interpreter. It seems surprising and "unfair" to impose this cost on the tail-call style but not the computed goto style of interpreter, when, again, they emit very similar machine code containing similar indirect jumps (and potential JOP gadgets).

Also, GCC's implementation behaves in the way I describe, eliding the clearing for tail calls. See a [godbolt example](https://godbolt.org/z/3KTYzWoWb) -- if you remove the `clang::musttail` and add `-fno-optimize-sibling-calls` to the GCC options, the `xor`s will reappear.

[proto-tc]: https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html
[python-tc]: https://github.com/python/cpython/pull/128718
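For readers unfamiliar with the pattern, here is a minimal sketch of a tail-call-dispatched interpreter of the kind the report describes. It is not taken from CPython or protobuf: the opcodes, handler names, and the `MUSTTAIL` macro are all hypothetical, and the macro falls back to an ordinary call on compilers without `[[clang::musttail]]`.

```cpp
#include <cstdint>

// Use [[clang::musttail]] where available; otherwise fall back to a plain
// call, which the optimizer may or may not turn into a tail call.
#if defined(__clang__) && defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Hypothetical three-instruction bytecode.
enum Op : std::uint8_t { OP_INC, OP_DOUBLE, OP_HALT };

// Every handler shares one signature so each can tail-call the next.
using Handler = std::int64_t (*)(const std::uint8_t *pc, std::int64_t acc);

std::int64_t op_inc(const std::uint8_t *pc, std::int64_t acc);
std::int64_t op_double(const std::uint8_t *pc, std::int64_t acc);
std::int64_t op_halt(const std::uint8_t *pc, std::int64_t acc);

// Dispatch table indexed by opcode.
static const Handler kDispatch[] = {op_inc, op_double, op_halt};

std::int64_t op_inc(const std::uint8_t *pc, std::int64_t acc) {
    ++pc;  // advance to the next opcode
    // Indirect tail call through the table: this is the kind of jump the
    // report says is preceded by register-clearing xors.
    MUSTTAIL return kDispatch[*pc](pc, acc + 1);
}

std::int64_t op_double(const std::uint8_t *pc, std::int64_t acc) {
    ++pc;
    MUSTTAIL return kDispatch[*pc](pc, acc * 2);
}

std::int64_t op_halt(const std::uint8_t *, std::int64_t acc) {
    return acc;  // end of program: hand the accumulator back to the caller
}

// Entry point: dispatch on the first opcode.
std::int64_t run(const std::uint8_t *program, std::int64_t initial) {
    return kDispatch[program[0]](program, initial);
}
```

Each handler ends in an indirect jump through `kDispatch`, so any per-call overhead added before that jump is paid once per executed opcode.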
[llvm-bugs] [Bug 129740] Assertion `(!R2 || (Kind <= REX2 || Kind == EVEX)) && "invalid setting"
Issue 129740 Summary Assertion `(!R2 || (Kind <= REX2 || Kind == EVEX)) && "invalid setting" Labels new issue Assignees Reporter ashermancinelli ``` > clang++ -march=znver4 -v -O3 -c reduced.ll llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:173: void {anonymous}::X86OpcodePrefixHelper::setR2(unsigned int): Assertion `(!R2 || (Kind <= REX2 || Kind == EVEX)) && "invalid setting"' failed. ``` ``` ;; reduced.ll ; ModuleID = '' target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @extfloat1 = external global float @extfloat2 = external global [220 x [250 x float]] define void @foo(ptr %0, i32 %1, i64 %2, float %3, float %4, ptr %5, i64 %6, i1 %7, ptr %8) { %10 = alloca [0 x [0 x [0 x float]]], i32 0, align 4 %11 = alloca float, i64 %2, align 4 %12 = alloca float, i64 %2, align 4 call void @bar(ptr %10) br label %13 13: ; preds = %40, %9 br label %14 14: ; preds = %35, %13 %.027 = phi float [ 0.00e+00, %13 ], [ %.1, %35 ] %15 = phi i32 [ %1, %13 ], [ %19, %35 ] %16 = phi i64 [ %2, %13 ], [ %39, %35 ] %17 = icmp sgt i64 %16, 0 br i1 %17, label %18, label %40 18: ; preds = %14 %19 = add i32 %15, 1 %20 = sext i32 %15 to i64 %21 = getelementptr float, ptr %11, i64 %20 %22 = load float, ptr %21, align 4 %23 = sext i32 %19 to i64 %24 = getelementptr float, ptr %11, i64 %23 store float %22, ptr %24, align 4 call void @baz(ptr %24, ptr %10, ptr null) %25 = load float, ptr %10, align 4 %26 = getelementptr float, ptr %5, i64 %23 %27 = load float, ptr %26, align 4 %28 = getelementptr float, ptr null, i64 %23 store float 0.00e+00, ptr %28, align 4 br i1 %7, label %29, label %35 29: ; preds = %18 %30 = fadd float %3, %27 %31 = fmul float %25, %30 %32 = fdiv arcp float %31, %3 %33 = fmul float %32, %4 %34 = fadd reassoc float %.027, %33 br label %35 35: ; preds = %29, %18 %.1 = phi float [ %34, %29 ], [ %.027, %18 ] %36 = getelementptr float, ptr %8, i64 %23 %37 = getelementptr float, 
ptr %12, i64 %6 call void @qux(ptr %0, ptr %36, ptr %37) %38 = load float, ptr %12, align 4 store float %38, ptr %11, align 4 %39 = add i64 %16, -1 br label %14 40: ; preds = %14 %41 = fcmp ogt float %.027, 0.00e+00 br i1 %41, label %42, label %13 42: ; preds = %40 ret void } declare void @bar(ptr) define void @baz(ptr %0, ptr %1, ptr %extfloat2) { %3 = load float, ptr null, align 4 %4 = call float @llvm.trunc.f32(float %3) %5 = fsub float 0.00e+00, %4 %6 = load float, ptr %0, align 4 %7 = load float, ptr @extfloat1, align 4 %8 = fmul float %6, %7 %9 = fptosi float %6 to i32 %10 = add i32 %9, 1 %11 = load float, ptr @extfloat2, align 4 %12 = load float, ptr getelementptr (i8, ptr @extfloat2, i64 -4), align 4 %13 = sext i32 %10 to i64 %14 = getelementptr float, ptr %extfloat2, i64 %13 %15 = getelementptr i8, ptr %14, i64 -4 %16 = load float, ptr %15, align 4 %17 = fsub float %12, 1.00e+00 %18 = fmul float %17, %6 %19 = fmul float %6, %5 %20 = fadd float %18, %19 %21 = fadd float %12, %16 %22 = fsub float %11, %21 %23 = fadd float %22, 0.00e+00 %24 = fmul float %23, %8 %25 = fmul float %24, 0.00e+00 %26 = fadd float %20, %25 store float %26, ptr %1, align 4 ret void } define void @qux(ptr %0, ptr %1, ptr %2) { %4 = load float, ptr %1, align 4 %5 = load float, ptr %0, align 4 %6 = fdiv ninf arcp float %5, %4 %7 = fptosi float %6 to i32 %8 = add i32 %7, 1 %9 = sext i32 %8 to i64 %10 = getelementptr float, ptr null, i64 %9 %11 = getelementptr i8, ptr %10, i64 -4 %12 = load float, ptr %11, align 4 %13 = fneg float %12 %14 = fmul reassoc nsz float %5, %13 %15 = fdiv ninf arcp contract float %4, %14 %16 = call float @llvm.exp.f32(float %15) %17 = fmul float %16, 0.00e+00 store float %17, ptr %2, align 4 ret void } ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) declare float @llvm.trunc.f32(float) #0 ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) declare float @llvm.exp.f32(float) #0 
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) } ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129778] Lambdas as non‐type template parameters cause link errors (.rodata._ZTAXtl3$_0EE defined in discarded section)
Issue: 129778
Summary: Lambdas as non‐type template parameters cause link errors (.rodata._ZTAXtl3$_0EE defined in discarded section)
Labels: new issue
Assignees:
Reporter: ryanofsky

I am encountering a linker error with Clang 19.1.4 when using lambdas as non‐type template parameters across multiple translation units. The minimal example below compiles without issues using GCC. It also _does not_ trigger the link error in Clang when optimizations (`-O`) are enabled, or if I change the parameter from `const auto&` to `auto` (i.e., pass by value rather than by const reference).

Error:

```c++
`.rodata._ZTAXtl3$_0EE' referenced in section `.text' of pass_b.o: defined in discarded section `.rodata._ZTAXtl3$_0EE[_ZTAXtl3$_0EE]' of pass_b.o
```

(`_ZTAXtl3$_0EE` demangles to `template parameter object for $_0{}`.)

Command to reproduce:

```bash
clang++ -std=c++20 pass_a.cpp pass_b.cpp
```

Example code:

__pass.h__

```c++
#ifndef PASS_H
#define PASS_H

void PassArg(const auto& arg)
{
}

template <auto object>
void PassTemplate()
{
    PassArg(object);
}

#endif
```

__pass_a.cpp__

```c++
#include "pass.h"

constexpr auto fn_a = []{};

void pass_a()
{
    PassTemplate<fn_a>();
}

int main(int, char**)
{
    return 0;
}
```

__pass_b.cpp__

```c++
#include "pass.h"

constexpr auto fn_b = []{};

void pass_b()
{
    PassTemplate<fn_b>();
}
```
[llvm-bugs] [Bug 129783] [clang][BoundsSafety] Extend `-Wvla-potential-size-confusion` for struct fields and bounds annotations
Issue: 129783
Summary: [clang][BoundsSafety] Extend `-Wvla-potential-size-confusion` for struct fields and bounds annotations
Labels: clang:frontend, TBAA, clang:bounds-safety
Assignees: rapidsna
Reporter: rapidsna

https://github.com/llvm/llvm-project/pull/129772

`-Wvla-potential-size-confusion` diagnoses when `n` references the file scope variable and not the parameter.

```
int n;
void func(int array[n], int n);
```

We may want to extend it to diagnose the situations mentioned in the PR:

- Diagnosing a similar situation in structures, e.g.,
```C
int n;

struct S {
  int n;
  int array[sizeof(n)]; // Refers to outer n, not member n
};
```
- Diagnosing with constant-size arrays (requires tracking the expression for the constant-size array in the `QualType`), e.g.,
```C
constexpr int n = 12;
void func(int array[n], int n);
```
- Potentially, also diagnosing any ambiguous situations with bounds annotations like the one below (with or without the `-fexperimental-late-parse-attributes` flag):
```
constexpr int n;
struct foo {
  int * ptr __counted_by(n);
  int n;
};
```
[llvm-bugs] [Bug 129716] Suboptimal codegen for vptr load
Issue: 129716
Summary: Suboptimal codegen for vptr load
Labels: new issue
Assignees:
Reporter: apolukhin

Consider the example:

```cpp
#include <new>

struct Empty{};

template <class T>
struct UnionOptional {
  UnionOptional() = default;

  const T* set() { return ::new (&data_.payload) T(); }
  void clear() { data_.payload.~T(); data_.e = {}; }

  union {
    Empty e{};
    T payload;
  } data_;
};

struct A {
  virtual void foo() const;
};

struct B : A {
  void foo() const override;
};

void sample_union() {
  UnionOptional<B> value;
  value.set()->foo();
  value.clear();
  value.set()->foo();
}
```

With -O2 or -O3, clang-19 generates the following assembly:

```
sample_union():
        push    r14
        push    rbx
        push    rax
        mov     r14, qword ptr [rip + vtable for B@GOTPCREL]
        add     r14, 16
        mov     qword ptr [rsp], r14
        mov     rbx, rsp
        mov     rdi, rbx
        call    B::foo() const@PLT
        mov     qword ptr [rsp], r14
        mov     rdi, rbx
        call    B::foo() const@PLT
        add     rsp, 8
        pop     rbx
        pop     r14
        ret
```

However, a more optimal assembly with fewer instructions and less register clobbering could be used:

```
sample_union():
        sub     rsp, 24
        mov     QWORD PTR [rsp+8], OFFSET FLAT:vtable for B+16
        lea     rdi, [rsp+8]
        call    B::foo() const
        lea     rdi, [rsp+8]
        mov     QWORD PTR [rsp+8], OFFSET FLAT:vtable for B+16
        call    B::foo() const
        add     rsp, 24
        ret
```

Godbolt playground: https://godbolt.org/z/T5PzMfz1W
[llvm-bugs] [Bug 129757] [DirectX] Re-evaluate pass ordering for producing correct DXIL module flags
Issue: 129757
Summary: [DirectX] Re-evaluate pass ordering for producing correct DXIL module flags
Labels: new issue
Assignees:
Reporter: Icohedron

According to #120119, the DXIL Shader Flags pass needs to be executed before the DXIL Op Lowering pass in order to simplify its implementation by being able to work directly with DirectX target intrinsics. However, this dependency creates a challenge, as the shader flag analysis is based on instructions that may not exist after the lowering pass.

This issue was discovered with the implementation of the Int64Ops Shader Flags Analysis and the resulting DXIL failing validation by `dxv` due to mismatched flags (https://github.com/llvm/llvm-project/pull/129089#issuecomment-2695570866). The Shader Flags Analysis currently enables the Int64Ops shader flag in the presence of `extractelement` instructions introduced by the Scalarizer pass. These `extractelement` instructions are subsequently removed by the DXIL Op Lowering pass.

Potential Solutions:

1. Perform Shader Flag Analysis before Scalarization: This would ensure that the `extractelement` instructions are not yet introduced, thereby avoiding the need to account for their removal later. However, it may impact the implementation of current and/or future Shader Flag Analyses.
2. Split the Shader Flag Analysis into two stages: one before the DXIL Op Lowering pass and one after. This would also require moving the DXIL Translate Metadata pass so that it follows the later Shader Flag Analysis. Shader Flag Analyses that benefit from the DirectX target intrinsics could be performed before DXIL Op Lowering, and those that don't should be performed after DXIL Op Lowering.
3. Complicate the logic for the Int64Ops Shader Flag Analysis: Detect when an instruction that returns an i64 value or uses i64 operands will be removed by a subsequent DXIL Op Lowering pass. This would probably be very ugly.
[llvm-bugs] [Bug 129745] Several C++ EH tests fail with "terminating due to uncaught exception of type int"
Issue 129745 Summary Several C++ EH tests fail with "terminating due to uncaught exception of type int" Labels backend:Hexagon Assignees androm3da Reporter androm3da Some tests from the llvm-test-suite are failing like below. @quic-akaryaki investigated and found that using `eld` to build shared libraries instead of `lld` addressed this failure and it seems to be due to the absence of `PT_GNU_EH_FRAME` program header. ``` TEST 'test-suite :: SingleSource/Regression/C++/EH/Regression-C++-ctor_dtor_count.test' FAILED *** * /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/tools/timeit --timeout 7200 --limit-core 0 --l imit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /local/mnt/workspace/upstrea m/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/Output/Regression-C++-ctor_dtor_count.test. out --redirect-input /dev/null --chdir /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleS ource/Regression/C++/EH --summary /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource /Regression/C++/EH/Output/Regression-C++-ctor_dtor_count.test.time /local/mnt/workspace/upstream/toolchain_for_hexagon/clang+llvm-21.0. 
0-cross-hexagon-unknown-linux-musl/x86_64-linux-gnu/bin/qemu_wrapper.sh /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-su ite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/Regression-C++-ctor_dtor_count /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/tools/fpcmp /local/mnt/workspace/upstream/tool chain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/Output/Regression-C++-ctor_dtor_count.test.out /l ocal/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/ctor_dtor_count.r eference_output + /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/tools/fpcmp /local/mnt/workspace/upstream/to olchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/Output/Regression-C++-ctor_dtor_count.test.out /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/SingleSource/Regression/C++/EH/ctor_dtor_count .reference_output /local/mnt/workspace/upstream/toolchain_for_hexagon/obj_test-suite_target-hexagon-v79-O2/tools/fpcmp: Comparison failed, textual differ ence between 'l' and 'D' Input 1: libc++abi: terminating due to uncaught exception of type int exit 134 Input 2: Deriv ok! 
``` test failures: ``` test-suite :: SingleSource/Regression/C++/EH/Regression-C++-class_hierarchy.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-ctor_dtor_count-2.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-ctor_dtor_count.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-exception_spec_test.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-function_try_block.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-inlined_cleanup.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-recursive-throw.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-simple_rethrow.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-simple_throw.test test-suite :: SingleSource/Regression/C++/EH/Regression-C++-throw_rethrow_test.test ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129749] [DirectX] Update DXContainer binary format documentation to describe Root Descriptors representation
Issue: 129749
Summary: [DirectX] Update DXContainer binary format documentation to describe Root Descriptors representation
Labels: new issue
Assignees: joaosaffran
Reporter: joaosaffran

Update the https://github.com/llvm/llvm-project/blob/main/llvm/docs/DirectX/DXContainer.rst file to detail the expected binary representation of Root Signature Root Descriptor parameters.
[llvm-bugs] [Bug 129748] Missed optimizations with -fstack-protector-strong
Issue 129748 Summary Missed optimizations with -fstack-protector-strong Labels new issue Assignees Reporter travisdowns

With -fstack-protector-strong, stack canary checks are inserted when addresses of local variables (or function parameters, etc.) escape the function (or escape the fully inlined function). However, the optimization seems poor in this area: the stack checks are inserted even when the only writes to the stack on the hot path are provably safe compiler spills.

For example, consider this case:

```
[[noreturn]] [[gnu::cold]] void rare_function(const int& x, const int& y);

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        rare_function(x, y);
    }
    return x + y;
}
```

This is a reduced test case from a rich assert mechanism, and this pattern is very common: we call `rare_function` very infrequently (in the case of assertions, at most once per process invocation). `x` and `y` are passed in registers, so the hot path could simply be a comparison, a jump to the slow path, and then the return of the value. Instead we get:

```
hot_function(int, int):
        sub     rsp, 24
        mov     rax, qword ptr fs:[40]
        mov     qword ptr [rsp + 16], rax
        mov     dword ptr [rsp + 12], edi
        mov     dword ptr [rsp + 8], esi
        cmp     edi, esi
        jl      .LBB0_1
        mov     rax, qword ptr fs:[40]
        cmp     rax, qword ptr [rsp + 16]
        jne     .LBB0_5
        add     esi, edi
        mov     eax, esi
        add     rsp, 24
        ret
.LBB0_1:
        mov     rax, qword ptr fs:[40]
        cmp     rax, qword ptr [rsp + 16]
        jne     .LBB0_5
        lea     rdi, [rsp + 12]
        lea     rsi, [rsp + 8]
        call    rare_function(int const&, int const&)@PLT
.LBB0_5:
        call    __stack_chk_fail@PLT
```

Note that on the hot path we store the stack cookie, spill the register variables, then do the comparison with registers and immediately load and compare the cookie. But there are no user-controlled or dangerous writes to the stack here at all, only spills to known slots, which are statically known to be disjoint from the cookie location.

If instead `rare_function` takes its parameters by value, the whole function reduces to:

```
hot_function(int, int):
        cmp     edi, esi
        jl      .LBB0_2
        add     esi, edi
        mov     eax, esi
        ret
.LBB0_2:
        push    rax
        call    rare_function(int, int)@PLT
```

The hot path of the by-reference function could look the same!
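The by-value observation suggests a source-level workaround until the compiler does this itself. The following is a minimal sketch, not something proposed in the report: `rare_function_by_value` and `takes_cold_path` are hypothetical names, and `rare_function` is given a throwing body only so the example can run. The idea is that the addresses of `x` and `y` escape only inside a cold, non-inlined trampoline, so the hot function has no address-taken locals of its own. Whether this actually yields the shorter hot path depends on the compiler and flags.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the reporter's rare_function: takes its arguments by
// reference and never returns. The throwing body is an assumption made so
// that this sketch is runnable.
[[noreturn]] void rare_function(const int& x, const int& y) {
    throw std::runtime_error("assertion failed: " + std::to_string(x) +
                             " < " + std::to_string(y));
}

// Hypothetical by-value trampoline. x and y are copies here, so their
// addresses escape only inside this cold, non-inlined frame.
[[noreturn]] [[gnu::noinline]] [[gnu::cold]]
void rare_function_by_value(int x, int y) {
    rare_function(x, y);
}

int hot_function(int x, int y) {
    if (__builtin_expect(x < y, 0)) {
        rare_function_by_value(x, y);
    }
    return x + y;
}

// Small helper that reports whether a call ended up on the cold path.
bool takes_cold_path(int x, int y) {
    try {
        hot_function(x, y);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

The trade-off is an extra call frame on the rare path, which is usually acceptable for assertion-style code.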
[llvm-bugs] [Bug 129746] `-isystem` does not suppress warnings from macros
Issue 129746 Summary `-isystem` does not suppress warnings from macros Labels new issue Assignees Reporter elbeno

Given a header `provoke_warning.hpp`:

```cpp
enum struct E { A, B, C };

namespace N {
using enum E;
}

#define PROVOKE_WARNING using enum E;
```

And a source file `main.cpp`:

```cpp
#include <provoke_warning.hpp>

struct S {
  PROVOKE_WARNING;
};
```

When compiled with: `clang -std=c++17 -isystem . -c main.cpp`

We get:

```console
main.cpp:4:3: warning: using enum declaration is a C++20 extension [-Wc++20-extensions]
    4 |   PROVOKE_WARNING;
      |   ^
./provoke_warning.hpp:7:31: note: expanded from macro 'PROVOKE_WARNING'
    7 | #define PROVOKE_WARNING using enum E;
      |                               ^
```

Notice that the warning we _would_ get from the `using enum` inside the namespace is not emitted (because of `-isystem`). However, the use of the macro still gives a warning, despite the fact that the macro came from a header included under `-isystem`.

I see why this could be difficult to remedy, and I already see the counterarguments: "it's a macro! It's in your code, not the library code!" So it's understandable why this happens, but that doesn't make it correct or expected.

This comes up in real code: https://github.com/catchorg/Catch2/issues/2910
[llvm-bugs] [Bug 129750] Missed optimization: eager spills mess up hot path
Issue 129750 Summary Missed optimization: eager spills mess up hot path Labels new issue Assignees Reporter travisdowns

Consider the following function:

```
[[noreturn]] [[gnu::cold]] void cold_function(const int& x, const int& y);

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        cold_function(x, y);
    }
    return x + y;
}
```

In clang++ this generates the following code at -O3:

```
hot_function(int, int):
        push    rax
        mov     dword ptr [rsp + 4], edi
        mov     dword ptr [rsp], esi
        cmp     edi, esi
        jl      .LBB0_2
        add     esi, edi
        mov     eax, esi
        pop     rcx
        ret
.LBB0_2:
        lea     rdi, [rsp + 4]
        mov     rsi, rsp
        call    cold_function(int const&, int const&)@PLT
```

However, the whole spilling of the in-register variables, and the alignment of the stack frame (`push rax`), could be deferred to the cold branch instead:

```
hot_function(int, int):
        cmp     edi, esi
        jl      .LBB0_2
        add     esi, edi
        mov     eax, esi
        ret
.LBB0_2:
        push    rax
        mov     dword ptr [rsp + 4], edi
        mov     dword ptr [rsp], esi
        lea     rdi, [rsp + 4]
        mov     rsi, rsp
        call    cold_function(int const&, int const&)@PLT
```

This cuts the hot path almost in half and avoids an expensive store-forwarding stall: `pop rcx` reads the qword at `[rsp]`, which was written immediately before in two dword halves during the spill. This causes an expensive (~10ish cycles) stall on all modern big cores I'm aware of.

https://godbolt.org/z/nTvnj4r1r
[llvm-bugs] [Bug 129747] c++23 std::ranges::copy_n advances InputIterator one more time than necessary
Issue 129747 Summary c++23 std::ranges::copy_n advances InputIterator one more time than necessary Labels new issue Assignees Reporter Be3y4uu-K0T

This is a repeat of an old error, but for `copy_n` from ranges. [(old resolved bug)](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50119)

Test program ([godbolt](https://godbolt.org/z/h6cbjqPT7)):

```c++
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

int main() {
    std::istringstream s("1 2 3 4 5");
    std::vector<int> v;
    std::ranges::copy_n(std::istream_iterator<int>(s), 2, std::back_inserter(v));
    std::ranges::copy_n(std::istream_iterator<int>(s), 2, back_inserter(v));
    std::ranges::copy(v, std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}
```

Run: `clang++ -std=c++23 index.cc -o index && ./index`

Actual output: `1 2 4 5`

Expected output: `1 2 3 4`

Environment:

```
% clang++ -v
Homebrew clang version 18.1.8
Target: arm64-apple-darwin24.3.0
Thread model: posix
InstalledDir: /opt/homebrew/bin
```
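To make the expected behaviour concrete, here is a hedged sketch of a copy loop that advances the input iterator only `n - 1` times. `copy_n_minimal` and `read_two_pairs` are hypothetical names used for illustration; this is not the libc++ implementation or a proposed patch, just one way the expected output `1 2 3 4` can be obtained with single-pass iterators.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Copies n elements but increments `first` only between reads, never after
// the last one, so a single-pass source does not lose an element.
template <class InputIt, class Size, class OutputIt>
OutputIt copy_n_minimal(InputIt first, Size n, OutputIt out) {
    if (n <= 0) return out;
    *out++ = *first;
    for (Size i = 1; i < n; ++i) {
        ++first;          // advance only when another element is needed
        *out++ = *first;
    }
    return out;
}

// Mirrors the reporter's test: two successive reads of two ints each from
// the same stream.
std::vector<int> read_two_pairs(const std::string& text) {
    std::istringstream s(text);
    std::vector<int> v;
    copy_n_minimal(std::istream_iterator<int>(s), 2, std::back_inserter(v));
    copy_n_minimal(std::istream_iterator<int>(s), 2, std::back_inserter(v));
    return v;
}
```

With the input `"1 2 3 4 5"`, the second `istream_iterator` picks up at `3` because the first call never performed a trailing increment.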
[llvm-bugs] [Bug 129808] Increased memory consumption in ParentMapContext after clang-19
Issue 129808 Summary Increased memory consumption in ParentMapContext after clang-19 Labels new issue Assignees Reporter michael-jabbour-sonarsource

The memory increase can be observed when enabling any check that uses `ASTContext::getParents` in `clang-tidy`. By plotting the memory consumption when analyzing a sample file, I got a chart showing around a 10x increase in memory consumption in clang-tidy-19 compared to clang-tidy-18 when the number of elements in the array is large enough.

Here is a script that generates the plot:

```python
import matplotlib.pyplot as plt
import subprocess


def generate_cpp_file(file_name, num_elements):
    # Emit a translation unit containing one large array of hex literals.
    elements = [100] * num_elements
    elements_str = ', '.join(map(hex, elements))
    with open(file_name, 'w') as f:
        f.write(f"""
const char large_array[] = {{
    {elements_str}
}};
""")


def measure_memory_consumption_with(clang_tidy_bin, num_elements):
    file_name = f'file_{num_elements}.cpp'
    generate_cpp_file(file_name, num_elements)
    # /usr/bin/time -f %M prints the peak resident set size in KB on stderr.
    process = subprocess.run(['/usr/bin/time', '-f', '%M', clang_tidy_bin,
                              '-checks=readability-magic-numbers', file_name,
                              '--', '-std=c++17'],
                             check=True, capture_output=True)
    memory_kb = int(process.stderr)
    return memory_kb // 1024


def plot_memory_consumption():
    num_elements = [10 ** 1, 10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6,
                    3 * 10 ** 6, 5 * 10 ** 6, 8 * 10 ** 6]
    memory_consumption_17 = [measure_memory_consumption_with('clang-tidy-17', n) for n in num_elements]
    memory_consumption_18 = [measure_memory_consumption_with('clang-tidy-18', n) for n in num_elements]
    memory_consumption_19 = [measure_memory_consumption_with('clang-tidy-19', n) for n in num_elements]
    plt.plot(num_elements, memory_consumption_17, label='clang-tidy-17')
    plt.plot(num_elements, memory_consumption_18, label='clang-tidy-18')
    plt.plot(num_elements, memory_consumption_19, label='clang-tidy-19')
    plt.xlabel('Number of elements')
    plt.ylabel('Memory consumption (MB)')
    plt.legend()
    plt.show()


if __name__ == '__main__':
    plot_memory_consumption()
```

I have installed `clang-tidy` binaries in this test from apt.llvm.org on an Ubuntu machine.
[llvm-bugs] [Bug 129829] UNREACHABLE executed at /root/build/tools/clang/include/clang/Sema/AttrSpellingListIndex.inc:14!
Issue 129829 Summary UNREACHABLE executed at /root/build/tools/clang/include/clang/Sema/AttrSpellingListIndex.inc:14! Labels clang Assignees Reporter bi6c Compiler Explorer: https://godbolt.org/z/x3hMvbd3a ```console Ignored/unknown shouldn't get here UNREACHABLE executed at /root/build/tools/clang/include/clang/Sema/AttrSpellingListIndex.inc:14! PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/clang -gdwarf-4 -g -o /app/output.s -mllvm --x86-asm-syntax=intel -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics 1. :5:68: current parser token ';' #0 0x03e5b938 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x3e5b938) #1 0x03e595f4 llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x3e595f4) #2 0x03da5f28 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0 #3 0x7a77b4042520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #4 0x7a77b40969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc) #5 0x7a77b4042476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476) #6 0x7a77b40287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3) #7 0x03db188a (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x3db188a) #8 0x07c5f33c clang::AttributeCommonInfo::calculateAttributeSpellingListIndex() const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x7c5f33c) #9 0x0743454b clang::AsmLabelAttr::getSpelling() const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x743454b) #10 0x0735c620 clang::FormatASTNodeDiagnosticArgument(clang::DiagnosticsEngine::ArgumentKind, long, llvm::StringRef, llvm::StringRef, llvm::ArrayRef>, llvm::SmallVectorImpl&, void*, llvm::ArrayRef) 
(/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x735c620) #11 0x04090d45 clang::Diagnostic::FormatDiagnostic(char const*, char const*, llvm::SmallVectorImpl&) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x4090d45) #12 0x04b4302f clang::TextDiagnosticPrinter::HandleDiagnostic(clang::DiagnosticsEngine::Level, clang::Diagnostic const&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x4b4302f) #13 0x0409ef1e clang::DiagnosticIDs::EmitDiag(clang::DiagnosticsEngine&, clang::DiagnosticBuilder const&, clang::DiagnosticIDs::Level) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x409ef1e) #14 0x0409f458 clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&, clang::DiagnosticBuilder const&) const (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x409f458) #15 0x040900af clang::DiagnosticsEngine::EmitDiagnostic(clang::DiagnosticBuilder const&, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x40900af) #16 0x0659c164 clang::Sema::EmitDiagnostic(unsigned int, clang::DiagnosticBuilder const&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x659c164) #17 0x06614d48 clang::SemaBase::ImmediateDiagBuilder::~ImmediateDiagBuilder() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x6614d48) #18 0x06589d03 clang::SemaBase::SemaDiagnosticBuilder::~SemaDiagnosticBuilder() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x6589d03) #19 0x067881bd checkNonMultiVersionCompatAttributes(clang::Sema&, clang::FunctionDecl const*, clang::FunctionDecl const*, clang::MultiVersionKind)::'lambda'(clang::Sema&, clang::Attr const*)::operator()(clang::Sema&, clang::Attr const*) const SemaDecl.cpp:0:0 #20 0x067884d1 checkNonMultiVersionCompatAttributes(clang::Sema&, clang::FunctionDecl const*, clang::FunctionDecl const*, clang::MultiVersionKind) SemaDecl.cpp:0:0 #21 0x06788c3f CheckMultiVersionAdditionalRules(clang::Sema&, clang::FunctionDecl const*, clang::FunctionDecl const*, bool, 
clang::MultiVersionKind) SemaDecl.cpp:0:0 #22 0x067c905d CheckMultiVersionFunction(clang::Sema&, clang::FunctionDecl*, bool&, clang::NamedDecl*&, clang::LookupResult&) SemaDecl.cpp:0:0 #23 0x067cae74 clang::Sema::CheckFunctionDeclaration(clang::Scope*, clang::FunctionDecl*, clang::LookupResult&, bool, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67cae74) #24 0x067d13b3 clang::Sema::ActOnFunctionDeclarator(clang::Scope*, clang::Declarator&, clang::DeclContext*, clang::TypeSourceInfo*, clang::LookupResult&, llvm::MutableArrayRef, bool&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67d13b3) #25 0
[llvm-bugs] [Bug 129843] LLVM 20 miscompiles `@llvm.ctpop.i128` for `aarch64_be`
Issue 129843 Summary LLVM 20 miscompiles `@llvm.ctpop.i128` for `aarch64_be` Labels backend:AArch64, regression, miscompilation Assignees Reporter alexrp Consider this Zig program: ```zig pub fn main() void { var x: u128 = 0b00011000110001111111100101010001; _ = &x; @import("std").process.exit(@popCount(x)); } ``` Running it with `qemu-aarch64_be` will produce `24` with LLVM 19, but `0` with LLVM 20. Isolating the `@llvm.ctpop.i128` a bit: ```llvm ; ModuleID = 'BitcodeBuffer' source_filename = "repro" target datalayout = "E-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32" target triple = "aarch64_be-unknown-linux4.19.0-unknown" @builtin.zig_backend = internal unnamed_addr constant i64 2, align 8 @start.simplified_logic = internal unnamed_addr constant i1 false, align 1 @builtin.output_mode = internal unnamed_addr constant i2 -2, align 1 ; Function Attrs: nosanitize_coverage nounwind skipprofile define dso_local i32 @repro() #0 { %1 = alloca [16 x i8], align 16 store i128 71803349708323153, ptr %1, align 16 %2 = load i128, ptr %1, align 16 %3 = call i128 @llvm.ctpop.i128(i128 %2) %4 = trunc i128 %3 to i8 %5 = zext i8 %4 to i32 ret i32 %5 } ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none) declare i128 @llvm.ctpop.i128(i128) #1 attributes #0 = { nosanitize_coverage nounwind skipprofile "frame-pointer"="all" "target-cpu"="generic" 
"target-features"="+enable-select-opt,+ete,+fp-armv8,+fuse-adrp-add,+fuse-aes,+neon,+trbe,+use-postra-scheduler,-addr-lsl-slow-14,-aes,-aggressive-fma,-alternate-sextload-cvt-f32-pattern,-altnzcv,-alu-lsl-fast,-am,-amvs,-arith-bcc-fusion,-arith-cbz-fusion,-ascend-store-address,-avoid-ldapur,-balance-fp-ops,-bf16,-brbe,-bti,-call-saved-x10,-call-saved-x11,-call-saved-x12,-call-saved-x13,-call-saved-x14,-call-saved-x15,-call-saved-x18,-call-saved-x8,-call-saved-x9,-ccdp,-ccidx,-ccpp,-chk,-clrbhb,-cmp-bcc-fusion,-cmpbr,-complxnum,-CONTEXTIDREL2,-cpa,-crc,-crypto,-cssc,-d128,-disable-latency-sched-heuristic,-disable-ldp,-disable-stp,-dit,-dotprod,-ecv,-el2vmsa,-el3,-exynos-cheap-as-move,-f32mm,-f64mm,-f8f16mm,-f8f32mm,-faminmax,-fgt,-fix-cortex-a53-835769,-flagm,-fmv,-force-32bit-jump-tables,-fp16fml,-fp8,-fp8dot2,-fp8dot4,-fp8fma,-fpac,-fprcvt,-fptoint,-fujitsu-monaka,-fullfp16,-fuse-address,-fuse-addsub-2reg-const1,-fuse-arith-logic,-fuse-crypto-eor,-fuse-csel,-fuse-literals,-gcs,-harden-sls-blr,-harden-sls-nocomdat,-harden-sls-retbr,-hbc,-hcx,-i8mm,-ite,-jsconv,-ldp-aligned-only,-lor,-ls64,-lse,-lse128,-lse2,-lsfe,-lsui,-lut,-mec,-mops,-mpam,-mte,-nmi,-no-bti-at-return-twice,-no-neg-immediates,-no-sve-fp-ld1r,-no-zcz-fp,-nv,-occmo,-outline-atomics,-pan,-pan-rwv,-pauth,-pauth-lr,-pcdphint,-perfmon,-pops,-predictable-select-expensive,-predres,-prfm-slc-target,-rand,-ras,-rasv2,-rcpc,-rcpc3,-rcpc-immo,-rdm,-reserve-lr-for-ra,-reserve-x1,-reserve-x10,-reserve-x11,-reserve-x12,-reserve-x13,-reserve-x14,-reserve-x15,-reserve-x18,-reserve-x2,-reserve-x20,-reserve-x21,-reserve-x22,-reserve-x23,-reserve-x24,-reserve-x25,-reserve-x26,-reserve-x27,-reserve-x28,-reserve-x3,-reserve-x4,-reserve-x5,-reserve-x6,-reserve-x7,-reserve-x9,-rme,-sb,-sel2,-sha2,-sha3,-slow-misaligned-128store,-slow-paired-128,-slow-strqro-store,-sm4,-sme,-sme2,-sme2p1,-sme2p2,-sme-b16b16,-sme-f16f16,-sme-f64f64,-sme-f8f16,-sme-f8f32,-sme-fa64,-sme-i16i64,-sme-lutv2,-sme-mop4,-sme-tmop,-spe,-spe-eef,-spec
res2,-specrestrict,-ssbs,-ssve-aes,-ssve-bitperm,-ssve-fp8dot2,-ssve-fp8dot4,-ssve-fp8fma,-store-pair-suppress,-stp-aligned-only,-strict-align,-sve,-sve2,-sve2-aes,-sve2-bitperm,-sve2-sha3,-sve2-sm4,-sve2p1,-sve2p2,-sve-aes,-sve-aes2,-sve-b16b16,-sve-bfscale,-sve-bitperm,-sve-f16f32mm,-tagged-globals,-the,-tlb-rmi,-tlbiw,-tme,-tpidr-el1,-tpidr-el2,-tpidr-el3,-tpidrro-el0,-tracev8.4,-uaops,-use-experimental-zeroing-pseudos,-use-fixed-over-scalable-if-equal-cost,-use-reciprocal-square-root,-v8.1a,-v8.2a,-v8.3a,-v8.4a,-v8.5a,-v8.6a,-v8.7a,-v8.8a,-v8.9a,-v8a,-v8r,-v9.1a,-v9.2a,-v9.3a,-v9.4a,-v9.5a,-v9.6a,-v9a,-vh,-wfxt,-xs,-zcm,-zcz,-zcz-fp-workaround,-zcz-gp" } attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) } !llvm.module.flags = !{} ``` Compiling this with `llc repro.ll -O0` with LLVM 19 and 20 yields this codegen diff: ```diff --- repro.19.s 2025-03-05 08:29:31.485173087 +0100 +++ repro.20.s 2025-03-05 08:29:34.672295525 +0100 @@ -1,5 +1,5 @@ - .text .file "repro" + .text .globl repro // -- Begin function repro .p2align2 .type repro,@function @@ -16,15 +16,16 @@ mov x8, xzr str x8, [sp] ldr x8, [sp, #8] - ldr d1, [sp] -
[llvm-bugs] [Bug 129796] [DirectX] Update Root Signature Binary Representation docs to describe Descriptor tables.
Issue 129796 Summary [DirectX] Update Root Signature Binary Representation docs to describe Descriptor tables. Labels new issue Assignees joaosaffran Reporter joaosaffran Update https://github.com/llvm/llvm-project/blob/main/llvm/docs/DirectX/DXContainer.rst file to detail the expected binary representation of Root Signature Root Descriptor tables parameters.
[llvm-bugs] [Bug 129803] [libc++] `std::variant` introduces padding if a variant member contains a variant
Issue 129803 Summary [libc++] `std::variant` introduces padding if a variant member contains a variant Labels libc++ Assignees Reporter zygoloid [Testcase](https://godbolt.org/z/r7WEoEzTh): ```c++ #include struct A { int x; }; struct B { int y; int z; }; static_assert(sizeof(B) == 8); static_assert(sizeof(std::variant) == 12); struct C { std::variant v; }; static_assert(sizeof(C) == 8); static_assert(sizeof(std::variant) == 16); ``` `variant` ought to be only 12 bytes, but is actually 16 bytes. The reason for this is that `std::variant` derives from `__sfinae_ctor_base<...>` and `__sfinae_assign_base<...>`, and those base classes are the *same* for `std::variant` and for `std::variant`. This prevents the variant's first field (the `__union`) from being put at offset 0 within the variant, because that would mean we have two different `__sfinae_ctor_base<...>` subobjects at the same offset within the same object, and the C++ language rules don't permit that struct layout. The solution is to change `variant` so that it doesn't derive from a class that is, or can be, independent of the `variant`'s template arguments. Perhaps either change the `__sfinae_...` types to use CRTP (even though they don't care what the derived class is), or remove them and rely on getting the special members' properties from the `__impl` type instead. Of course, fixing this will break `std::variant`'s ABI, so it'd need to be done only in the unstable ABI. :( `std::optional` appears to use the same implementation strategy, so I would imagine it has the same deficiency (assuming it puts the `T` first, not the `bool`), but I've not checked. And it looks like `std::tuple` may also suffer from the same issue. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
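The layout rule described in this report can be shown without `std::variant` at all. The sketch below uses hypothetical types (`Tag`, `TagFor`, `Inner`, `Outer`, and their CRTP counterparts); it is not libc++ code. Two subobjects of the same empty type cannot share an address, so a shared empty base forces the nested member away from offset 0, while a CRTP-style base that differs per derived class does not.

```cpp
// Empty base that is the same type for every class deriving from it,
// analogous to a base that does not depend on the derived class.
struct Tag {};

struct Inner : Tag { int x; };
struct Outer : Tag { Inner inner; };

// CRTP variant: each derived class gets its own distinct empty base type.
template <class Derived>
struct TagFor {};

struct InnerCrtp : TagFor<InnerCrtp> { int x; };
struct OuterCrtp : TagFor<OuterCrtp> { InnerCrtp inner; };

// Outer's own Tag base and inner's Tag base must have distinct addresses,
// so `inner` cannot sit at offset 0 and Outer grows beyond Inner.
static_assert(sizeof(Outer) > sizeof(Inner),
              "shared empty base forces padding before the member");
```

On Itanium-ABI targets one would typically expect `sizeof(Inner) == 4`, `sizeof(Outer) == 8`, and `sizeof(OuterCrtp) == sizeof(InnerCrtp) == 4`; those exact numbers are an assumption about the platform, while the `static_assert` above follows from the language rule itself.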
[llvm-bugs] [Bug 129805] Failure to spot `popcount` idiom
Issue 129805 Summary Failure to spot `popcount` idiom Labels missed-optimization Assignees Reporter Kmeakin

LLVM makes a valiant effort at unrolling and vectorizing these loops, but they're really just `popcount`, and it should recognize them as such.

```c++
#include <cstdint>

using u8 = uint8_t;
using u16 = uint16_t;
using u32 = uint32_t;
using u64 = uint64_t;

template <typename T>
auto src(T x) -> u64 {
    u64 count = 0;
    for (u64 i = 0; i < sizeof(T) * 8; i++) {
        if (x & ((u64)1 << i)) {
            count++;
        }
    }
    return count;
}

template <typename T>
auto tgt(T x) -> u64 {
    return __builtin_popcountg(x);
}

extern "C" {
auto src8(u8 x) -> u64 { return src(x); }
auto src16(u8 x) -> u64 { return src(x); }
auto src32(u8 x) -> u64 { return src(x); }
auto src64(u8 x) -> u64 { return src(x); }

auto tgt8(u8 x) -> u64 { return tgt(x); }
auto tgt16(u8 x) -> u64 { return tgt(x); }
auto tgt32(u8 x) -> u64 { return tgt(x); }
auto tgt64(u8 x) -> u64 { return tgt(x); }
}
```
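As a quick sanity check that the bit-test loop and a population-count builtin compute the same value, here is a small sketch. The function names are hypothetical, it uses the older `__builtin_popcountll` (available in GCC and Clang) rather than `__builtin_popcountg`, and it only compares results on sample inputs; it says nothing about what code the optimizer emits.

```cpp
#include <cstdint>

// Bit-by-bit loop in the style of the report's `src`.
std::uint64_t popcount_loop(std::uint64_t x) {
    std::uint64_t count = 0;
    for (unsigned i = 0; i < 64; ++i) {
        if (x & (std::uint64_t{1} << i)) {
            ++count;
        }
    }
    return count;
}

// Direct use of the compiler builtin, in the style of the report's `tgt`.
std::uint64_t popcount_builtin(std::uint64_t x) {
    return static_cast<std::uint64_t>(__builtin_popcountll(x));
}

// Compares both implementations on a deterministic pseudo-random sequence.
bool agree_on_samples() {
    std::uint64_t x = 0;
    for (int i = 0; i < 1000; ++i) {
        // Simple linear congruential step; unsigned wraparound is intended.
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
        if (popcount_loop(x) != popcount_builtin(x)) {
            return false;
        }
    }
    return true;
}
```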
[llvm-bugs] [Bug 129832] Assertion `!isa(static_cast(this)) || cast(static_cast(this))->isLinkageValid()' failed.
Issue 129832 Summary Assertion `!isa(static_cast(this)) || cast(static_cast(this))->isLinkageValid()' failed. Labels new issue Assignees Reporter bi6c Compiler Explorer: https://godbolt.org/z/Tb1YaaPs4 ```console :7:12: warning: #pragma redefine_extname is applicable to external C declarations only; not applied to function 'foo' [-Wpragmas] 7 | static int foo(void); | ^ clang: /root/llvm-project/clang/include/clang/AST/Decl.h:5157: void clang::Redeclarable::setPreviousDecl(decl_type*) [with decl_type = clang::FunctionDecl]: Assertion `!isa(static_cast(this)) || cast(static_cast(this))->isLinkageValid()' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /opt/compiler-explorer/clang-assertions-trunk/bin/clang -gdwarf-4 -g -o /app/output.s -mllvm --x86-asm-syntax=intel -fno-verbose-asm -S --gcc-toolchain=/opt/compiler-explorer/gcc-snapshot -fcolor-diagnostics -fno-crash-diagnostics 1. 
:8:21: current parser token ';' #0 0x03e53898 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x3e53898) #1 0x03e51554 llvm::sys::CleanupOnSignal(unsigned long) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x3e51554) #2 0x03d9de88 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0 #3 0x7d7f66c42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #4 0x7d7f66c969fc pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x969fc) #5 0x7d7f66c42476 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42476) #6 0x7d7f66c287f3 abort (/lib/x86_64-linux-gnu/libc.so.6+0x287f3) #7 0x7d7f66c2871b (/lib/x86_64-linux-gnu/libc.so.6+0x2871b) #8 0x7d7f66c39e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96) #9 0x073739ac clang::Redeclarable::setPreviousDecl(clang::FunctionDecl*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x73739ac) #10 0x074ee3d5 clang::FunctionDecl::setPreviousDeclaration(clang::FunctionDecl*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x74ee3d5) #11 0x067b7420 clang::Sema::CheckFunctionDeclaration(clang::Scope*, clang::FunctionDecl*, clang::LookupResult&, bool, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67b7420) #12 0x067bc5f0 clang::Sema::ActOnFunctionDeclarator(clang::Scope*, clang::Declarator&, clang::DeclContext*, clang::TypeSourceInfo*, clang::LookupResult&, llvm::MutableArrayRef, bool&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67bc5f0) #13 0x067c1530 clang::Sema::HandleDeclarator(clang::Scope*, clang::Declarator&, llvm::MutableArrayRef) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67c1530) #14 0x067c2070 clang::Sema::ActOnDeclarator(clang::Scope*, clang::Declarator&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x67c2070) #15 0x0642fd1e clang::Parser::ParseDeclarationAfterDeclaratorAndAttributes(clang::Declarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::ForRangeInit*) 
(/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x642fd1e) #16 0x0643f8c9 clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::ParsedAttributes&, clang::Parser::ParsedTemplateInfo&, clang::SourceLocation*, clang::Parser::ForRangeInit*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x643f8c9) #17 0x063ff73e clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec&, clang::AccessSpecifier) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x63ff73e) #18 0x063ffef9 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*, clang::AccessSpecifier) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x63ffef9) #19 0x064076d3 clang::Parser::ParseExternalDeclaration(clang::ParsedAttributes&, clang::ParsedAttributes&, clang::ParsingDeclSpec*) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x64076d3) #20 0x064085ad clang::Parser::ParseTopLevelDecl(clang::OpaquePtr&, clang::Sema::ModuleImportState&) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x64085ad) #21 0x063faa3a clang::ParseAST(clang::Sema&, bool, bool) (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x63faa3a) #22 0x04812598 clang::CodeGenAction::ExecuteAction() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x4812598) #23 0x04ada245 clang::FrontendAction::Execute() (/opt/compiler-explorer/clang-assertions-trunk/bin/clang+0x4ada245) #24 0x04a5d92e clang::C
[llvm-bugs] [Bug 129838] [libc] str_to_float_comparison_test should be hermetic
Issue 129838 Summary [libc] str_to_float_comparison_test should be hermetic Labels libc Assignees Reporter RossComputerGuy The test in question, https://github.com/llvm/llvm-project/blob/main/libc/test/src/__support/str_to_float_comparison_test.cpp, is not hermetic right now. This causes problems for NixOS/nixpkgs where full builds use clang without a libc. Being able to run all tests without needing the host's libc would be very beneficial. Relevant log: ``` libc> [1145/1151] Building CXX object libc/test/src/__support/CMakeFiles/libc_str_to_float_comparison_test.dir/str_to_float_comparison_test.cpp.o libc> FAILED: libc/test/src/__support/CMakeFiles/libc_str_to_float_comparison_test.dir/str_to_float_comparison_test.cpp.o libc> /nix/store/h3wgz6n8bc4n61vv427xl8cz69vcd96c-clang-wrapper-20.1.0-rc3/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_1_0_rc3 -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wno-comment -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=gnu++17 -MD -MT libc/test/src/__support/CMakeFiles/libc_str_to_float_comparison_test.dir/str_to_float_comparison_test.cpp.o -MF libc/test/src/__support/CMakeFiles/libc_str_to_float_comparison_test.dir/str_to_float_comparison_test.cpp.o.d -o libc/test/src/__support/CMakeFiles/libc_str_to_float_comparison_test.dir/str_to_float_comparison_test.cpp.o -c /build/libc-src-20.1.0-rc3/libc/test/src/__support/str_to_float_comparison_test.cpp libc> In file included from /build/libc-src-20.1.0-rc3/libc/test/src/__support/str_to_float_comparison_test.cpp:11: libc> In file included from /nix/store/l71wz2r8ki25kzw33jwssg8rh77xfkpr-gcc-14-20241116/include/c++/14-20241116/stdlib.h:36: libc> In file included from 
/nix/store/l71wz2r8ki25kzw33jwssg8rh77xfkpr-gcc-14-20241116/include/c++/14-20241116/cstdlib:41: libc> In file included from /nix/store/l71wz2r8ki25kzw33jwssg8rh77xfkpr-gcc-14-20241116/include/c++/14-20241116/aarch64-unknown-linux-gnu/bits/c++config.h:680: libc> /nix/store/l71wz2r8ki25kzw33jwssg8rh77xfkpr-gcc-14-20241116/include/c++/14-20241116/aarch64-unknown-linux-gnu/bits/os_defines.h:39:10: fatal error: 'features.h' file not found libc>39 | #include libc> | ^~~~ libc> 1 error generated. libc> [1146/1151] Building CXX object libc/src/stdlib/CMakeFiles/libc.src.stdlib.strfromf.dir/strfromf.cpp.o libc> [1147/1151] Building CXX object libc/src/stdlib/CMakeFiles/libc.src.stdlib.strfromd.dir/strfromd.cpp.o libc> [1148/1151] Building CXX object libc/src/stdio/printf_core/CMakeFiles/libc.src.stdio.printf_core.converter.dir/converter.cpp.o libc> ninja: build stopped: subcommand failed. ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129841] Dead code in MLRegAllocEvictAdvisor.cpp
Issue 129841 Summary Dead code in MLRegAllocEvictAdvisor.cpp Labels mlgo Assignees Reporter abhishek-kaushik22 In [MLRegAllocEvictAdvisor.cpp](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/MLRegAllocEvictAdvisor.cpp#L862-L874), the condition `if (CandidatePos == CandidateVirtRegPos)` is checked twice, but when it is true the first time the function returns, making the second check unnecessary. @boomanaiden154 can you please take a look, because this was introduced in 00f692b94f9aa08ede4aaba6f2aafe17857599c4
[llvm-bugs] [Bug 129842] [asan] failure to detect memory leaks
Issue 129842 Summary [asan] failure to detect memory leaks Labels new issue Assignees Reporter PikachuHyA

Reproducer (see https://godbolt.org/z/6YGdnn634). No leak is reported for the following program:

```c++
// main.cc
struct Foo {
    struct Foo *other;
};

int main() {
    auto f1 = new Foo();
    auto f2 = new Foo();
    f1->other = f2;
    f2->other = f1;
    return 0;
}
```

However, the memory leaks are detected for the following program (see https://godbolt.org/z/n6jWbYTqY):

```c++
struct Foo {
    struct Foo *other;
};

int main() {
    auto f1 = new Foo();
    auto f2 = new Foo();
    f1->other = f2;
    // highlight here
    // f2->other = f1;
    return 0;
}
```

Note: GCC can detect the memory leaks. If `-fsanitize=leak` is used, the memory leaks are detected.
[llvm-bugs] [Bug 129845] accepts-invalid with C++23 constexpr-unknown with struct containing reference
Issue 129845 Summary accepts-invalid with C++23 constexpr-unknown with struct containing reference Labels clang:frontend, c++23, constexpr Assignees Reporter efriedma-quic

```
int &ff();
int &x = ff();
struct A { int& x; };
constexpr A g = {x};
const A* gg = &g;
```

Should be rejected, currently accepted. (And related variations miscompile.)
[llvm-bugs] [Bug 129675] [clang-tidy] bugprone-throw-keyword-missing on default member initializer: "did you mean 'throw shared_ptr'?"
Issue 129675 Summary [clang-tidy] bugprone-throw-keyword-missing on default member initializer: "did you mean 'throw shared_ptr'?" Labels clang-tidy Assignees Reporter N-Dekker Using LLVM 19.1.7, I encountered a false positive `bugprone-throw-keyword-missing` from [ITK](https://itk.org)'s [itkExceptionObject.h](https://github.com/InsightSoftwareConsortium/ITK/blob/32a2a6de17ffb7c8319ab38dbe61bd3b7c171f00/Modules/Core/Common/include/itkExceptionObject.h), which can be reproduced as follows: ```cpp #include #include class MyException : public std::exception { public: MyException() = default; private: class NestedData; std::shared_ptr m_shared_data{}; }; class Bug : public MyException { public: Bug() { // Non-defaulted default constructor. } }; ``` Output: ``` warning: suspicious exception object created but not thrown; did you mean 'throw shared_ptr'? [bugprone-throw-keyword-missing] 10 | std::shared_ptr m_shared_data{}; |^``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 129676] Clang emits not the smallest code with `-Os` for `(unsigned)x >> C1 == C2`
Issue 129676 Summary Clang emits not the smallest code with `-Os` for `(unsigned)x >> C1 == C2` Labels Assignees Reporter Explorer09

```c
unsigned int pred2_rshift(unsigned int x) {
    return (x >> 11) == 0x1B;
    // 0x1B == (0xD800 >> 11);
}

unsigned int pred2_bitand(unsigned int x) {
    return (x &= ~0x7FF) == 0xD800;
}
```

When tested on Compiler Explorer with x86-64 clang 19.1.0 and the `-Os` option, `pred2_rshift` is compiled to the same code as `pred2_bitand`, which is slightly larger. I expected the right shift to be used: since I specified `-Os`, I expect the smallest code size.

Note: The example code is part of [a report I filed with GCC](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115529). When checking whether an integer is in a specified range, and the range happens to be aligned to a power of two, all of these comparisons do the same thing:

```c
unsigned int pred2(unsigned int x) {
    return x >= 0xD800 && x <= 0xDFFF;
}

unsigned int pred2_sub(unsigned int x) {
    return (x - 0xD800) <= (0xDFFF - 0xD800);
}

unsigned int pred2_bitand(unsigned int x) {
    return (x &= ~0x7FF) == 0xD800;
}

unsigned int pred2_bitor(unsigned int x) {
    return (x |= 0x7FF) == 0xDFFF;
}

unsigned int pred2_rshift(unsigned int x) {
    return (x >>= 11) == (0xD800 >> 11);
}

unsigned int pred2_div(unsigned int x) {
    return (x / 0x800) == (0xD800 / 0x800);
}
```

While Clang can recognize _all_ of these as equivalent (good job, by the way), it made a strange decision on which code to emit. While I can't tell which one is best for speed (performance), I can figure out which one is the smallest in size.
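The claim in issue 129676 that the six predicates are interchangeable can be checked mechanically. The following is only a sketch in C++ (the report itself uses C): the `in_range_*` names and the helper functions are made up for illustration, and the checked sample range is an arbitrary choice, not something the report specifies.

```cpp
#include <cassert>
#include <cstdint>

// Six ways to test 0xD800 <= x <= 0xDFFF, mirroring the forms in the report.
// The function names are hypothetical; only the expressions come from the issue.
constexpr bool in_range_cmp(std::uint32_t x)    { return x >= 0xD800u && x <= 0xDFFFu; }
constexpr bool in_range_sub(std::uint32_t x)    { return (x - 0xD800u) <= (0xDFFFu - 0xD800u); }
constexpr bool in_range_bitand(std::uint32_t x) { return (x & ~0x7FFu) == 0xD800u; }
constexpr bool in_range_bitor(std::uint32_t x)  { return (x | 0x7FFu) == 0xDFFFu; }
constexpr bool in_range_rshift(std::uint32_t x) { return (x >> 11) == (0xD800u >> 11); }
constexpr bool in_range_div(std::uint32_t x)    { return (x / 0x800u) == (0xD800u / 0x800u); }

// True when all six forms give the same answer for x.
constexpr bool all_forms_agree(std::uint32_t x) {
    return in_range_cmp(x) == in_range_sub(x)
        && in_range_cmp(x) == in_range_bitand(x)
        && in_range_cmp(x) == in_range_bitor(x)
        && in_range_cmp(x) == in_range_rshift(x)
        && in_range_cmp(x) == in_range_div(x);
}

// Checks every value in [0, 0x20000) plus two large values where the
// subtraction form wraps around. The sample set is an assumption of this sketch.
bool forms_agree_on_samples() {
    for (std::uint32_t x = 0; x < 0x20000u; ++x) {
        if (!all_forms_agree(x)) return false;
    }
    return all_forms_agree(0xFFFFD800u) && all_forms_agree(0xFFFFFFFFu);
}

void self_check() {
    assert(in_range_rshift(0xD800u));   // lower bound is inside the range
    assert(in_range_rshift(0xDFFFu));   // upper bound is inside the range
    assert(!in_range_rshift(0xD7FFu));  // one below the range
    assert(!in_range_rshift(0xE000u));  // one above the range
    assert(forms_agree_on_samples());
}
```

The equivalence relies on the range being aligned to a power of two (here 0x800 = 2^11); for an unaligned range only the comparison and subtraction forms would still agree.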
[llvm-bugs] [Bug 129671] Confusing behaviour command line options with default value 'true'
Issue 129671 Summary Confusing behaviour command line options with default value 'true' Labels new issue Assignees Reporter JVApen

In clang-include-cleaner, there are some options that have a default value of 'true'. For example: https://github.com/llvm/llvm-project/blob/03505a004ff6909c46d6b8c498a9ffccd47d88a0/clang-tools-extra/include-cleaner/tool/IncludeCleaner.cpp#L100-L105

Running the tool with `--help` prints the following:

    USAGE: clang-include-cleaner.exe [options] [... ]
    OPTIONS:
      ...
      --remove  - Allow header removals
      --version - Display the version of this program

This seems to imply that you have to add '--remove' in order to activate the example option. However, it is enabled by default. If you don't want the 'removal' behavior, you have to add `--remove=false` to the command line. This is nowhere to be found in the help message.

Can the command line output be improved such that default options are somehow indicated and it is easy to see how to disable them? For example ` --remove - Allow header removals (Default, use --remove=false to disable)`
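To make the proposal in issue 129671 concrete, here is a rough sketch of a help-line formatter that annotates boolean options whose default is `true`. It is plain C++ and does not use or represent LLVM's command-line library; the `BoolOption` struct and `format_help_line` function are hypothetical names invented for this illustration, and the output format simply follows the example given in the report.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of a boolean command-line option.
struct BoolOption {
    std::string name;         // option name without the leading dashes
    std::string description;  // one-line help text
    bool default_value;       // value used when the option is not given
};

// Builds one help line. Options that default to true get a suffix that
// states the default and shows how to turn the option off.
std::string format_help_line(const BoolOption& opt) {
    std::string line = "  --" + opt.name + " - " + opt.description;
    if (opt.default_value) {
        line += " (Default, use --" + opt.name + "=false to disable)";
    }
    return line;
}

void self_check() {
    const BoolOption remove{"remove", "Allow header removals", true};
    assert(format_help_line(remove) ==
           "  --remove - Allow header removals (Default, use --remove=false to disable)");

    // An option that is off by default keeps the plain help line.
    const BoolOption off_by_default{"example", "Hypothetical option", false};
    assert(format_help_line(off_by_default) == "  --example - Hypothetical option");
}
```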
[llvm-bugs] [Bug 129779] [flang] surprising performance loss with nested type operator overloading
Issue 129779 Summary [flang] surprising performance loss with nested type operator overloading Labels flang Assignees Reporter ivan-pi

I've attempted to create a performance benchmark which sums an array of numbers in different ways, to measure the overhead of operator overloading for simple value types: [abstraction_penalty.F90.txt](https://github.com/user-attachments/files/19077916/abstraction_penalty.F90.txt)

When I run the program, I see the output:

```
$ flang-new -O2 abstraction_penalty.F90
$ ./a.out
[info] compiler: Homebrew flang version 19.1.4 (https://github.com/Homebrew/homebrew-core/issues)
[info] compiler options: flang-new -O2 abstraction_penalty.F90
[info] using naive sum
[info] number of iterations: 25000

  test    absolute     additions    ratio with
number   time (sec)   per second       test0
     0      0.0532     9.400E+02       1.000
     1      0.0498     1.003E+03       0.937
     2      0.0493     1.015E+03       0.926
     3      0.0526     9.511E+02       0.988
     4      0.0595     8.410E+02       1.118
     5      0.0515     9.700E+02       0.969
     6      0.0486     1.029E+03       0.913
     7      0.0485     1.031E+03       0.912
     8      0.0490     1.020E+03       0.922
     9      0.0472     1.059E+03       0.888
    10      0.0483     1.036E+03       0.907
    11      0.0485     1.031E+03       0.912
    12      0.0479     1.044E+03       0.901
    13      0.0481     1.039E+03       0.905
    14      6.7735     7.382E+00     127.336
    15      6.7167     7.444E+00     126.267
    16      0.0467     1.071E+03       0.878
    17      0.0452     1.105E+03       0.850
    18      0.0451     1.108E+03       0.849
    19      0.0452     1.105E+03       0.850
    20      0.0476     1.050E+03       0.895
    21      0.0469     1.066E+03       0.882
    22      0.0467     1.071E+03       0.877
    23      0.0461     1.086E+03       0.866
    24      0.0454     1.101E+03       0.853
    25      0.0452     1.105E+03       0.851
    26      0.0456     1.097E+03       0.857
    27      0.0454     1.102E+03       0.853
    28      6.6540     7.514E+00     125.089
    29      6.5274     7.660E+00     122.709
  mean      0.0928     5.386E+02       1.75
```

The slow cases (14, 15, 28, 29) are calling the procedure `test_ddd`, which calls `dsum` for the `type(ddd)`, which is really just a double value but defined in an obscure way:

```fortran
integer, parameter :: dp = c_double

! Double wrapper
type :: dd
   real(dp) :: val
end type

! Double wrapper child with TBP
type, extends(dd) :: ddi
contains
   procedure :: get => get_ddi_val
end type

! Double wrapper wrapper
type :: ddd
   type(dd) :: val
end type
```

The sum procedure looks as follows:

```fortran
pure function ddd_sum(a) result(res)
   type(ddd), intent(in) :: a(:)
   type(ddd) :: res
   real(dp), pointer :: t(:)
#if USE_INTRINSIC_SUM
   res%val%val = sum(a%val%val)
#else
   integer :: i
   res = ddd(dd(0.0_dp))
   do i = 1, size(a)
      res = res + a(i)
   end do
#endif
end function
```

where the `+` is the overloaded `operator(+)` defined as,

```fortran
pure function ddd_add(a,b) result(c)
   type(ddd), intent(in) :: a, b
   type(ddd) :: c
   c%val%val = a%val%val + b%val%val
end function
```

If the intrinsic sum (`-DUSE_INTRINSIC_SUM`) is used instead, there are no observable penalties. There are other switches too, namely `-DUSE_INTRINSIC_REDUCE`, which displays good performance, and `-DUSE_STRUCTURE_CONSTRUCTOR`, which makes the performance even worse (300x slower than the baseline).
[llvm-bugs] [Bug 129813] Documented option file for each clang-tidy option
Issue 129813 Summary Documented option file for each clang-tidy option Labels clang-tidy Assignees Reporter martinlicht

The online documentation lists all available checks and their options: https://clang.llvm.org/extra/clang-tidy/checks/list.html

How about providing a comprehensive list of clang-tidy checks and their options in the form of a config file? The default value of each option should be some reasonable standard, and the file should include a brief explanation (or link) in the comments for each option.

Possible example:

```
# https://clang.llvm.org/extra/clang-tidy/checks/bugprone/argument-comment.html
- key: bugprone-argument-comment.StrictMode
  value: false
```

I would love to have access to such a complete configuration file simply for the sake of playing around. Maybe there is a way to automate the generation of such a file?
[llvm-bugs] [Bug 129815] (When) does Clang respect noinline, and how?
Issue 129815 Summary (When) does Clang respect noinline, and how? Labels clang Assignees Reporter higher-performance

This issue appears to exist with GCC and MSVC as well, but (in my various attempts) Clang appears to be the least willing to respect noinline. Consider this code:

```
#include
#if defined(_MSC_VER)
#define NOINLINE [[msvc::noinline]]
#elif defined(__clang__)
#define NOINLINE [[gnu::noinline]]
#elif defined(__GNUC__)
#define NOINLINE [[gnu::noinline]]
#else
#error unable to prevent inlining
#endif

using R = int;
using P = R*;

NOINLINE        R bar1(P arg);
NOINLINE        R bar2(P arg) { return arg ? 1 : 0; }
NOINLINE static R bar3(P arg) { return arg ? 1 : 0; }

R foo1(int x) { return (x ? 0 : bar1(&x)); }
R foo2(int x) { return (x ? 0 : bar2(&x)); }
R foo3(int x) { return (x ? 0 : bar3(&x)); }
```

In this code, all the `fooN` are equivalent, and:

- Must contain calls to `barN` because all `barN` are noinline (mandatory)
- Should result in identical codegen (optional, but preferable)

Instead, [what I see](https://godbolt.org/z/G9zjYY1fe) is:

- With the exception of `foo2` & `foo3` on MSVC, no pair of `fooN` results in identical codegen on any compiler
- None of the compilers produce a `call` instruction in `foo3`, implying that `noinline` isn't guaranteeing the generation of a new stack frame
- Clang is the only compiler that **completely elides** any reference to any `barN` (see `foo3`), making the `noinline` function disappear entirely in some cases.
```
Clang                 │ GCC                   │ MSVC
━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━
foo1:                 │ foo1:                 │ foo1:
  push rax            │   sub rsp, 24         │   mov [rsp+8], ecx
  mov [rsp+4], edi    │   xor eax, eax        │   sub rsp, 40
  xor eax, eax        │   mov [rsp+12], edi   │   test ecx, ecx
  test edi, edi       │   test edi, edi       │   je LABEL
  je LABEL            │   je LABEL            │   xor eax, eax
  pop rcx             │   add rsp, 24         │   add rsp, 40
  ret                 │   ret                 │   ret 0
LABEL:                │ LABEL:                │ LABEL:
  lea rdi, [rsp+4]    │   lea rdi, [rsp+12]   │   lea rcx, [rsp]
  call bar1@PLT       │   call bar1           │   call bar1
  pop rcx             │   add rsp, 24         │   add rsp, 40
  ret                 │   ret                 │   ret 0
──────────────────────┼───────────────────────┼──────────────────────
foo2:                 │ foo2:                 │ foo2:
  xor eax, eax        │   test edi, edi       │   test ecx, ecx
  test edi, edi       │   jne LABEL           │   je LABEL
  je LABEL            │   sub rsp, 8          │   xor eax, eax
  ret                 │   lea rdi, [rsp+4]    │   ret 0
LABEL:                │   call bar2           │ LABEL:
  push rax            │   add rsp, 8          │   lea rcx, [rsp]
  lea rdi, [rsp+4]    │   ret                 │   jmp bar2
  call bar2           │ LABEL:                │
  add rsp, 8          │   xor eax, eax        │
  ret                 │   ret                 │
──────────────────────┼───────────────────────┼──────────────────────
foo3:                 │ foo3:                 │ foo3:
  xor eax, eax        │   test edi, edi       │   test ecx, ecx
  test edi, edi       │   jne LABEL           │   je LABEL
  sete al             │   lea rdi, [rsp-4]    │   xor eax, eax
  ret                 │   jmp bar2            │   ret 0
                      │ LABEL:                │ LABEL:
                      │   xor eax, eax        │   lea rcx, [rsp]
                      │   ret                 │   jmp bar3
```

Note that I haven't tried LTO yet, but I imagine that would produce interesting results as well.

This made me wonder:

- What _are_ the precise semantics of `noinline`? i.e. what guarantee(s) can users actually rely on when using `noinline`, if any?
- Are these behaviors bugs, or intended behavior?
[llvm-bugs] [Bug 129812] Early exit optimization of Fortran array expressions
Issue 129812 Summary Early exit optimization of Fortran array expressions Labels new issue Assignees Reporter ivan-pi

Consider a function for checking if an array is sorted:

```fortran
!
! Check an array of integers is sorted in ascending order
!
logical function is_sorted_scalar(n,a) result(is_sorted)
   integer, intent(in) :: n
   integer, intent(in) :: a(n)
   integer :: i
   !$omp simd simdlen(8) early_exit
   do i = 2, n
      if (a(i) < a(i-1)) then
         is_sorted = .false.
         return
      end if
   end do
   is_sorted = .true.
end function

logical function is_sorted_all(n,a) result(is_sorted)
   integer, intent(in) :: n
   integer, intent(in) :: a(n)
   is_sorted = all(a(2:n) >= a(1:n-1))
end function

program benchmark
   implicit none
   integer, allocatable :: a(:)
   integer :: i, n

   external :: is_sorted_scalar
   external :: is_sorted_all
   logical :: is_sorted_scalar
   logical :: is_sorted_all

   character(len=32) :: str
   integer :: tmp

   tmp = 0
   n = 2
   if (command_argument_count() > 0) then
      call get_command_argument(1,str)
      read(str,*) tmp
      if (tmp > 0) n = tmp
   end if
   print *, "n = ", n

   allocate(a(n))

   ! Fill ascending numbers
   do i = 1, n
      a(i) = i
   end do

   ! Introduce an unsorted value
   a(100) = 1001
   !a(101) = 1000

   call measure(10,a,is_sorted_scalar,"scalar")
   call measure(10,a,is_sorted_all, "all")

contains

   impure subroutine measure(nreps,a,func,name)
      integer, intent(in) :: nreps
      integer, intent(in) :: a(:)
      logical :: func
      character(len=*), intent(in) :: name

      integer(8) :: t1, t2, rate
      real(kind(1.0d0)) :: elapsed
      logical :: res
      character(len=12) :: str
      integer :: k

      call system_clock(t1)
      do k = 1, nreps
         res = func(size(a),a)
      end do
      call system_clock(t2,rate)

      elapsed = (t2 - t1)/real(rate,kind(elapsed))
      str = adjustl(name)
      print '(A12,F12.4,L2)', str, elapsed/nreps*1.e6, res ! Time is in microseconds
   end subroutine

end program
```

It appears to me that in `is_sorted_all` flang generates a temporary array for the `a(2:n) >= a(1:n-1)` expression, and then performs the `all` reduction. This is fast due to vectorization, but it misses the chance to exit early. The effect is noticeable in the runtime:

```
~/fortran/is_sorted$ make FC=flang-new FFLAGS="-O2 -march=native" standalone
flang-new -O2 -march=native -o standalone standalone.f90
~/fortran/is_sorted$ ./standalone
 n = 2
scalar      0.0673 F
all         1.7358 F
~/fortran/is_sorted$ make clean
rm -rf *.o benchmark standalone
~/fortran/is_sorted$ make FC=gfortran FFLAGS="-O2 -march=native" standalone
gfortran -O2 -march=native -o standalone standalone.f90
~/fortran/is_sorted$ ./standalone
 n = 2
scalar      0.0389 F
all         0.0390 F
```

It would be nice if early exit vectorization were also supported (https://discourse.llvm.org/t/rfc-supporting-more-early-exit-loops/84690). With x86 SIMD extensions, it seems this still has to be done manually: http://0x80.pl/notesen/2018-04-11-simd-is-sorted.html
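The difference between the two strategies discussed in issue 129812 can be modelled outside Fortran. The sketch below is a C++ analogue written for illustration only: `is_sorted_two_pass` models the behaviour the reporter suspects (a temporary mask followed by a reduction) and is not taken from flang's generated code; all names, the `ScanResult` struct, and the comparison counters are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of a sortedness check plus the number of element comparisons made.
struct ScanResult {
    bool sorted;
    std::size_t comparisons;
};

// Early-exit form: stops at the first descending pair,
// analogous to is_sorted_scalar in the report.
ScanResult is_sorted_early_exit(const std::vector<int>& a) {
    std::size_t comparisons = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        ++comparisons;
        if (a[i] < a[i - 1]) return {false, comparisons};
    }
    return {true, comparisons};
}

// Two-pass form: first materialises a mask of all pairwise comparisons,
// then reduces it. It always touches every pair, even if the first one fails.
ScanResult is_sorted_two_pass(const std::vector<int>& a) {
    if (a.size() < 2) return {true, 0};
    std::vector<char> mask(a.size() - 1);
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        mask[i] = a[i + 1] >= a[i];
        ++comparisons;
    }
    bool all_true = true;
    for (char m : mask) all_true = all_true && (m != 0);
    return {all_true, comparisons};
}

void self_check() {
    // Ascending values 1..1000 with one unsorted value,
    // mirroring a(100) = 1001 from the report (index 99 when 0-based).
    std::vector<int> a(1000);
    for (std::size_t i = 0; i < a.size(); ++i) a[i] = static_cast<int>(i) + 1;
    a[99] = 1001;

    const ScanResult early = is_sorted_early_exit(a);
    const ScanResult full = is_sorted_two_pass(a);
    assert(!early.sorted && !full.sorted);  // both agree on the answer
    assert(early.comparisons == 100);       // stops at the first descending pair
    assert(full.comparisons == 999);        // always compares every pair
}
```

Both functions return the same answer; only the amount of work differs, and the gap grows with the array length when the unsorted element sits near the front.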
[llvm-bugs] [Bug 129816] [LLD] Support placing OVERLAY in a specific MEMORY region in linker scripts
Issue 129816 Summary [LLD] Support placing OVERLAY in a specific MEMORY region in linker scripts Labels lld Assignees mysterymath Reporter Prabhuk

```
OVERLAY OVERLAY_ADDR : {
  .overlay1 { *(overlay1*) }
  .overlay { *(overlay2*) }
} > MEM_REGION
```

In LLD, the `> MEM_REGION` syntax for OVERLAY is currently unsupported. This issue tracks adding support for this feature in LLD.