[llvm-bugs] [Bug 132038] [clang++]: Incorrect warning emitted for [-Wfor-loop-analysis]
Issue 132038
Summary [clang++]: Incorrect warning emitted for [-Wfor-loop-analysis]
Labels clang
Assignees
Reporter greg7mdp

The warning correctly states that the variables are not modified in the loop body, but `height` is nevertheless incremented in the loop's increment expression (through the `incr_height` lambda), so the loop is fine. See the example program:

```
// compile with `clang++-20 -std=c++20 -Wall -c loop.cpp`
//
// ~/tmp ❯ clang++-20 -std=c++20 -Wall -c loop.cpp
// loop.cpp:10:35: warning: variables 'height' and 'end_height' used in loop condition not modified in loop body [-Wfor-loop-analysis]
//    10 |   for (uint32_t end_height = 27; height <= end_height; incr_height())
//       |                                  ^~~~
// 1 warning generated.
// ---
#include <cstdint>

extern void add_to_expected_table(uint32_t h);

void test() {
   uint32_t height = 0;
   auto incr_height = [&height]() { ++height; };
   for (uint32_t end_height = 27; height <= end_height; incr_height())
      add_to_expected_table(height);
}
```

___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 132059] small mistake in -fmodule-file description breaks clang++-19 frontend with -fstack-protector
Issue 132059
Summary small mistake in -fmodule-file description breaks clang++-19 frontend with -fstack-protector
Labels clang
Assignees
Reporter igormcoelho

I was experimenting with C++ modules in Clang 19 on Ubuntu, and the build suddenly broke when I passed wrong parameters to -fmodule-file. Strangely, this only occurred when -fstack-protector was enabled, and I have no idea why.

This is the clang version:

```
$ clang++-19 --version
Ubuntu clang version 19.1.1 (1ubuntu1~24.04.2)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-19/bin
```

The files are:

```
// main.cc
import hello;
import std;

int main() {
  do_hello("world");
  return 0;
}
```

```
// hello.cppm
export module hello;
import std;

export inline void do_hello(std::string_view const &name) {
  std::cout << "Hello " << name << "!\n";
}
```

First I compile both .pcm files:

```
clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -Wno-reserved-identifier -Wno-reserved-module-identifier --precompile -o std.pcm /usr/lib/llvm-19/share/libc++/v1/std.cppm
clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -fmodule-file=std=std.pcm -Wno-reserved-identifier -Wno-reserved-module-identifier --precompile -o hello.pcm hello.cppm std.pcm
clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -fmodule-file=std=std.pcm -fmodule-file=hello=hello.pcm -o hello_world std.pcm hello.pcm main.cc
```
[llvm-bugs] [Bug 132077] llvm::ConstantStruct::get returns a zero initialiser
Issue 132077
Summary llvm::ConstantStruct::get returns a zero initialiser
Labels new issue
Assignees
Reporter aldrinmathew

LLVM Version: 19
OS: Ubuntu 24.04 LTS
Target Triple: x86_64-unknown-linux-gnu

When `llvm::ConstantStruct::get` is called with a named struct type and members that are all zero-valued, the returned `llvm::Constant*` is automatically a `zeroinitializer`:

```cpp
llvm::ConstantStruct::get(
    llvm::dyn_cast<llvm::StructType>(finalTy->get_llvm_type()),
    {llvm::ConstantPointerNull::get(llPtrTy),
     llvm::ConstantInt::get(
         llvm::Type::getIntNTy(
             ctx->irCtx->llctx,
             ctx->irCtx->clangTargetInfo->getTypeWidth(
                 ctx->irCtx->clangTargetInfo->getSizeType()
             )
         ),
         0u
     )}
)
```

I can understand the sentiment behind this. However, later in the code, when I use `constVal->getAggregateElement(0u)`, it fails saying that the value is not an aggregate.

I see two possible solutions:

1. Change the API so that the structural integrity of the constant value is maintained, that is, the values are stored as is.
2. Update `getAggregateElement` to consider this edge case and return a zero value of the element type instead.

I don't know if this was already found or fixed. If so, kindly ignore and close the issue.
[llvm-bugs] [Bug 132112] [clang-tidy] `modernize-loop-convert` - needs a way to enable reverse pipe syntax
Issue 132112
Summary [clang-tidy] `modernize-loop-convert` - needs a way to enable reverse pipe syntax
Labels clang-tidy
Assignees
Reporter denzor200

The check `modernize-loop-convert` never generates code with pipe syntax such as `for (int i : il | std::views::reverse)`, while `modernize-use-ranges` does. It needs an option (or options) to make this possible.

In my opinion we need to add these 3 new options:

**UseReversePipe** - When true (default false), fixes which involve reverse ranges will use the pipe adaptor syntax instead of the function syntax ReverseRange

**MakeReverseRangePipeAdaptor** - Specify the pipe adaptor used to reverse a range, the adaptor should accept a class with `rbegin` and `rend` methods and return a class with `begin` and `end` methods that call the `rbegin` and `rend` methods respectively. Common examples are `std::views::reverse` and `boost::adaptors::reversed`. Default value is an empty string.

**MakeReverseRangeHeader** - Specifies the header file where MakeReverseRangePipeAdaptor is declared. For the previous examples this option would be set to `ranges` and `boost/range/adaptor/reversed.hpp` respectively. If this is an empty string and MakeReverseRangePipeAdaptor is set, the check will take the value from MakeReverseRangeHeader option. This can be wrapped in angle brackets to signify to add the include as a system include. Default value is an empty string.

EXAMPLE - without pipe syntax:

```
#include
int main() {
  std::string str = "abcdefghijklmnopqrstuvwxyz";
  std::cout << "Reversed abc: ";
  for (char c : boost::adaptors::reverse(str)) {
    std::cout << c;
  }
  std::cout << std::endl;
}
```

Configuration:
UseCxx20ReverseRanges = true
MakeReverseRangeFunction = "boost::adaptors::reverse"
MakeReverseRangeHeader = "boost/range/adaptor/reversed.hpp"
UseReversePipe = false
MakeReverseRangePipeAdaptor = ""
MakeReverseRangeHeader = ""

EXAMPLE - with pipe syntax:

```
#include
int main() {
  std::string str = "abcdefghijklmnopqrstuvwxyz";
  std::cout << "Reversed abc: ";
  for (char c : str | boost::adaptors::reversed) {
    std::cout << c;
  }
  std::cout << std::endl;
}
```

Configuration:
UseCxx20ReverseRanges = true
MakeReverseRangeFunction = "boost::adaptors::reverse"
MakeReverseRangeHeader = "boost/range/adaptor/reversed.hpp"
UseReversePipe = true
MakeReverseRangePipeAdaptor = "boost::adaptors::reversed"
MakeReverseRangeHeader = ""
[llvm-bugs] [Bug 132139] [libc++] `std::function` with small object optimization copies instead of moves
Issue 132139
Summary [libc++] `std::function` with small object optimization copies instead of moves
Labels libc++
Assignees
Reporter BenjaminSchaaf

When using `std::function` I noticed that *sometimes* the captured data is copied even if the `std::function` is only ever moved. I've tracked this down to the following code: https://github.com/llvm/llvm-project/blob/6003c3055a4630be31cc3d459cdbb88248a007b9/libcxx/include/__functional/function.h#L394

It looks like when the small object optimization is in use for `std::function`, the captured data is copied instead of moved whenever the function is moved. In extreme cases this could result in massively more work being done.

The issue reproduces trivially, see godbolt: https://godbolt.org/z/czTev8jfc

Note that the captured data needs to be < `3*sizeof(void *)` and have `noexcept` on its copy constructor to trigger the small object optimization. When that's the case the data is copied instead of moved.
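The godbolt reproducer is not quoted in the message, so the following is only a minimal sketch of how such a check could look, not the reporter's code. `Probe`, `MoveStats`, `move_function_once`, and the two global counters are hypothetical names introduced for illustration. The probe is deliberately small and has a `noexcept` copy constructor, which per the report are the conditions for the small object optimization.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical counters, reset before the operation under test.
int g_copies = 0;
int g_moves = 0;

// Hypothetical probe type: small (one int) with a noexcept copy constructor,
// so a lambda capturing it should qualify for the small object optimization
// described in the report.
struct Probe {
    int value;

    explicit Probe(int v) noexcept : value(v) {}
    Probe(const Probe& other) noexcept : value(other.value) { ++g_copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++g_moves; }
};

struct MoveStats {
    int copies;  // Probe copy constructions observed during the move
    int moves;   // Probe move constructions observed during the move
    int result;  // value returned by the moved-to std::function
};

// Wraps a lambda that captures a Probe, then moves the std::function once
// and reports how the captured Probe was transferred.
MoveStats move_function_once() {
    Probe probe(42);
    std::function<int()> f = [probe]() { return probe.value; };

    // Only count what happens while the std::function itself is moved.
    g_copies = 0;
    g_moves = 0;
    std::function<int()> g = std::move(f);

    const int copies = g_copies;
    const int moves = g_moves;
    return MoveStats{copies, moves, g()};
}
```

Printing `copies` and `moves` from `move_function_once()` shows how a given standard library transfers the capture. Going by the report, an affected libc++ would be expected to show a nonzero `copies`; other implementations may report different counts, for example zero for both if the capture lives on the heap and only a pointer changes hands.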
[llvm-bugs] [Bug 132001] `clang-analyzer-optin.cplusplus.UninitializedObject` false positive with unnamed fields
Issue 132001
Summary `clang-analyzer-optin.cplusplus.UninitializedObject` false positive with unnamed fields
Labels clang:static analyzer, false-positive
Assignees
Reporter firewave

```cpp
struct S
{
    S(bool b)
        : b(b)
    {}
    bool b{false};
    long long : 7; // padding
};

void f()
{
    S s(true);
}
```

```
:4:9: warning: 1 uninitialized field at the end of the constructor call [clang-analyzer-optin.cplusplus.UninitializedObject]
    4 |         : b(b)
      |         ^
:7:15: note: uninitialized field 'this->'
    7 |     long long : 7; // padding
      |               ^
:12:7: note: Calling constructor for 'S'
   12 |     S s(true);
      |       ^~~
:4:9: note: 1 uninitialized field at the end of the constructor call
    4 |         : b(b)
      |         ^
```

https://godbolt.org/z/7zzoK97x5
[llvm-bugs] [Bug 132010] `clang-analyzer-alpha.cplusplus.MismatchedIterator` false positive with container insertion
Issue 132010
Summary `clang-analyzer-alpha.cplusplus.MismatchedIterator` false positive with container insertion
Labels clang:static analyzer, false-positive
Assignees
Reporter firewave

```cpp
#include <list>
#include <unordered_set>

void f()
{
    std::list l;
    std::unordered_set us;
    us.insert(l.cbegin(), l.cend());
}
```

```
:8:5: warning: Container accessed using foreign iterator argument [clang-analyzer-alpha.cplusplus.MismatchedIterator]
    8 |     us.insert(l.cbegin(), l.cend());
      |     ^~~
:8:5: note: Container accessed using foreign iterator argument
    8 |     us.insert(l.cbegin(), l.cend());
      |     ^~~
```

https://godbolt.org/z/arhEMh6Go
[llvm-bugs] [Bug 131986] Provide updated clang-19 release
Issue 131986
Summary Provide updated clang-19 release
Labels new issue
Assignees
Reporter mpusz

It seems that [APT releases](https://releases.llvm.org/) are far behind the LLVM official release. The last available from the 19 stream is clang-19.1. This release had bugs that were fixed but never released. This means that, for example, my project [mp-units](https://mpusz.github.io/mp-units/latest/) can't be compiled on most Ubuntu machines using clang-19 and also does not compile on the Compiler Explorer.
[llvm-bugs] [Bug 132013] `clang-18.1.8` crash in "Early Machine Loop Invariant Code Motion"
Issue 132013 Summary `clang-18.1.8` crash in "Early Machine Loop Invariant Code Motion" Labels new issue Assignees Reporter gonnet Hi all, I get the following compiler crash: ``` Stack dump: 0. Program arguments: /usr/lib/llvm-18/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer -g0 -O2 -D_FORTIFY_SOURCE=1 -DNDEBUG -ffunction-sections -fdata-sections -MD -MF bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.d -frandom-seed=bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.o -DPTHREADPOOL_NO_DEPRECATED_API -DXNN_LOG_LEVEL=0 -DXNN_ENABLE_CPUINFO=1 -DXNN_ENABLE_MEMOPT=1 -DXNN_ENABLE_SPARSE=1 -DXNN_ENABLE_ASSEMBLY=1 -DXNN_ENABLE_ARM_FP16_SCALAR=0 -DXNN_ENABLE_ARM_FP16_VECTOR=0 -DXNN_ENABLE_ARM_BF16=0 -DXNN_ENABLE_ARM_DOTPROD=0 -DXNN_ENABLE_ARM_I8MM=0 -DXNN_ENABLE_RISCV_FP16_VECTOR=0 -DXNN_ENABLE_AVX512AMX=1 -DXNN_ENABLE_AVX512FP16=1 -DXNN_ENABLE_AVX512BF16=1 -DXNN_ENABLE_AVXVNNI=1 -DXNN_ENABLE_AVXVNNIINT8=1 -DXNN_ENABLE_AVX512F=1 -DXNN_ENABLE_AVX256SKX=1 -DXNN_ENABLE_AVX256VNNI=1 -DXNN_ENABLE_AVX256VNNIGFNI=1 -DXNN_ENABLE_AVX512SKX=1 -DXNN_ENABLE_AVX512VBMI=1 -DXNN_ENABLE_AVX512VNNI=1 -DXNN_ENABLE_AVX512VNNIGFNI=1 -DXNN_ENABLE_HVX=0 -DXNN_ENABLE_KLEIDIAI=0 -DXNN_ENABLE_SRM_SME=0 -DXNN_ENABLE_ARM_SME2=0 -DXNN_ENABLE_WASM_REVECTORIZE=0 -iquote . 
-iquote bazel-out/k8-opt/bin -iquote external/+_repo_rules+pthreadpool -iquote bazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool -iquote external/+_repo_rules+FXdiv -iquote bazel-out/k8-opt/bin/external/+_repo_rules+FXdiv -iquote external/+_repo_rules+cpuinfo -iquote bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo -Ibazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool/_virtual_includes/pthreadpool -Ibazel-out/k8-opt/bin/external/+_repo_rules+FXdiv/_virtual_includes/FXdiv -Ibazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/_virtual_includes/cpuinfo -isystem include -isystem bazel-out/k8-opt/bin/include -isystem src -isystem bazel-out/k8-opt/bin/src -isystem external/+_repo_rules+pthreadpool/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool/include -isystem external/+_repo_rules+FXdiv/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+FXdiv/include -isystem external/+_repo_rules+cpuinfo/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/include -isystem external/+_repo_rules+cpuinfo/src -isystem bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/src -Wno-unused-but-set-variable -mstack-alignment=64 -fomit-frame-pointer -mstackrealign -mf16c -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl -mavx512vnni -mgfni -mavx512fp16 -std=c99 -O2 -c src/f16-vsin/gen/f16-vsin-avx512fp16-rational-3-2-div.c -o bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.o -no-canonical-prefixes -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" 1. parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'src/f16-vsin/gen/f16-vsin-avx512fp16-rational-3-2-div.c'. 4. 
Running pass 'Early Machine Loop Invariant Code Motion' on function '@xnn_f16_vsin_ukernel__avx512fp16_rational_3_2_div_u32'
```

```
$ /usr/lib/llvm-18/bin/clang --version
Debian clang version 18.1.8 (16)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-18/bin
```

I've attached the pre-processed source file generated with the same arguments as the command above.

[f16-vsin-avx512fp16-rational-3-2-div.c.gz](https://github.com/user-attachments/files/19340278/f16-vsin-avx512fp16-rational-3-2-div.c.gz)

I tried reproducing with version `19.1.7`, but that doesn't fail.

Pending a fix, is there anything I can do to work around this issue?

Cheers,
Pedro
[llvm-bugs] [Bug 132024] lldb-server fails to connect on version 20.1+
Issue 132024
Summary lldb-server fails to connect on version 20.1+
Labels new issue
Assignees
Reporter azais-corentin

Hello,

Since LLVM 20, I'm not able to use remote debugging anymore.

I start the server in one terminal:

```
lldb-server-20 platform --server --listen *:1234 --min-gdbserver-port 31400 --max-gdbserver-port 31500
```

Then when I connect, I get the following errors:

```
> lldb-20
> (lldb) platform select remote-linux
> (lldb) platform connect connect://127.0.0.1:1234
> error: spawn_process failed: execve failed: No such file or directory
> error: Connection shut down by remote side while waiting for reply to initial handshake packet
```

The exact same steps on LLVM 19 work without any errors.
[llvm-bugs] [Bug 132055] Code that uses a libcall marks internal symbol as exported when making object file
Issue 132055
Summary Code that uses a libcall marks internal symbol as exported when making object file
Labels new issue
Assignees
Reporter gbaraldi

This is showing up as https://github.com/JuliaLang/julia/pull/57658#issuecomment-2727385403, which is holding up Julia being able to compile with LLVM 20.

The issue is that defining a libcall in a module and then having it be used makes the libcall symbol external:

```llvm
source_filename = "text"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.14.0-macho"

declare i16 @julia_float_to_half(float)

define internal i16 @__truncsfhf2(float %0) {
  %2 = call i16 @julia_float_to_half(float %0)
  ret i16 %2
}

define hidden swiftcc half @julia_fp(float %0) {
  %2 = fptrunc float %0 to half
  ret half %2
}

@llvm.compiler.used = appending global [1 x ptr] [ptr @__truncsfhf2], section "llvm.metadata"
```

```
T ___truncsfhf2
U _julia_float_to_half
0010 T _julia_fp
usr/lib/julia on debug-llvm-20 [?] ❯ llc lala.ll -filetype=obj -o llvm19.o
usr/lib/julia on debug-llvm-20 [?] ❯ nm llvm19.o
t ___truncsfhf2
U _julia_float_to_half
0010 T _julia_fp
```

Not sure if this is an intentional change or not, but it happened between LLVM 19 and 20.
[llvm-bugs] [Bug 132065] Potential OOB access in getMaskVecValue in CGBuiltin.cpp
Issue 132065
Summary Potential OOB access in getMaskVecValue in CGBuiltin.cpp
Labels new issue
Assignees
Reporter jurahul

This function in CGBuiltin.cpp has this code:

```
if (NumElts < 8) {
  int Indices[4];
  for (unsigned i = 0; i != NumElts; ++i)
    Indices[i] = i;
```

If `NumElts` > 4, the `Indices` array will be accessed out of bounds. The fix might be as simple as bumping the size to 8.
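The suggested fix can be sketched in isolation. The following is a standalone illustration, not the CGBuiltin.cpp code: `make_identity_indices` and `kMaxSmallMaskElts` are hypothetical names, and the only point is that a buffer of 8 entries covers every element count the `NumElts < 8` branch admits.

```cpp
#include <array>
#include <cassert>

// Hypothetical constant: the buffer size proposed in the report.
constexpr unsigned kMaxSmallMaskElts = 8;

// Fills the first num_elts entries with 0, 1, 2, ... the way the quoted loop
// does. With 8 slots, any num_elts below 8 stays in bounds; the original
// 4-slot buffer would be overrun for num_elts in 5..7.
std::array<int, kMaxSmallMaskElts> make_identity_indices(unsigned num_elts) {
    assert(num_elts < kMaxSmallMaskElts && "only the NumElts < 8 path is modeled");
    std::array<int, kMaxSmallMaskElts> indices{};  // unused tail stays zero
    for (unsigned i = 0; i != num_elts; ++i)
        indices[i] = static_cast<int>(i);
    return indices;
}
```

For example, `make_identity_indices(7)` writes slots 0 through 6, which would not fit in `int Indices[4]`. Whether values between 5 and 7 can actually reach this branch in Clang is not something the report establishes.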
[llvm-bugs] [Bug 132069] [ms] [llvm-ml] Support for `NEAR` in `EXTERN` directives
Issue 132069 Summary [ms] [llvm-ml] Support for `NEAR` in `EXTERN` directives Labels new issue Assignees Reporter MisterDA ```console $ cat > test.asm < Type may be any of `near`, `far`, `proc`, `byte`, `word`, `dword`, `qword`, `tbyte`, `abs` (absolute, which is a constant), or some other user defined type. I'm interested in support for `near`. Cf #131707 cc @ericastor ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 132074] [ms] [llvm-ml] Support macros with parenthesis and commas delimiters between arguments
Issue 132074
Summary [ms] [llvm-ml] Support macros with parenthesis and commas delimiters between arguments
Labels new issue
Assignees
Reporter MisterDA

I have a MASM file with a macro that I call with parentheses and with a comma as the delimiter between arguments, as parameters may contain spaces:

```asm
i = 0
SubstitutionMacro MACRO _type:REQ, name:REQ
field_&name EQU i
i = i + 1
EXITM <>
ENDM

SubstitutionMacro(int, a)
SubstitutionMacro(int*, b)
SubstitutionMacro(struct type, c)
SubstitutionMacro(struct type*, d)

.CODE
mov r12, field_a
mov r13, field_b
mov r14, field_c
mov r15, field_d
END
```

`llvm-ml` currently fails to assemble this file:

``` console
$ ml64 -nologo -Cp -c -Fo test.obj test.asm
 Assembling: test.asm
$ objdump -D --section='.text$mn' test.obj

test.obj:     file format pe-x86-64

Disassembly of section .text$mn:

<.text$mn>:
   0:   49 c7 c4 00 00 00 00    mov    $0x0,%r12
   7:   49 c7 c5 01 00 00 00    mov    $0x1,%r13
   e:   49 c7 c6 02 00 00 00    mov    $0x2,%r14
  15:   49 c7 c7 03 00 00 00    mov    $0x3,%r15
```

Cf #129905

cc @ericastor
[llvm-bugs] [Bug 132071] [RISC-V] Miscompile on rv64gcv with -O[23]
Issue 132071
Summary [RISC-V] Miscompile on rv64gcv with -O[23]
Labels new issue
Assignees
Reporter ewlu

Testcase:

```c
short a[12];
short b[12];
long long al;
int f;
short m = 31554;
long long q[12];

int main() {
  for (int i = 0; i < 2; ++i)
    q[i] = 6;
  for (short i = 0; i < 12; i += m - 31553) {
    a[i] = q[i];
    b[i] = f > q[i];
  }
  for (int i = 0; i < 12; ++i)
    al += a[i];
  for (int i = 0; i < 2; ++i)
    al += b[i];
  __builtin_printf("%llu\n", al);
}
```

Commands:

```
# riscv
$ ./bin/clang -march=rv64gcv_zvl256b -flto -O2 red.c -o user-config.out
$ QEMU_CPU=rv64,vlen=256,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true timeout --verbose -k 0.1 4 ./bin/qemu-riscv64 user-config.out 1
10
$ ./bin/clang -march=rv64gcv_zvl256b -flto -O3 red.c -o user-config.out
$ QEMU_CPU=rv64,vlen=256,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true timeout --verbose -k 0.1 4 ./bin/qemu-riscv64 user-config.out 1
10

# x86
$ ./native.out 1
12
```

Godbolt: https://godbolt.org/z/7hMW6eerT

Bisected to 9b7282e545d5e47315e3ffb9e5e00d0fb547c8e3 as the first bad commit.

Found via fuzzer.
[llvm-bugs] [Bug 132084] [ms] [llvm-ml] Allow size directives on register operands
Issue 132084
Summary [ms] [llvm-ml] Allow size directives on register operands
Labels new issue
Assignees
Reporter MisterDA

```asm
.CODE
        mov rsp, r11
        mov qword ptr rsp, r11
        mov rsp, qword ptr r11
        mov qword ptr rsp, qword ptr r11
END
```

```console
$ ml64 -nologo -Cp -c -Fo test.obj -W3 test.asm
 Assembling: test.asm
$ objdump -D --section='.text$mn' test.obj

test.obj:     file format pe-x86-64

Disassembly of section .text$mn:

<.text$mn>:
   0:   49 8b e3    mov    %r11,%rsp
   3:   49 8b e3    mov    %r11,%rsp
   6:   49 8b e3    mov    %r11,%rsp
   9:   49 8b e3    mov    %r11,%rsp
$ llvm-ml -m64 -nologo -c -Fo test.obj test.asm
test.asm:3:23: error: expected memory operand after 'ptr', found register operand instead
        mov qword ptr rsp, r11
                      ^
test.asm:4:28: error: expected memory operand after 'ptr', found register operand instead
        mov rsp, qword ptr r11
                           ^
test.asm:5:23: error: expected memory operand after 'ptr', found register operand instead
        mov qword ptr rsp, qword ptr r11
                      ^
```

Not sure if the size directive is needed here or not, but LLVM exits with an error. Microsoft's ml64 accepts the register-to-register move with a size directive.

cc @ericastor
[llvm-bugs] [Bug 132020] Can `-fmemory-profile-use` be used together with `-fmemory-profile`?
Issue 132020
Summary Can `-fmemory-profile-use` be used together with `-fmemory-profile`?
Labels new issue
Assignees
Reporter zcfh

## Background

I am currently trying to use memprof to optimize my program, but my tests did not show obvious performance gains. **I now hope to develop a tool to observe whether the optimization is really effective.**

## Imagination

My idea is to use memory-profile and memory-profile-use at the same time, so that the memory accesses after memprof optimization can be collected. This information could then be post-processed to determine whether the correct clone is made. Furthermore, could a heat map be drawn based on the memory accesses, similar to BOLT?

## Question

1. I made a first attempt and found that `-fmemory-profile` and `-fmemory-profile-use` cannot be used at the same time. Is this not allowed by design?
2. **Are there other observation tools?** I looked through some options and only found `memprof-export-to-dot`, but this can only prove whether the clone is done, and cannot reflect whether the clone result is good enough.
[llvm-bugs] [Bug 132068] CUDA/HIP: lambda capture of constexpr variable inconsistent between host and device
Issue 132068
Summary CUDA/HIP: lambda capture of constexpr variable inconsistent between host and device
Labels
Assignees
Reporter mkuron

Consider the following bit of HIP code:

```c++
#include

using std::max;

template <typename F>
static void __global__
kernel(F f)
{
  f(1);
}

void test(float const * fl, float const * A, float * Vf)
{
  float constexpr small(1.0e-25);

  auto f = [=] __device__ __host__ (unsigned int n) {
    float const value = max(small, fl[0]);
    Vf[0] = value * A[0];
  };
  static_assert(sizeof(f) == sizeof(fl) + sizeof(A) + sizeof(Vf));
  kernel<<<1,1>>>(f);
}
```

The `static_assert` fails in the host-side compilation but succeeds in the device-side compilation. This means that the layout of the struct synthesized from the lambda is inconsistent between host and device, so if you use any of the captured variables on the device side, they will contain the data of some of the other variables. You can also use `-Xclang -fdump-record-layouts` to see that. Evidently the `constexpr` variable is part of the captured variables only on the host side, but not on the device side.

With `--cuda-host-only`:

```
*** Dumping AST Record Layout
         0 | class (lambda at :23:12)
         0 |   const float *
         8 |   const float
        16 |   float *
        24 |   const float *
           | [sizeof=32, dsize=32, align=8,
           |  nvsize=32, nvalign=8]
```

With `--cuda-device-only`:

```
*** Dumping AST Record Layout
         0 | class (lambda at :23:12)
         0 |   const float *
         8 |   float *
        16 |   const float *
           | [sizeof=24, dsize=24, align=8,
           |  nvsize=24, nvalign=8]
```

Godbolt: https://cuda.godbolt.org/z/KE789sevs.

When you compile the exact same code for CUDA, this does not happen. However, if you add

```c++
template ::value, int> = 0>
__host__ T max(const T a, const T b) {
  return std::max(a, b);
}
```

after line 3 of the code at the top, you get the exact same layout discrepancy as with HIP. See https://cuda.godbolt.org/z/e3Ybr4hK1.

I can replace the `[=]` with `[fl, A, Vf]`. If that `__host__ T max` overload is present, the compiler then tells me that _variable 'small' cannot be implicitly captured in a lambda with no capture-default specified_, but if I leave out that overload, it does not show that error message.
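As background, here is a host-only sketch in standard C++ of the language rule that seems relevant here. It does not reproduce the host/device discrepancy and is not taken from the report; `max_by_value`, `ClosureSizes`, `closure_sizes`, and `small_value` are hypothetical names. Passing a `constexpr` local to a function that takes `const float&` (as `std::max` does) odr-uses the variable, so a `[=]` lambda has to capture it. Passing it by value does not odr-use it, so the compiler is free to leave it out of the closure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical by-value counterpart to std::max, which takes const T&.
inline float max_by_value(float a, float b) { return a < b ? b : a; }

struct ClosureSizes {
    std::size_t by_reference;  // closure whose body calls std::max
    std::size_t by_value;      // closure whose body calls max_by_value
};

// Builds two [=] lambdas that differ only in which max they call and
// returns the size of each closure object.
ClosureSizes closure_sizes(const float* fl) {
    constexpr float small_value = 1.0e-25f;

    // std::max binds small_value to a reference: odr-use, so it is captured.
    auto by_ref = [=]() { return std::max(small_value, fl[0]); };

    // Here small_value is only read as a constant: no odr-use, so the
    // closure need not contain it.
    auto by_val = [=]() { return max_by_value(small_value, fl[0]); };

    return ClosureSizes{sizeof(by_ref), sizeof(by_val)};
}
```

On a typical 64-bit host one would expect the first size to exceed the second, though the exact values are implementation details. If host and device compilation resolve `max` to overloads with different parameter kinds, a difference like the one in the record layouts above could plausibly follow; that link is an assumption rather than something the report confirms.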