[llvm-bugs] [Bug 132038] [clang++]: Incorrect warning emitted for [-Wfor-loop-analysis]

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132038
Summary: [clang++]: Incorrect warning emitted for [-Wfor-loop-analysis]
Labels: clang
Assignees: (none)
Reporter: greg7mdp

The warning correctly states that the variables are not modified in the loop body, but `height` is nevertheless incremented by the `incr_height` lambda called in the loop's increment expression, so the loop is fine.

See example program:

```
// compile with `clang++-20 -std=c++20 -Wall -c loop.cpp`
//
// ~/tmp ❯ clang++-20 -std=c++20 -Wall -c loop.cpp
// loop.cpp:10:35: warning: variables 'height' and 'end_height' used in loop condition not modified in loop body [-Wfor-loop-analysis]
//    10 |    for (uint32_t end_height = 27; height <= end_height; incr_height())
//       |                                   ^~~~
// 1 warning generated.
// ---
#include <cstdint>

extern void add_to_expected_table(uint32_t h);

void test() {
   uint32_t height = 0;

   auto incr_height = [&height]() { ++height; };

   for (uint32_t end_height = 27; height <= end_height; incr_height())
      add_to_expected_table(height);
}
```
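
To make the "loop is fine" claim concrete, here is a minimal sketch that runs the same loop and counts how often the body executes. `count_iterations` and `calls` are hypothetical names added for illustration, and the call to `add_to_expected_table` is replaced by a counter because its definition is not part of the report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: runs the loop from the report and returns how many
// times the body executed. The external add_to_expected_table() call is
// replaced by a counter because its definition is not part of the report.
inline uint32_t count_iterations() {
   uint32_t height = 0;
   uint32_t calls = 0;

   // Same shape as the report: `height` changes only through the lambda,
   // which is invoked in the for-loop's increment expression.
   auto incr_height = [&height]() { ++height; };

   for (uint32_t end_height = 27; height <= end_height; incr_height())
      ++calls;

   // The loop exits once the lambda has pushed height past end_height.
   assert(height == 28);
   return calls;
}
```

Calling `count_iterations()` returns 28, one iteration for each height from 0 through 27, which shows that the loop condition does change even though the body never touches `height`.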



___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 132059] small mistake in -fmodule-file description breaks clang++-19 frontend with -fstack-protector

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132059
Summary: small mistake in -fmodule-file description breaks clang++-19 frontend with -fstack-protector
Labels: clang
Assignees: (none)
Reporter: igormcoelho

I was experimenting with C++ modules in Clang 19 on Ubuntu, and the compiler suddenly broke when I passed wrong parameters to `-fmodule-file`. Strangely, this only occurred when `-fstack-protector` was enabled, and I have no idea why.
This is the clang version:
```
$ clang++-19 --version
Ubuntu clang version 19.1.1 (1ubuntu1~24.04.2)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-19/bin
```

The files are:
```
// main.cc
import hello;
import std;

int main() {
  do_hello("world");
  return 0;
}
```

```
// hello.cppm
export module hello;
import std;

export inline void do_hello(std::string_view const &name)
{
  std::cout << "Hello " << name << "!\n";
}
```

First I compile both .pcm files:
```
clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -Wno-reserved-identifier -Wno-reserved-module-identifier --precompile -o std.pcm /usr/lib/llvm-19/share/libc++/v1/std.cppm
clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -fmodule-file=std=std.pcm -Wno-reserved-identifier -Wno-reserved-module-identifier --precompile -o hello.pcm hello.cppm std.pcm

clang++-19 -std=c++23 -stdlib=libc++ -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer '-std=c++23' -fPIC -fmodule-file=std=std.pcm -fmodule-file=hello=hello.pcm  -o hello_world std.pcm hello.pcm main.cc
```



[llvm-bugs] [Bug 132077] llvm::ConstantStruct::get returns a zero initialiser

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132077
Summary: llvm::ConstantStruct::get returns a zero initialiser
Labels: new issue
Assignees: (none)
Reporter: aldrinmathew

LLVM Version: 19
OS: Ubuntu 24.04 LTS
Target Triple: x86_64-unknown-linux-gnu

When `llvm::ConstantStruct::get` is called with a named struct type and zero-valued members, the returned `llvm::Constant*` is automatically a `zeroinitializer`.

```cpp
llvm::ConstantStruct::get(
    llvm::dyn_cast<llvm::StructType>(finalTy->get_llvm_type()),
    {llvm::ConstantPointerNull::get(llPtrTy),
     llvm::ConstantInt::get(
         llvm::Type::getIntNTy(
             ctx->irCtx->llctx,
             ctx->irCtx->clangTargetInfo->getTypeWidth(
                 ctx->irCtx->clangTargetInfo->getSizeType()
             )
         ),
         0u
     )}
)
```

I can understand the sentiment behind this.

Later in the code when I use `constVal->getAggregateElement(0u)`, it fails saying that the value is not an aggregate.

I see two possible solutions:

1. Change the API so that the structural integrity of the constant value is maintained - The values are stored as is.
2. Update `getAggregateElement` to consider this edge case and return a zero value of the element type instead.

I don't know if this was already found or fixed. If so, kindly ignore and close the issue.




[llvm-bugs] [Bug 132112] [clang-tidy] `modernize-loop-convert` - needs a way to enable reverse pipe syntax

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132112
Summary: [clang-tidy] `modernize-loop-convert` - needs a way to enable reverse pipe syntax
Labels: clang-tidy
Assignees: (none)
Reporter: denzor200

The check `modernize-loop-convert` never generates code with pipe syntax such as `for (int i : il | std::views::reverse)`, while `modernize-use-ranges` does.
It needs an option (or options) to make this possible.

In my opinion we need to add these 3 new options:
**UseReversePipe** - When true (default false), fixes which involve reverse ranges will use the pipe adaptor syntax instead of the function syntax
ReverseRange
**MakeReverseRangePipeAdaptor** - Specify the pipe adaptor used to reverse a range, the adaptor should accept a class with `rbegin` and `rend` methods and return a class with `begin` and `end` methods that call the `rbegin` and `rend` methods respectively. Common examples are `std::views::reverse` and `boost::adaptors::reversed`. Default value is an empty string.
**MakeReverseRangeHeader** - Specifies the header file where MakeReverseRangePipeAdaptor is declared. For the previous examples this option would be set to `ranges` and `boost/range/adaptor/reversed.hpp` respectively. If this is an empty string and MakeReverseRangePipeAdaptor is set, the check will take the value from MakeReverseRangeHeader option. This can be wrapped in angle brackets to signify to add the include as a system include. Default value is an empty string.
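
As a rough illustration of the adaptor contract described for MakeReverseRangePipeAdaptor, the following C++14 sketch builds a tiny pipe adaptor by hand. Every name in it (`reversed_tag`, `reversed`, `reversed_view`, `reverse_copy_of`, `check_reverse_pipe`) is hypothetical and stands in for a real adaptor such as `std::views::reverse` or `boost::adaptors::reversed`; it is not clang-tidy code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a pipe adaptor such as std::views::reverse or
// boost::adaptors::reversed; none of these names come from clang-tidy.
struct reversed_tag {};
constexpr reversed_tag reversed{};

// Wraps a range that has rbegin()/rend() and exposes them as begin()/end().
template <typename Range>
class reversed_view {
public:
  explicit reversed_view(Range &r) : range_(r) {}
  auto begin() const { return range_.rbegin(); }
  auto end() const { return range_.rend(); }

private:
  Range &range_;
};

// The pipe operator is what makes `range | reversed` well-formed.
template <typename Range>
reversed_view<Range> operator|(Range &r, reversed_tag) {
  return reversed_view<Range>(r);
}

// Loop written in the form the proposed UseReversePipe option would emit.
inline std::string reverse_copy_of(const std::string &str) {
  std::string out;
  for (char c : str | reversed)
    out.push_back(c);
  return out;
}

// Usage check: iterating "abc" through the pipe yields "cba".
inline void check_reverse_pipe() { assert(reverse_copy_of("abc") == "cba"); }
```

The point of the sketch is only that the function form `reverse(str)` and the pipe form `str | reversed` need different spellings in the generated fix, which is why a separate adaptor option is being requested.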

EXAMPLE - without pipe syntax:
```
#include 

int main() {
  std::string str = "abcdefghijklmnopqrstuvwxyz";

  std::cout << "Reversed abc: ";
  for (char c : boost::adaptors::reverse(str)) {
    std::cout << c;
  }
  std::cout << std::endl;
}
```
Configuration:
UseCxx20ReverseRanges = true
MakeReverseRangeFunction = "boost::adaptors::reverse"
MakeReverseRangeHeader = "boost/range/adaptor/reversed.hpp"
UseReversePipe = false
MakeReverseRangePipeAdaptor = ""
MakeReverseRangeHeader = ""



EXAMPLE - with pipe syntax:
```
#include 

int main() {
  std::string str = "abcdefghijklmnopqrstuvwxyz";

  std::cout << "Reversed abc: ";
  for (char c : str | boost::adaptors::reversed) {
    std::cout << c;
  }
  std::cout << std::endl;
}
```
Configuration:
UseCxx20ReverseRanges = true
MakeReverseRangeFunction = "boost::adaptors::reverse"
MakeReverseRangeHeader = "boost/range/adaptor/reversed.hpp"
UseReversePipe = true
MakeReverseRangePipeAdaptor = "boost::adaptors::reversed"
MakeReverseRangeHeader = ""





[llvm-bugs] [Bug 132139] [libc++] `std::function` with small object optimization copies instead of moves

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132139
Summary: [libc++] `std::function` with small object optimization copies instead of moves
Labels: libc++
Assignees: (none)
Reporter: BenjaminSchaaf

When using `std::function` I noticed that *sometimes* the captured data is copied even if the `std::function` is only ever moved.

I've tracked this down to the following code:
https://github.com/llvm/llvm-project/blob/6003c3055a4630be31cc3d459cdbb88248a007b9/libcxx/include/__functional/function.h#L394

It looks like when the small object optimization is in use for `std::function`, then the captured data is copied instead of moved whenever the function is moved. In extreme cases this could result in massively more work being done.

The issue reproduces trivially, see godbolt: https://godbolt.org/z/czTev8jfc

Note that the captured data needs to be < `3*sizeof(void *)` and have `noexcept` on its copy constructor to trigger the small object optimization. When that's the case the data is copied instead of moved.
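
The godbolt link carries the actual reproducer; as a self-contained approximation, the sketch below counts copy and move constructions while a `std::function` is moved. `Probe`, `MoveStats` and `measure_function_move` are hypothetical names, and the sketch assumes that a 4-byte capture with a `noexcept` copy constructor meets the small-object conditions described above.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical probe type: small, with a noexcept copy constructor, so that
// it should qualify for the small object optimization described above.
struct Probe {
  static int copies;
  static int moves;
  int value;

  explicit Probe(int v) : value(v) {}
  Probe(const Probe &o) noexcept : value(o.value) { ++copies; }
  Probe(Probe &&o) noexcept : value(o.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

static_assert(sizeof(Probe) < 3 * sizeof(void *), "capture must stay small");

struct MoveStats {
  int copies;  // copy constructions observed during the move
  int moves;   // move constructions observed during the move
  int result;  // value returned by the moved-to function
};

// Moves a std::function holding a Probe and reports what the move did.
inline MoveStats measure_function_move() {
  Probe p(42);
  std::function<int()> f([p]() { return p.value; });

  // Ignore whatever happened while building `f`; only the move matters.
  Probe::copies = 0;
  Probe::moves = 0;

  std::function<int()> g(std::move(f));
  assert(static_cast<bool>(g));  // the target must have been transferred
  return MoveStats{Probe::copies, Probe::moves, g()};
}
```

On a libc++ build that shows the reported behavior, `copies` would presumably be nonzero after the move. Other standard library implementations may report different counts, so the numbers are only meaningful when compared across libraries.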





[llvm-bugs] [Bug 132001] `clang-analyzer-optin.cplusplus.UninitializedObject` false positive with unnamed fields

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132001
Summary: `clang-analyzer-optin.cplusplus.UninitializedObject` false positive with unnamed fields
Labels: clang:static analyzer, false-positive
Assignees: (none)
Reporter: firewave

```cpp
struct S
{
    S(bool b)
        : b(b)
    {}
    bool b{false};
    long long : 7; // padding
};

void f()
{
    S s(true);
}
```

```
:4:9: warning: 1 uninitialized field at the end of the constructor call [clang-analyzer-optin.cplusplus.UninitializedObject]
    4 |         : b(b)
      |         ^
:7:15: note: uninitialized field 'this->'
    7 |     long long : 7; // padding
      |               ^
:12:7: note: Calling constructor for 'S'
   12 |     S s(true);
      |       ^~~
:4:9: note: 1 uninitialized field at the end of the constructor call
    4 |         : b(b)
      |         ^
```
https://godbolt.org/z/7zzoK97x5




[llvm-bugs] [Bug 132010] `clang-analyzer-alpha.cplusplus.MismatchedIterator` false positive with container insertion

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132010
Summary: `clang-analyzer-alpha.cplusplus.MismatchedIterator` false positive with container insertion
Labels: clang:static analyzer, false-positive
Assignees: (none)
Reporter: firewave

```cpp
#include <list>
#include <unordered_set>

void f()
{
    std::list l;
    std::unordered_set us;
    us.insert(l.cbegin(), l.cend());
}
```

```
:8:5: warning: Container accessed using foreign iterator argument [clang-analyzer-alpha.cplusplus.MismatchedIterator]
    8 |     us.insert(l.cbegin(), l.cend());
      |     ^~~
:8:5: note: Container accessed using foreign iterator argument
    8 |     us.insert(l.cbegin(), l.cend());
      |     ^~~
```
https://godbolt.org/z/arhEMh6Go




[llvm-bugs] [Bug 131986] Provide updated clang-19 release

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 131986
Summary: Provide updated clang-19 release
Labels: new issue
Assignees: (none)
Reporter: mpusz

It seems that [APT releases](https://releases.llvm.org/) are far behind the LLVM official release. The last available from the 19 stream is clang-19.1. This release had bugs that were fixed but never released. This means that, for example, my project [mp-units](https://mpusz.github.io/mp-units/latest/) can't be compiled on most Ubuntu machines using clang-19 and also does not compile on the Compiler Explorer.




[llvm-bugs] [Bug 132013] `clang-18.1.8` crash in "Early Machine Loop Invariant Code Motion"

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132013
Summary: `clang-18.1.8` crash in "Early Machine Loop Invariant Code Motion"
Labels: new issue
Assignees: (none)
Reporter: gonnet

Hi all,

I get the following compiler crash:

```
Stack dump:
0.	Program arguments: /usr/lib/llvm-18/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object -fcolor-diagnostics -fno-omit-frame-pointer -g0 -O2 -D_FORTIFY_SOURCE=1 -DNDEBUG -ffunction-sections -fdata-sections -MD -MF bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.d -frandom-seed=bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.o -DPTHREADPOOL_NO_DEPRECATED_API -DXNN_LOG_LEVEL=0 -DXNN_ENABLE_CPUINFO=1 -DXNN_ENABLE_MEMOPT=1 -DXNN_ENABLE_SPARSE=1 -DXNN_ENABLE_ASSEMBLY=1 -DXNN_ENABLE_ARM_FP16_SCALAR=0 -DXNN_ENABLE_ARM_FP16_VECTOR=0 -DXNN_ENABLE_ARM_BF16=0 -DXNN_ENABLE_ARM_DOTPROD=0 -DXNN_ENABLE_ARM_I8MM=0 -DXNN_ENABLE_RISCV_FP16_VECTOR=0 -DXNN_ENABLE_AVX512AMX=1 -DXNN_ENABLE_AVX512FP16=1 -DXNN_ENABLE_AVX512BF16=1 -DXNN_ENABLE_AVXVNNI=1 -DXNN_ENABLE_AVXVNNIINT8=1 -DXNN_ENABLE_AVX512F=1 -DXNN_ENABLE_AVX256SKX=1 -DXNN_ENABLE_AVX256VNNI=1 -DXNN_ENABLE_AVX256VNNIGFNI=1 -DXNN_ENABLE_AVX512SKX=1 -DXNN_ENABLE_AVX512VBMI=1 -DXNN_ENABLE_AVX512VNNI=1 -DXNN_ENABLE_AVX512VNNIGFNI=1 -DXNN_ENABLE_HVX=0 -DXNN_ENABLE_KLEIDIAI=0 -DXNN_ENABLE_SRM_SME=0 -DXNN_ENABLE_ARM_SME2=0 -DXNN_ENABLE_WASM_REVECTORIZE=0 -iquote . 
-iquote bazel-out/k8-opt/bin -iquote external/+_repo_rules+pthreadpool -iquote bazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool -iquote external/+_repo_rules+FXdiv -iquote bazel-out/k8-opt/bin/external/+_repo_rules+FXdiv -iquote external/+_repo_rules+cpuinfo -iquote bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo -Ibazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool/_virtual_includes/pthreadpool -Ibazel-out/k8-opt/bin/external/+_repo_rules+FXdiv/_virtual_includes/FXdiv -Ibazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/_virtual_includes/cpuinfo -isystem include -isystem bazel-out/k8-opt/bin/include -isystem src -isystem bazel-out/k8-opt/bin/src -isystem external/+_repo_rules+pthreadpool/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+pthreadpool/include -isystem external/+_repo_rules+FXdiv/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+FXdiv/include -isystem external/+_repo_rules+cpuinfo/include -isystem bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/include -isystem external/+_repo_rules+cpuinfo/src -isystem bazel-out/k8-opt/bin/external/+_repo_rules+cpuinfo/src -Wno-unused-but-set-variable -mstack-alignment=64 -fomit-frame-pointer -mstackrealign -mf16c -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -mavx512vl -mavx512vnni -mgfni -mavx512fp16 -std=c99 -O2 -c src/f16-vsin/gen/f16-vsin-avx512fp16-rational-3-2-div.c -o bazel-out/k8-opt/bin/_objs/avx512fp16_prod_microkernels/f16-vsin-avx512fp16-rational-3-2-div.o -no-canonical-prefixes -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\"
1.	 parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'src/f16-vsin/gen/f16-vsin-avx512fp16-rational-3-2-div.c'.
4.	Running pass 'Early Machine Loop Invariant Code Motion' on function '@xnn_f16_vsin_ukernel__avx512fp16_rational_3_2_div_u32'
```
```
$ /usr/lib/llvm-18/bin/clang --version
Debian clang version 18.1.8 (16)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-18/bin
```
I've attached the pre-processed source file generated with the same arguments as the command above.

[f16-vsin-avx512fp16-rational-3-2-div.c.gz](https://github.com/user-attachments/files/19340278/f16-vsin-avx512fp16-rational-3-2-div.c.gz)

I tried reproducing with version `19.1.7`, but that doesn't fail.

Pending a fix, is there anything I can do to work around this issue?

Cheers, Pedro




[llvm-bugs] [Bug 132024] lldb-server fails to connect on version 20.1+

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132024
Summary: lldb-server fails to connect on version 20.1+
Labels: new issue
Assignees: (none)
Reporter: azais-corentin

Hello,

Since LLVM 20, I'm not able to use remote debugging anymore.

I start the server in one terminal:
```
lldb-server-20 platform --server --listen *:1234 --min-gdbserver-port 31400 --max-gdbserver-port 31500
```

Then when I connect, I get the following errors:

```
> lldb-20
> (lldb) platform select remote-linux
> (lldb) platform connect connect://127.0.0.1:1234
> error: spawn_process failed: execve failed: No such file or directory
> error: Connection shut down by remote side while waiting for reply to initial handshake packet
```

The exact same steps work without any errors on LLVM 19.




[llvm-bugs] [Bug 132055] Code that uses a libcall marks internal symbol as exported when making object file

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132055
Summary: Code that uses a libcall marks internal symbol as exported when making object file
Labels: new issue
Assignees: (none)
Reporter: gbaraldi

This is showing up as https://github.com/JuliaLang/julia/pull/57658#issuecomment-2727385403, which is blocking Julia from compiling with LLVM 20.

The issue is that defining a libcall in a module and then using it makes the libcall symbol external:
```llvm
source_filename = "text"
target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.14.0-macho"

declare i16 @julia_float_to_half(float)

define internal i16 @__truncsfhf2(float %0) {
  %2 = call i16 @julia_float_to_half(float %0)
  ret i16 %2
}

define hidden swiftcc half @julia_fp(float %0) {
  %2 = fptrunc float %0 to half
  ret half %2
}

@llvm.compiler.used = appending global [1 x ptr] [ptr @__truncsfhf2], section "llvm.metadata"
```

```
 T ___truncsfhf2
 U _julia_float_to_half
0010 T _julia_fp
usr/lib/julia on  debug-llvm-20 [?] 
❯ llc lala.ll -filetype=obj -o llvm19.o
usr/lib/julia on  debug-llvm-20 [?] 
❯ nm llvm19.o
 t ___truncsfhf2
 U _julia_float_to_half
0010 T _julia_fp
```
I'm not sure whether this change is intentional, but the behavior differs between LLVM 19 and 20.




[llvm-bugs] [Bug 132065] Potential OOB access in getMaskVecValue in CGBuiltin.cpp

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132065
Summary: Potential OOB access in getMaskVecValue in CGBuiltin.cpp
Labels: new issue
Assignees: (none)
Reporter: jurahul

`getMaskVecValue` in CGBuiltin.cpp contains this code:

```
  if (NumElts < 8) {
    int Indices[4];
    for (unsigned i = 0; i != NumElts; ++i)
      Indices[i] = i;
```

If `NumElts` > 4, the `Indices` array will be accessed out of bounds. The fix might be as simple as bumping the size to 8.
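
The suggested fix can be sketched in isolation as follows. This is a hypothetical stand-alone model, not the actual CGBuiltin.cpp code: `identity_indices` is an invented name, and the sketch only assumes what the quoted snippet shows, namely that the loop runs under a `NumElts < 8` guard.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-alone model of the index-building step; it is not the
// actual CGBuiltin.cpp code. The buffer is sized to match the `NumElts < 8`
// guard, which is the fix suggested above.
inline std::vector<int> identity_indices(unsigned NumElts) {
  assert(NumElts < 8 && "mirrors the guard in the quoted snippet");

  std::array<int, 8> Indices{};  // was `int Indices[4]` in the report
  for (unsigned i = 0; i != NumElts; ++i)
    Indices[i] = static_cast<int>(i);

  // Return only the populated prefix.
  return std::vector<int>(Indices.begin(), Indices.begin() + NumElts);
}
```

With the buffer sized to the guard, values of `NumElts` from 5 to 7 stay in bounds. Whether such values can actually reach the real function is not stated in the report.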




[llvm-bugs] [Bug 132069] [ms] [llvm-ml] Support for `NEAR` in `EXTERN` directives

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132069
Summary: [ms] [llvm-ml] Support for `NEAR` in `EXTERN` directives
Labels: new issue
Assignees: (none)
Reporter: MisterDA

```console
$ cat > test.asm <
```

> Type may be any of `near`, `far`, `proc`, `byte`, `word`, `dword`, `qword`, `tbyte`, `abs` (absolute, which is a constant), or some other user defined type.

I'm interested in support for `near`.
Cf #131707 cc @ericastor 





[llvm-bugs] [Bug 132074] [ms] [llvm-ml] Support macros with parentheses and comma delimiters between arguments

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132074
Summary: [ms] [llvm-ml] Support macros with parentheses and comma delimiters between arguments
Labels: new issue
Assignees: (none)
Reporter: MisterDA

I have a MASM file with a macro that I call with parentheses, using a comma as the delimiter between arguments because parameters may contain spaces:

```asm
i = 0
SubstitutionMacro MACRO _type:REQ, name:REQ
    field_&name EQU i
    i = i + 1
    EXITM <>
ENDM

SubstitutionMacro(int, a)
SubstitutionMacro(int*, b)
SubstitutionMacro(struct type, c)
SubstitutionMacro(struct type*, d)
.CODE
mov r12, field_a
mov r13, field_b
mov r14, field_c
mov r15, field_d
END
```

`llvm-ml` currently fails at assembling this file:

``` console
$ ml64 -nologo -Cp -c -Fo test.obj test.asm
 Assembling: test.asm
$ objdump -D --section='.text$mn' test.obj

test.obj: file format pe-x86-64


Disassembly of section .text$mn:

 <.text$mn>:
   0:   49 c7 c4 00 00 00 00    mov    $0x0,%r12
   7:   49 c7 c5 01 00 00 00    mov    $0x1,%r13
   e:   49 c7 c6 02 00 00 00    mov    $0x2,%r14
  15:   49 c7 c7 03 00 00 00    mov    $0x3,%r15
```

Cf #129905 cc @ericastor 




[llvm-bugs] [Bug 132071] [RISC-V] Miscompile on rv64gcv with -O[23]

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132071
Summary: [RISC-V] Miscompile on rv64gcv with -O[23]
Labels: new issue
Assignees: (none)
Reporter: ewlu

Testcase:
```c
short a[12];
short b[12];
long long al;
int f;
short m = 31554;
long long q[12];
int main() {
  for (int i = 0; i < 2; ++i)
    q[i] = 6;
  for (short i = 0; i < 12; i += m - 31553) {
    a[i] = q[i];
    b[i] = f > q[i];
  }
  for (int i = 0; i < 12; ++i)
    al += a[i];
  for (int i = 0; i < 2; ++i)
    al += b[i];
  __builtin_printf("%llu\n", al);
}
```

Commands:
```
# riscv
$ ./bin/clang -march=rv64gcv_zvl256b -flto -O2 red.c -o user-config.out
$ QEMU_CPU=rv64,vlen=256,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true timeout --verbose -k 0.1 4 ./bin/qemu-riscv64 user-config.out 1
10

$ ./bin/clang -march=rv64gcv_zvl256b -flto -O3 red.c -o user-config.out
$ QEMU_CPU=rv64,vlen=256,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true timeout --verbose -k 0.1 4 ./bin/qemu-riscv64 user-config.out 1
10

# x86
$ ./native.out 1
12
```
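
To show why 12 is the expected output, the arithmetic of the testcase can be checked at compile time. The sketch below is a hypothetical C++14 model, not part of the report: `expected_al` is an invented name, the globals become locals, and `m` is treated as a constant, which the original deliberately avoids by making it a mutable global.

```cpp
// Hypothetical compile-time model of the testcase. Globals become locals and
// `m` becomes a constant, so this only documents the expected arithmetic.
constexpr long long expected_al() {
  short a[12] = {};
  short b[12] = {};
  long long q[12] = {};
  long long al = 0;
  const int f = 0;        // zero-initialized global in the testcase
  const short m = 31554;  // loop step is m - 31553 == 1

  for (int i = 0; i < 2; ++i)
    q[i] = 6;
  for (short i = 0; i < 12; i += m - 31553) {
    a[i] = q[i];
    b[i] = f > q[i];
  }
  for (int i = 0; i < 12; ++i)
    al += a[i];
  for (int i = 0; i < 2; ++i)
    al += b[i];
  return al;
}

// Evaluated by the compiler front end, independent of -O2/-O3 codegen.
static_assert(expected_al() == 12, "a[0] + a[1] == 6 + 6, b[] is all zero");
```

Only `a[0]` and `a[1]` are nonzero (6 each), and every `b[i]` is 0 because `f` is 0, so the sum is 12. That matches the x86 result and suggests the value 10 printed on RISC-V is the wrong one.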

Godbolt: https://godbolt.org/z/7hMW6eerT

Bisected to 9b7282e545d5e47315e3ffb9e5e00d0fb547c8e3 as the first bad commit

Found via fuzzer




[llvm-bugs] [Bug 132084] [ms] [llvm-ml] Allow size directives on register operands

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132084
Summary: [ms] [llvm-ml] Allow size directives on register operands
Labels: new issue
Assignees: (none)
Reporter: MisterDA

```asm
.CODE
        mov rsp, r11
        mov qword ptr rsp, r11
        mov rsp, qword ptr r11
        mov qword ptr rsp, qword ptr r11
END
```

```console
$ ml64 -nologo -Cp -c -Fo test.obj -W3 test.asm
 Assembling: test.asm
$ objdump -D --section='.text$mn' test.obj

test.obj: file format pe-x86-64


Disassembly of section .text$mn:

 <.text$mn>:
   0:   49 8b e3                mov    %r11,%rsp
   3:   49 8b e3                mov    %r11,%rsp
   6:   49 8b e3                mov    %r11,%rsp
   9:   49 8b e3                mov    %r11,%rsp
$ llvm-ml -m64 -nologo -c -Fo test.obj test.asm
test.asm:3:23: error: expected memory operand after 'ptr', found register operand instead
        mov qword ptr rsp, r11
                      ^
test.asm:4:28: error: expected memory operand after 'ptr', found register operand instead
        mov rsp, qword ptr r11
                           ^
test.asm:5:23: error: expected memory operand after 'ptr', found register operand instead
        mov qword ptr rsp, qword ptr r11
                      ^
```

I'm not sure whether the size directive is needed here, but llvm-ml exits with an error, while Microsoft's ml64 accepts a register-to-register move with a size directive.

cc @ericastor 




[llvm-bugs] [Bug 132020] Can `-fmemory-profile-use` be used together with `-fmemory-profile`?

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132020
Summary: Can `-fmemory-profile-use` be used together with `-fmemory-profile`?
Labels: new issue
Assignees: (none)
Reporter: zcfh

## Background
I am currently trying to use memprof to optimize my program, but my tests did not show any obvious performance gains. **I would now like to develop a tool to check whether the optimization is really effective.**
## Idea
My idea is to use `-fmemory-profile` and `-fmemory-profile-use` at the same time, so that memory accesses after the memprof optimization can be collected. This information could then be post-processed to determine whether the correct clones were made. Furthermore, could a heat map be drawn based on the memory access pattern, similar to BOLT?
## Question
1. I made a first attempt and found that `-fmemory-profile` and `-fmemory-profile-use` cannot be used at the same time. Is this disallowed by design?
2. **Are there other observation tools?** I looked through the options and only found `memprof-export-to-dot`, but this only shows whether cloning was done and cannot show whether the cloning result is good enough.




[llvm-bugs] [Bug 132068] CUDA/HIP: lambda capture of constexpr variable inconsistent between host and device

2025-03-19 Thread LLVM Bugs via llvm-bugs


Issue: 132068
Summary: CUDA/HIP: lambda capture of constexpr variable inconsistent between host and device
Labels: (none)
Assignees: (none)
Reporter: mkuron

Consider the following bit of HIP code:

```c++
#include 

using std::max;

template <typename F>
static void __global__
kernel(F f)
{
  f(1);
}

void test(float const * fl, float const * A, float * Vf)
{
  float constexpr small(1.0e-25);

  auto f = [=] __device__ __host__ (unsigned int n) {
    float const value = max(small, fl[0]);
    Vf[0] = value * A[0];
  };
  static_assert(sizeof(f) == sizeof(fl) + sizeof(A) + sizeof(Vf));
  kernel<<<1,1>>>(f);
}
```
The `static_assert` fails in the host-side compilation but succeeds in the device-side compilation. This means that the layout of the struct synthesized from the lambda is inconsistent between host and device, so if you use any of the captured variables on the device side, they will contain the data of some of the other variables. You can also use `-Xclang -fdump-record-layouts` to see that. Evidently the `constexpr` variable is part of the captured variables only on the host side, but not on the device side.
With `--cuda-host-only`:
```
*** Dumping AST Record Layout
         0 | class (lambda at :23:12)
         0 |   const float *
         8 |   const float
        16 |   float *
        24 |   const float *
           | [sizeof=32, dsize=32, align=8,
           |  nvsize=32, nvalign=8]
```
With `--cuda-device-only`:
```
*** Dumping AST Record Layout
         0 | class (lambda at :23:12)
         0 |   const float *
         8 |   float *
        16 |   const float *
           | [sizeof=24, dsize=24, align=8,
           |  nvsize=24, nvalign=8]
```
Godbolt: https://cuda.godbolt.org/z/KE789sevs.

When you compile the exact same code for CUDA, this does not happen. However, if you add
```c++
template ::value, int> = 0>
__host__ T max(const T a, const T b) {
return std::max(a, b);
}
```
after line 3 of the code at the top, you get the exact same layout discrepancy as with HIP. See https://cuda.godbolt.org/z/e3Ybr4hK1.

I can replace the `[=]` with `[fl, A, Vf]` and if that `__host__ T max` overload is present, it tells me that _variable 'small' cannot be implicitly captured in a lambda with no capture-default specified_, but if I leave out that overload it does not show that error message.
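
As background for the layout difference, the plain C++ sketch below shows how the signature of the callee decides whether a `constexpr` local is captured at all. It illustrates the language rule only and is not a confirmed diagnosis of the HIP/CUDA behavior; `max_by_value` and the two `closure_size_*` functions are hypothetical names, and closure sizes are implementation-specific.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical by-value counterpart to std::max, for illustration only.
inline float max_by_value(float a, float b) { return a < b ? b : a; }

// std::max takes `const float &`, so binding `small` to the reference
// odr-uses it and the [=] lambda has to capture a copy.
inline std::size_t closure_size_with_reference_max(float const *fl,
                                                   float const *A, float *Vf) {
  float constexpr small(1.0e-25f);
  auto f = [=](unsigned int) {
    float const value = std::max(small, fl[0]);
    Vf[0] = value * A[0];
  };
  return sizeof(f);
}

// Passing `small` by value only reads its constant value, which is not an
// odr-use, so the lambda captures just the three pointers.
inline std::size_t closure_size_with_value_max(float const *fl,
                                               float const *A, float *Vf) {
  float constexpr small(1.0e-25f);
  auto f = [=](unsigned int) {
    float const value = max_by_value(small, fl[0]);
    Vf[0] = value * A[0];
  };
  return sizeof(f);
}
```

If host and device compilation resolve `max` to overloads with different parameter types, this rule alone would be enough to produce closures of different sizes. Whether that is what happens in the reported case is an assumption that the report does not confirm.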

