[llvm-bugs] [Bug 122872] Add `vpavgb` and `vpavgw` patterns

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122872
Summary: Add `vpavgb` and `vpavgw` patterns
Labels: new issue
Assignees: (none)
Reporter: Validark
  




[Godbolt link](https://zig.godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:3,lang:zig,selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:'export+fn+foo(a:+@Vector(64,+u8),+b:+@Vector(64,+u8))+@Vector(64,+u8)+%7B%0Areturn+(a+%2B+b+%2B+@as(@TypeOf(a),+@splat(1)))+%3E%3E+@splat(1)%3B%0A%7D%0A'),l:'5',n:'0',o:'Zig+source+%233',t:'0')),header:(),k:53.18930041152264,l:'4',m:100,n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:ztrunk,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1',verboseDemangling:'0'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:zig,libs:!(),options:'-O+ReleaseFast+-target+x86_64-linux+-mcpu%3Dznver5',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:3),l:'5',n:'0',o:'+zig+trunk+(Editor+%233)',t:'0')),header:(),k:46.81069958847738,l:'4',m:100,n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4)

```zig
export fn foo(a: @Vector(64, u8), b: @Vector(64, u8)) @Vector(64, u8) {
    return (a + b + @as(@TypeOf(a), @splat(1))) >> @splat(1);
}
```

Gives:


```asm
.LCPI0_1:
    .byte   0
    .byte   128
    .byte   64
    .byte   32
    .byte   16
    .byte   8
    .byte   4
    .byte   2
foo:
    vpaddb  zmm0, zmm0, zmm1
    vpternlogd  zmm1, zmm1, zmm1, 255
    vpsubb  zmm0, zmm0, zmm1
    vgf2p8affineqb  zmm0, zmm0, qword ptr [rip + .LCPI0_1]{1to8}, 0
    ret
```

Could probably give the `vpavgb` instruction?
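
For reference, here is a scalar sketch of the per-lane arithmetic involved. `pavgb_lane` models the rounding average that `vpavgb` computes, with the sum taken in a wider type; `narrow_lane` models the source expression if it were evaluated strictly in 8 bits with wrapping. Both function names are made up for illustration. The two agree whenever `a + b + 1` fits in 8 bits, so the fold presumably depends on the adds being known not to wrap. That is an assumption about what the Zig frontend emits in ReleaseFast, not something the report shows.

```cpp
#include <cstdint>

// Per-lane model of a rounding average: the sum a + b + 1 is formed in a
// wider type, so it cannot wrap before the shift.
constexpr std::uint8_t pavgb_lane(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((static_cast<unsigned>(a) + b + 1u) >> 1);
}

// The expression from the report, modelled with 8-bit wrapping arithmetic:
// the sum is truncated to 8 bits before the shift.
constexpr std::uint8_t narrow_lane(std::uint8_t a, std::uint8_t b) {
    const std::uint8_t wrapped_sum = static_cast<std::uint8_t>(a + b + 1);
    return static_cast<std::uint8_t>(wrapped_sum >> 1);
}
```

With `a = 3, b = 4` both functions return 4. With `a = 200, b = 100` the wide version returns 150 and the wrapping version returns 22, which is why the equivalence only holds under a no-wrap assumption.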


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 122869] [MLIR] Bufferization behaving differently based on the position of extract_slice.

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122869
Summary: [MLIR] Bufferization behaving differently based on the position of extract_slice.
Labels: mlir:bufferization
Assignees: (none)
Reporter: pashu123
  




**Input IR:**

```
func.func @test_one(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> {
  %c0 = arith.constant 0 : index
  %0 = tensor.empty() : tensor<64x64xf16>
  %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16>
  %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16>
  %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16>
  %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16>
  return %inserted_slice : tensor<1x64x1x64xf16>
}

func.func @test_two(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> {
  %c0 = arith.constant 0 : index
  %0 = tensor.empty() : tensor<64x64xf16>
  %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16>
  %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16>
  %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16>
  %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16>
  return %inserted_slice : tensor<1x64x1x64xf16>
}
```
**Command**:  `mlir-opt above.mlir -eliminate-empty-tensors -canonicalize`

**Output**

```
module {
  func.func @test_one(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> {
    %c0 = arith.constant 0 : index
    %0 = tensor.empty() : tensor<64x64xf16>
    %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16>
    %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16>
    %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16>
    %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16>
    return %inserted_slice : tensor<1x64x1x64xf16>
  }

  func.func @test_two(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> {
    %c0 = arith.constant 0 : index
    %0 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16>
    %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16>
    %extracted_slice_0 = tensor.extract_slice %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<1x64x1x64xf16> to tensor<64x64xf16>
    %1 = vector.transfer_write %0, %extracted_slice_0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16>
    %inserted_slice = tensor.insert_slice %1 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16>
    return %inserted_slice : tensor<1x64x1x64xf16>
  }
}
```
The only difference between `test_one` and `test_two` is the placement of `tensor.extract_slice`. `test_one` keeps the `tensor.empty` buffer, whereas `test_two` eliminates it and reuses the extracted slice.

@matthias-springer Could you suggest what is happening here? Do you know if this is the intended behaviour?





[llvm-bugs] [Bug 122891] Running lowermodulelds multiple times needs to be thinlto only, breaking fortran at present

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122891
Summary: Running lowermodulelds multiple times needs to be thinlto only, breaking fortran at present
Labels: new issue
Assignees: arsenm, ergawy, JonChesterfield, Pierre-vh, jhuber6
Reporter: JonChesterfield
  




As of https://github.com/llvm/llvm-project/pull/85626 and https://github.com/llvm/llvm-project/pull/75333 the lowermodulelds codegen pass is run as part of LTO. That doesn't work - the pass was designed to run once as part of codegen where it can globally allocate variables to the LDS space.

There is a narrow exemption carved out to unblock thinlto - provided the IR module is carved up into independent modules prior to codegen, no calls or references between them, running the allocator on each subgraph works ok and a second run during codegen over the entire module is a no-op. There's a partial check to notice when that invariant is breached - if the input IR has some allocated variables and some non-allocated variables, the pass aborts. That's the error message which @ergawy reported to me for Fortran.

I think the pass should be added to the thinlto pipeline and removed from the full lto pipeline, and generally only run once during codegen except for the thinlto case.

Arguments could be made that the pass should cope with being run multiple times on various bits of IR and spliced together. The problem with running on subgraphs is the reachable test can't be done - we don't know if a call to an external function accesses some visible LDS, or if a function can be called by an external kernel. A correct lowering then looks like a lot of table lookups and overallocation, at which point the user is likely to discover they've run out of LDS and/or have occupancy problems.

Tagging Joseph as well since I think he moved the openmp pipeline to use LTO at some point, in which case that's also vulnerable to this pattern. I suspect we're getting by at present because O0 doesn't get much use and most compilation flows are straightforward, e.g. they don't run opt on bits of IR manually.




[llvm-bugs] [Bug 122892] cannot compile this l-value expression yet

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122892
Summary: cannot compile this l-value expression yet
Labels: new issue
Assignees: (none)
Reporter: egorpugin
  




```
clang++ -v
clang version 20.0.0git (https://github.com/llvm/llvm-project c047a5b3f6e2295dd74f1e8f17f1a023150b246c)
Target: x86_64-pc-windows-msvc
Thread model: posix
```

`clang++ main.cpp -std=c++20`

```
struct A {
    void f() {
        l(r);
    }
    static void r() {}
    static auto l(auto &&f) {} // error
    //static void l(auto &&f) {} // ok
};

int main() {
    A k;
    k.f();
}
```




[llvm-bugs] [Bug 122929] [DependenceAnalysis] Missing dependency detection for instructions accessing same memory location on different iterations

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122929
Summary: [DependenceAnalysis] Missing dependency detection for instructions accessing same memory location on different iterations
Labels: new issue
Assignees: (none)
Reporter: 1997alireza
  




Dependence analysis (DA) can check the dependency of an instruction with itself, but this check has a bug: if an instruction accesses the same memory location in different loop iterations, DA fails to detect the dependency. For example, the expression `A[i - j]` accesses the same memory location at `i = j = 0` and at `i = j = 1`, yet the current analysis reports no dependency, which can lead to incorrect decisions in optimization passes such as loop-interchange.
Consider the following code:
```cpp
int main() {
    const int N = 21;
    int A[N] = {0};
    int *B = &A[10];

    for (int i = 1; i <= 10; i++)
        for (int j = 10; j >= 1; j--)
            B[i - j] = 2 * i - j;

    return 0;
}
```
Based on the information provided by DA, this example passes the legality check in the loop interchange pass, yet performing the interchange would give incorrect results. In the original code, the final value is stored to `B[0]` when `i = j = 10`, so `B[0]` ends up as 10. After loop interchange, the last store to `B[0]` happens at `i = j = 1`, which incorrectly sets `B[0]` to 1.
The issue arises from incorrect reasoning in the `depends()` API. Two variables in `depends()` are involved. The first is `PossiblyLoopIndependent`, which is set to false if the src and dst cannot have a loop-independent dependency. The other is `AllEqual`, which is computed within the API and is set to true if the dependency vector elements are all `Equals`.
For our case, when we analyse the dependency of the `B[i - j]` store instruction with itself, `PossiblyLoopIndependent` is false and `AllEqual` is true. According to [the API](https://github.com/llvm/llvm-project/blob/19557a4c8fab0dbfe9d9c53b99b7960ef211684e/llvm/lib/Analysis/DependenceAnalysis.cpp#L3970), if `AllEqual == true` and `PossiblyLoopIndependent == false`, DA concludes there is no dependency, which is incorrect and causes the bug.
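
The effect of the interchange on the example above can be checked directly by running both loop orders. This is only a sketch of the source-level behaviour; the function names `run_original` and `run_interchanged` are made up here and it does not exercise DA or the loop-interchange pass itself.

```cpp
#include <array>

constexpr int kN = 21;

// Loop nest as written in the example: i is the outer loop, j the inner one.
inline std::array<int, kN> run_original() {
    std::array<int, kN> A{};
    int *B = A.data() + 10;  // same role as `int *B = &A[10]`
    for (int i = 1; i <= 10; i++)
        for (int j = 10; j >= 1; j--)
            B[i - j] = 2 * i - j;
    return A;
}

// Same statement with the two loops swapped: j is now the outer loop.
inline std::array<int, kN> run_interchanged() {
    std::array<int, kN> A{};
    int *B = A.data() + 10;
    for (int j = 10; j >= 1; j--)
        for (int i = 1; i <= 10; i++)
            B[i - j] = 2 * i - j;
    return A;
}
```

`B[0]` is `A[10]` in both arrays. Comparing `run_original()[10]` with `run_interchanged()[10]` gives 10 and 1 respectively, matching the values described above.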

Additionally, by modifying a test case in DA, we can expose the same bug. In [this test case](https://github.com/llvm/llvm-project/blob/19557a4c8fab0dbfe9d9c53b99b7960ef211684e/llvm/test/Analysis/DependenceAnalysis/ExactRDIV.ll#L583), storing to `A[11*i - j]` has no dependency with itself. By modifying the subscript to `A[10*i - j]`, the memory location `A[0]` is accessed in iterations `i = j = 0` and `i = 1, j = 10`. Still, DA does not detect any dependency.
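
The difference between the two subscripts can be seen by counting how many pairs of distinct iterations hit the same element. The sketch below assumes both induction variables run from 0 to 10 inclusive; those bounds are an assumption chosen to cover the iterations mentioned above, not values read from `ExactRDIV.ll`, and `count_self_collisions` is a hypothetical helper.

```cpp
#include <map>

// Counts unordered pairs of distinct iterations (i, j) that access the same
// element of A[coef * i - j], for 0 <= i <= max_i and 0 <= j <= max_j.
inline long count_self_collisions(int coef, int max_i, int max_j) {
    std::map<int, long> hits;  // subscript value -> number of iterations using it
    for (int i = 0; i <= max_i; ++i)
        for (int j = 0; j <= max_j; ++j)
            ++hits[coef * i - j];

    long pairs = 0;
    for (const auto &entry : hits)
        pairs += entry.second * (entry.second - 1) / 2;
    return pairs;
}
```

Under these assumed bounds, `count_self_collisions(11, 10, 10)` finds no colliding iterations, while `count_self_collisions(10, 10, 10)` finds ten, including the pair `i = j = 0` and `i = 1, j = 10`.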

In addition to this bug, the `PossiblyLoopIndependent` variable seems to have lost its intended purpose. The flag is always set to true when `depends()` is called, so we may want to reconsider its utility and possibly remove it from the function signature entirely. We can post a patch for that.




[llvm-bugs] [Bug 122932] [HLSL] sizeof(bool) should always be 4

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122932
Summary: [HLSL] sizeof(bool) should always be 4
Labels: new issue
Assignees: spall
Reporter: spall
  




Currently, HLSL functions that take or return a bool represent it as an `i1`, but bool should always be an `i32`.
Relatedly, the `i1` is extended to an `i8` in the function body, which is also an error.
https://hlsl.godbolt.org/z/Ksqsva9q3




[llvm-bugs] [Bug 122934] [TySan] possible false positive with memcpy()?

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122934
Summary: [TySan] possible false positive with memcpy()?
Labels: false-positive
Assignees: (none)
Reporter: seanm
  




I have a TySan report that I think/thought is a false positive. So I used creduce to try and minimize it.  It reduced down to the following:

```c
typedef struct {
  int a;
  int b;
  int c;
  int i;
  int d;
  int e;
  int f;
  int dim[8];
  long g
} h;
int k, m = __builtin_object_size(&k, 0);
h b;
void j();
void n(void *o) {
  __builtin___memcpy_chk(&k, o, sizeof(0), m);
  k; //  TySan complains here, line 17
}
void calloc();
h *c = calloc;
void l() {
  c->g;
  b = *c;
  j(0, b);
}
void j(int, void *p) { n(p + 64); }
void main() { l(); }
```

TySan reports:

```
==79752==ERROR: TypeSanitizer: type-aliasing-violation on address 0x000100c1c010 (pc 0x000100c168ec bp 0x00016f1eac30 sp 0x00016f1ea3b0 tid 32786625)
READ of size 4 at 0x000100c1c010 with type int accesses an existing object of type long (in  at offset 64)
#0 0x000100c168e8 in n test-preprocessed.c:17
```

Line 17 looks like it does nothing, but indeed if I comment it out, TySan no longer warns.

This reduced code is pretty nonsensical of course, but in the real code, it's using memcpy() precisely to avoid strict aliasing violations.  Isn't memcpy() supposed to be the kosher way to copy anything of any size/alignment to anything else?
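
For context, the pattern the real code relies on looks roughly like the following. It is a generic sketch, not the code from the project in question, and `load_u32` is a hypothetical name. Copying the bytes with `memcpy` into an object of the destination type, then reading that object, is the usual way to reinterpret storage without an aliasing violation.

```cpp
#include <cstdint>
#include <cstring>

// Reads 4 bytes from arbitrary storage as a uint32_t. The bytes are copied
// into `value`, and only `value` (whose declared type is uint32_t) is read.
inline std::uint32_t load_u32(const void *src) {
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

For example, on a platform with 4-byte IEEE-754 `float` (an assumption), `load_u32` applied to a `float` holding `1.0f` returns `0x3f800000`.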

Also, and maybe I should make another ticket, but it's a shame the report says:

- `(in  at offset 64)`

instead of:

- `(in 'struct h' at offset 64, field 'g')`




[llvm-bugs] [Bug 122937] llvm/lib/CodeGen/LiveRangeUtils.h:40: Iterators applied to wrong data structure ?

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122937
Summary: llvm/lib/CodeGen/LiveRangeUtils.h:40: Iterators applied to wrong data structure ?
Labels: llvm:codegen, code-quality
Assignees: (none)
Reporter: dcb314
  




Static analyser cppcheck says:

llvm/lib/CodeGen/LiveRangeUtils.h:40:6: error: Same iterator is used with different containers 'LR' and 'LR.segments'. [iterators1]

Source code is

  typename LiveRangeT::iterator J = LR.begin(), E = LR.end();
  // ...
  LR.segments.erase(J, E);

So J and E are iterators for LR, but get used on LR.segments.

I am surprised this compiles.
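
One possible explanation, sketched with a stand-in type and not the real LLVM class: if the range type's `iterator` is just an alias for the iterator of its `segments` container, and `begin()`/`end()` forward to that container, then the iterators are valid for `segments.erase`. Whether `LiveRange` is declared this way should be checked against its header; the `Range` struct and `drop_from` function below are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a range type whose iterators are the container's iterators.
struct Range {
    using Segments = std::vector<int>;
    using iterator = Segments::iterator;

    Segments segments;

    iterator begin() { return segments.begin(); }
    iterator end() { return segments.end(); }
};

// Erases every element from position `first` to the end, using iterators
// obtained from the Range itself and handing them to the inner container.
inline void drop_from(Range &R, std::size_t first) {
    Range::iterator J = R.begin() + static_cast<std::ptrdiff_t>(first);
    Range::iterator E = R.end();
    R.segments.erase(J, E);
}
```

In this stand-in the two "containers" share one iterator type and one underlying sequence, so the code compiles and behaves as intended, even though a static analyser sees two different objects.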






[llvm-bugs] [Bug 122905] [GISel] Missing combines to expose common subexpressions and eliminate them

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122905
Summary: [GISel] Missing combines to expose common subexpressions and eliminate them
Labels: llvm:globalisel
Assignees: (none)
Reporter: qcolombet
  




When running GISel on the given input IR, we end up generating essentially a one-to-one mapping with the input IR whereas most of the computation is just duplicated.
SDISel performs a much better job by doing `instcombine`-like optimizations that allow it to simplify the IR and eventually expose the CSE opportunity.

This is not that surprising given that historically GISel has had a garbage in garbage out approach, but it may make sense to strengthen GISel combines for optimized builds.

Note: I observed this with AMDGPU but I suspect it affects all backends.

# To Reproduce #

Download the attached IR or copy paste the snippet below and run
```bash
llc -O3 -march=amdgcn -mcpu=gfx942  -mtriple amdgcn-amd-hmcsa -global-isel=<0|1> repro.ll -o isel.s 
```

# Result #

GISel ends up generating two `fdiv` instructions whereas SDISel is able to CSE the whole thing and produces just one.

GISel:
```asm
v_div_scale_f32 v1, s[0:1], v0, v0, 1.0
v_mov_b32_e32 v7, v2
v_mov_b32_e32 v2, v3
v_mov_b32_e32 v3, v4
v_rcp_f32_e32 v4, v1
v_div_scale_f32 v5, vcc, 1.0, v0, 1.0
v_fma_f32 v8, -v1, v4, 1.0
v_fmac_f32_e32 v4, v8, v4
v_mul_f32_e32 v8, v5, v4
v_fma_f32 v9, -v1, v8, v5
v_fmac_f32_e32 v8, v9, v4
v_fma_f32 v1, -v1, v8, v5
v_div_fmas_f32 v5, v1, v4, v8
v_div_fixup_f32 v5, v5, v0, 1.0
v_div_fmas_f32 v1, v1, v4, v8
v_div_fixup_f32 v0, v1, v0, 1.0
```

SDISel:
```asm
v_div_scale_f32 v1, s[0:1], v0, v0, 1.0
v_rcp_f32_e32 v6, v1
s_nop 0
v_fma_f32 v7, -v1, v6, 1.0
v_fmac_f32_e32 v6, v7, v6
v_div_scale_f32 v7, vcc, 1.0, v0, 1.0
v_mul_f32_e32 v8, v7, v6
v_fma_f32 v9, -v1, v8, v7
v_fmac_f32_e32 v8, v9, v6
v_fma_f32 v1, -v1, v8, v7
v_div_fmas_f32 v1, v1, v6, v8
v_div_fixup_f32 v0, v1, v0, 1.0
```

# IR Snippet #
```llvm
define void @foo(<1 x float> %in, ptr %out, ptr %out2) {
  %t1174 = insertvalue [4 x <1 x float>] zeroinitializer, <1 x float> %in, 0
  %t1175 = insertvalue [4 x <1 x float>] %t1174, <1 x float> %in, 1
  %t1176 = insertvalue [4 x <1 x float>] %t1175, <1 x float> %in, 2
  %t1177 = insertvalue [4 x <1 x float>] %t1176, <1 x float> %in, 3
  %t1178 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] zeroinitializer, [4 x <1 x float>] %t1177, 0, 0, 0, 0
  %t1179 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1178, [4 x <1 x float>] %t1177, 0, 0, 1, 0
  %t1180 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1179, [4 x <1 x float>] %t1177, 0, 0, 2, 0
  %t1181 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1180, [4 x <1 x float>] %t1177, 0, 0, 3, 0
  %t1182 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1181, [4 x <1 x float>] %t1177, 1, 0, 0, 0
  %t1183 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1182, [4 x <1 x float>] %t1177, 1, 0, 1, 0
  %t1184 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1183, [4 x <1 x float>] %t1177, 1, 0, 2, 0
  %t1185 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1184, [4 x <1 x float>] %t1177, 1, 0, 3, 0
  %t1186 = extractvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1178, 0, 0, 0, 0, 0
  %t1187 = fdiv <1 x float> splat (float 1.00e+00), %t1186
  %t1188 = extractvalue [2 x [1 x [4 x [1 x [4 x <1 x float>]]]]] %t1178, 0, 0, 0, 0, 1
  %t1189 = fdiv <1 x float> splat (float 1.00e+00), %t1188

  store <1 x float> %t1187, ptr %out
  store <1 x float> %t1189, ptr %out2
  ret void
}
```

# Note #

SDISel is essentially able to do the equivalent of:
```bash
opt -passes=instcombine,early-cse -S repro.ll -o -
```

Which yields:
```llvm
define void @foo(<1 x float> %in, ptr %out, ptr %out2) {
  %t1187 = fdiv <1 x float> splat (float 1.00e+00), %in
  store <1 x float> %t1187, ptr %out, align 4
  store <1 x float> %t1187, ptr %out2, align 4
  ret void
}
```

[repro.ll.txt](https://github.com/user-attachments/files/18411275/repro.ll.txt)





[llvm-bugs] [Bug 122913] [SCEV] Another SEGV/stack overflow in LoopGuards

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122913
Summary: [SCEV] Another SEGV/stack overflow in LoopGuards
Labels: llvm:SCEV, crash-on-valid
Assignees: juliannagele
Reporter: danilaml
  




Similar to https://github.com/llvm/llvm-project/issues/120615. Looks like the fix wasn't a complete one. Here is an example:

```llvm
target triple = "x86_64-unknown-linux-gnu"

define ptr @f(i32 %0) {
  switch i32 0, label %bb4 [
    i32 1, label %bb4
    i32 2, label %bb4
    i32 3, label %bb4
    i32 4, label %bb1
    i32 5, label %bb4
    i32 6, label %bb4
  ]

bb:   ; No predecessors!
  switch i32 0, label %bb4 [
    i32 0, label %bb4
    i32 1, label %bb1
  ]

bb1:  ; preds = %bb2, %bb, %1
  %2 = phi i32 [ %3, %bb2 ], [ 0, %bb ], [ 0, %1 ]
  switch i32 %0, label %bb3 [
    i32 0, label %bb2
    i32 1, label %bb2
    i32 2, label %bb2
  ]

bb2:  ; preds = %bb1, %bb1, %bb1
  %3 = add i32 %2, 1
  %4 = icmp ult i32 %0, 0
  br i1 %4, label %bb1, label %bb4

bb3:  ; preds = %bb1
  unreachable

bb4:  ; preds = %bb2, %bb, %bb, %1, %1, %1, %1, %1, %1
  ret ptr null
}
```
Crashes with the same command line `opt -passes=nary-reassociate --scalar-evolution-use-expensive-range-sharpening`
godbolt: https://godbolt.org/z/4d3jo8jTz




[llvm-bugs] [Bug 122925] [libc++] Consider enabling fast hardening mode (or extensive) when optimizations are disabled

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122925
Summary: [libc++] Consider enabling fast hardening mode (or extensive) when optimizations are disabled
Labels: libc++, hardening
Assignees: ldionne
Reporter: ldionne
  




We can take for granted that someone building with `-O0` is doing some kind of debug build and doesn't care too much about performance. So it would probably make sense to enable the fast hardening mode (at least) in that case.

libstdc++ has done this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112808




[llvm-bugs] [Bug 122942] [mlir][presburger] "Isolated" local variables can produce crash or unexpected results in IntegerRelation/PresburgerRelation routines

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122942
Summary: [mlir][presburger] "Isolated" local variables can produce crash or unexpected results in IntegerRelation/PresburgerRelation routines
Labels: mlir:presburger
Assignees: christopherbate
Reporter: christopherbate
  




I've noticed an issue where "isolated" local variables can cause problems in the elimination of non-div locals (IntegerRelation -> PresburgerRelation) as well as in the set subtraction algorithm (`getSetDifference` in `PresburgerRelation.cpp`).

By "isolated" I mean that there is a local defined in the IntegerRelation system, but there may not be a constraint that links the local to other variables. This sounds contrived, but it can happen if a user constructs an IntegerRelation from an AffineMap where one of the domain variables is not used to calculate the range. Invoking `IntegerRelation::getRangeSet` would then turn all the domain variables into locals, resulting in a local which is not related to any other variable in the system.

If you then try to compute the "non-div local" representation, I noticed that the number of disjuncts can become much larger than if the local is eliminated (e.g. 2x for each unused local). Furthermore, I will sporadically run into an unreachable here https://github.com/llvm/llvm-project/blob/main/mlir/lib/Analysis/Presburger/Simplex.cpp#L1463. Will update this issue with a concise reproducer.

As a workaround, I added a routine to eliminate such locals whenever I construct an IntegerRelation. It doesn't look like any of the existing simplification routines in `IntegerRelation` know how to remove such locals.


   




[llvm-bugs] [Bug 122967] Request Commit Access For jadhbeika

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122967
Summary: Request Commit Access For jadhbeika
Labels: infra:commit-access-request
Assignees: (none)
Reporter: jadhbeika
  




### Why are you requesting commit access?
I would like to implement the new version of the `nowait` clause of OpenMP, which now takes an optional argument that is an expression of OpenMP logical type.





[llvm-bugs] [Bug 122970] [libc++] Provide an observe semantic in the hardening mode

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122970
Summary: [libc++] Provide an observe semantic in the hardening mode
Labels: libc++, hardening
Assignees: (none)
Reporter: ldionne
  




Many people would benefit from having a way to turn on hardening without crashing their application when they violate a precondition. Instead, they would want to log the failure and continue. That way, they can enable hardening and gradually fix issues that come up in production, and eventually flip the switch without endangering their stability in production.

This is akin to the observe semantic in Contracts, so we need something like that eventually anyway.
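
To make the request concrete, here is a rough sketch of what "log and continue" could look like next to today's "log and terminate". Everything in it is hypothetical: `CheckSemantic`, `check_precondition`, `violation_count` and `element_at` are illustrative names, not libc++ APIs, and a real design would have to decide what happens after an observed violation. This sketch simply returns a fallback value instead of performing the invalid access.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

enum class CheckSemantic { Enforce, Observe };

// Number of precondition violations logged so far (for monitoring).
inline std::size_t &violation_count() {
    static std::size_t count = 0;
    return count;
}

// Enforce: log and terminate. Observe: log, count, and let the caller go on.
inline void check_precondition(bool ok, const char *what, CheckSemantic semantic) {
    if (ok)
        return;
    std::fprintf(stderr, "hardening: precondition violated: %s\n", what);
    ++violation_count();
    if (semantic == CheckSemantic::Enforce)
        std::abort();
}

// Bounds-checked element access built on the check above.
inline int element_at(const int *data, std::size_t size, std::size_t index,
                      CheckSemantic semantic) {
    check_precondition(index < size, "index < size", semantic);
    // Sketch-only choice: an observed violation yields 0 rather than reading
    // out of bounds.
    return index < size ? data[index] : 0;
}
```

With `CheckSemantic::Observe`, an out-of-range call such as `element_at(d, 3, 7, ...)` logs one message, increments the counter and returns, which is the behaviour that would let hardening be rolled out gradually.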




[llvm-bugs] [Bug 122996] [MLIR][LLVM] Incorrect #llvm.constant_range import

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122996
Summary: [MLIR][LLVM] Incorrect #llvm.constant_range import
Labels: mlir
Assignees: (none)
Reporter: Kuree
  




See godbolt: https://godbolt.org/z/x34YarsrE

`range(i32 0, -2147483648)` is imported as `#llvm.constant_range`, but `-2147483648` is internally stored as a 40-bit integer, causing the verifier to fail: https://github.com/llvm/llvm-project/blob/ebef44067bd0a2cd776b8baea39cffa7f602ce7b/mlir/lib/Dialect/LLVMIR/IR/LLVMAttrs.cpp#L290-L293
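
As a reminder of what this particular range means at 32 bits: LLVM ranges are half-open and may wrap, so `[0, -2147483648)` covers exactly the non-negative `i32` values, and the upper bound itself fits in 32 bits. The helper below is a hypothetical sketch of that membership test, not the MLIR or LLVM implementation; in particular it treats `lo == hi` as empty, whereas LLVM has dedicated full and empty encodings.

```cpp
#include <cstdint>

// Membership test for the half-open range [lo, hi) over 32-bit values,
// evaluated modulo 2^32 so that the range may wrap around.
constexpr bool in_wrapped_range(std::int32_t lo, std::int32_t hi, std::int32_t v) {
    const auto ulo = static_cast<std::uint32_t>(lo);
    const auto uhi = static_cast<std::uint32_t>(hi);
    const auto uv = static_cast<std::uint32_t>(v);
    return (uv - ulo) < (uhi - ulo);
}
```

For `lo = 0` and `hi = INT32_MIN`, the test accepts `0` and `INT32_MAX` and rejects `-1` and `INT32_MIN`.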




[llvm-bugs] [Bug 122974] [llvm-exegesis][RISCV] computeAliasingInstructions in SerialSnipperGenerate generates instructions that can't be assembled

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122974
Summary: [llvm-exegesis][RISCV] computeAliasingInstructions in SerialSnipperGenerate generates instructions that can't be assembled
Labels: backend:RISC-V, tools:llvm-exegesis
Assignees: (none)
Reporter: topperc
  




I tried to run through all RISC-V opcodes available on my SiFive P550 system using -opcode-index=-1. I got some crashes trying to assemble pseudo instructions.

Should llvm-exegesis be filtering pseudos and custom insertion instructions in this function?

CC: @boomanaiden154 @mshockwave 




[llvm-bugs] [Bug 122978] [libc] Make malloc resistant to overflow

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122978
Summary: [libc] Make malloc resistant to overflow
Labels: libc
Assignees: mysterymath
Reporter: mysterymath
  




The malloc implementation in libc has been only sporadically careful to prevent overflow, not systematically so. It should be the case that no value provided to any surface area of the allocator (the allocation functions, `_end`, and `__llvm_libc_heap_limit`) can cause it to produce erroneous behavior due to overflow. Tests should be added for the various possible overflow corner cases, checks added to secure against this possibility, and any spurious checks removed.
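
As an illustration of the kind of check meant here, the sketch below computes a block size from a request without letting any intermediate value wrap. The helper names and the header/alignment parameters are hypothetical and are not taken from the libc allocator.

```cpp
#include <cstddef>
#include <limits>
#include <optional>

// a + b, or nullopt if the sum does not fit in size_t.
constexpr std::optional<std::size_t> checked_add(std::size_t a, std::size_t b) {
    if (a > std::numeric_limits<std::size_t>::max() - b)
        return std::nullopt;
    return a + b;
}

// Rounds `size` up to a multiple of `align`, which must be a nonzero power
// of two. Returns nullopt if the rounded value would overflow.
constexpr std::optional<std::size_t> checked_align_up(std::size_t size,
                                                      std::size_t align) {
    const auto padded = checked_add(size, align - 1);
    if (!padded)
        return std::nullopt;
    return *padded & ~(align - 1);
}

// Total block size for a request: header plus payload, rounded up to `align`.
constexpr std::optional<std::size_t> checked_block_size(std::size_t request,
                                                        std::size_t header,
                                                        std::size_t align) {
    const auto total = checked_add(request, header);
    if (!total)
        return std::nullopt;
    return checked_align_up(*total, align);
}
```

A caller would treat an empty result as an allocation failure. For instance, `checked_block_size(100, 16, 16)` yields 128, while a request of `SIZE_MAX` yields no value instead of a small wrapped size.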




[llvm-bugs] [Bug 123006] [bolt] Report error on .so instrumentation: unsupported CFI opcode

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 123006
Summary: [bolt] Report error on .so instrumentation: unsupported CFI opcode
Labels: BOLT
Assignees: (none)
Reporter: whousemyname
  




I want to use llvm-bolt to instrument a .so file for the Android platform, but it reported the following error:
`
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: 446a426436c0b7e457992981d3a1f2b4fda19992
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1c0, offset 0x1c0
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
unsupported CFI opcode
UNREACHABLE executed at D:\github-projects\llvm-project\bolt\lib\Core\BinaryFunction.cpp:2591!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Exception Code: 0x8003
unsupported CFI opcode
UNREACHABLE executed at D:\github-projects\llvm-project\bolt\lib\Core\BinaryFunction.cpp:2591!
`
Command executed: `\llvm-bolt.exe .\libzzz-debug.so -instrument -o .\libzzz-debug.so.instrumented`
What causes this error? Does it mean that BOLT cannot instrument .so files, or that my .so file was not built in a way that meets the requirements for instrumentation?





[llvm-bugs] [Bug 122971] wrong management of parameter substitution in a concept-id

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue: 122971
Summary: wrong management of parameter substitution in a concept-id
Labels: new issue
Assignees: (none)
Reporter: mrussoLuxoft
  




In the code reported below (https://godbolt.org/z/TYKhzqhEf) there are three pairs of overloaded function templates, with different constraints. In the func1 and func2 pairs, the first overload is designed to work with std::set and the second with std::map.

However, func1{}> leads to an error for Clang and gcc, because lambda expressions are not in the immediate context.

This is consistent with the following standard text (current draft):

[expr.prim.req.general] - p5:
"... can result in the formation of invalid types or expressions in the immediate context of its requirements ... In such cases, the requires-expression evaluates to false; it does not cause the program to be ill-formed."

[temp.constr.atomic] - p3:
"To determine if an atomic constraint is satisfied, the parameter mapping and template arguments are first substituted into its expression. If substitution results in an invalid type or expression in the immediate context of the atomic constraint, the constraint is not satisfied. ... "

[temp.deduct.general] - p9:
"When substituting into a lambda-expression, substitution into its body is not in the immediate context. ..."


However, for func1b{}>, gcc and Clang behave differently. Clang still diagnoses the error in the lambda expression body, whereas gcc ignores it because this time the concept Cb does not use its parameters. gcc relies on the following standard text:

[temp.constr.normal] - p(1.4):
"The normal form of a concept-id C is the normal form of the constraint-expression of C, after substituting A1 , A2 , ..., An for C’s respective template parameters in the parameter mappings in each atomic constraint. ..."

which means that the substitution leads to no error if a parameter is not used.


I initially supposed this was an error in gcc, and posted a potential bug for gcc [here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118398), but that discussion led to the conclusion that the problem is in Clang.
The gcc developers seemed to remember this being a known Clang problem, but I searched with the keywords 'concept', 'parameter', and others and could not find it. I hope I am not filing a duplicate.


```
#include 
#include 
#include 

template
concept C = requires(Container c, KeyExtractor&& keyExtractor){
c.lower_bound(keyExtractor(c.begin()));
};

template
concept Cb = true;

template
requires C
void func1([[maybe_unused]] const Container& c){
std::cout << "func1 first overload\n";
}

template
requires Cfirst;})>
void func1([[maybe_unused]] const Container& c){
std::cout << "func1 second overload\n";
}

template
requires Cb
void func1b([[maybe_unused]] const Container& c){
std::cout << "func1b first overload\n";
}

template
requires Cbfirst;})>
void func1b([[maybe_unused]] const Container& c){
std::cout << "func1b second overload\n";
}

template
requires requires(Container c){*c.begin();}
&& C
void func2([[maybe_unused]] const Container& c)
{
std::cout << "func2 first overload\n";
}

template
requires requires(Container c){c.begin()->first;}
&& Cfirst;})>
void func2([[maybe_unused]] const Container& c)
{
std::cout << "func2 second overload\n";
}

int main(){
func1(std::set{}); // - gcc and clang treat the lambda as a non-immediate
//   context, so this is a compilation error.
// - MSVC rejects the second overload and selects the
//   first one, that is, it treats the invalid
//   "it->first" code in the lambda expression as a
//   failed constraint.
func1(std::map{}); // all compilers correctly select second overload.
// This time, no reverse problem about "*it",
// and then concept C fails for set iterators.

func1b(std::set{}); // - gcc considers both overloads eligible,
//   ignoring the "it->first" part of the second overload,
//   because it is not used in the concept definition.
// - clang behaves the same way as for func1.
// - MSVC rejects the second overload, exactly as
//   for func1.
//func1b(std::map{}); // all compilers correctly consider ambiguous overloads.

func2(std::set{}); // all compilers select the first overload (i.e., gcc and clang
// behave differently regarding lambda type resolution, because
// the result of the normal form for the constraints, is
```
   

[llvm-bugs] [Bug 122948] clang-format doesn't handle the spaceship "<=>" symbol correctly

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

122948




Summary

clang-format doesn't handle the spaceship "<=>" symbol correctly




  Labels
  
clang-format
  



  Assignees
  
  



  Reporter
  
  chriskot870
  




I was referred here to post a potential bug in the clang-format application.

I am running Ubuntu 24.04. The version of clang-format is:

```
$ clang-format --version
Ubuntu clang-format version 18.1.3 (1ubuntu1)
```

I use this file:
```
#include <compare>
#include <cstdio>

int main() {
  int x = 5;
  int y = 6;

  if ((x <=> y) != 0) {
    printf("x is not equal too y\n");
  }
}
```

I can compile and run it.

```
$ g++ -std=c++23 spaceship_test.cpp
$ ./a.out
x is not equal too y
```

I then run clang-format and try to compile:
```
$ clang-format -i spaceship_test.cpp
$ g++ -std=c++23 spaceship_test.cpp
spaceship_test.cpp: In function ‘int main()’:
spaceship_test.cpp:11:13: error: expected primary-expression before ‘>’ token
   11 |   if ((x <= > y) != 0) {
  | ^
```

Indeed, in the file the "<=>" has been changed to "<= >". Notice the space after the "=".

```
#include <compare>
#include <cstdio>

int main() {
  int x = 5;
  int y = 6;

  if ((x <= > y) != 0) {
    printf("x is not equal too y\n");
  }
}
```

Here is the clang-format file I am using: [clang-format.txt](https://github.com/user-attachments/files/18414723/clang-format.txt). I had to add a .txt extension for the upload tool to accept the .clang-format file.


I assume this is some sort of bug in clang-format.

Thanks
Chris




[llvm-bugs] [Bug 122954] [TySan] False positive with global structs

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

122954




Summary

[TySan] False positive with global structs




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  TheLastRar
  




This was previously reported on Discourse. Since TySan has now been merged, I'm reposting it on GitHub.

This is the reduced version from [gbMattN](https://discourse.llvm.org/t/reviving-typesanitizer-a-sanitizer-to-catch-type-based-aliasing-violations/66092/22); the original report used a `std::array`.

TySan reports a type violation when `arr` is in the global scope: https://godbolt.org/z/TxeaaY5cc

```C
#include 

struct array_type{
  int inner[1];
};

struct array_type arr;

int main() {
  arr.inner[0] = 5;
  return 0;
}
```

TySan currently reports:
```
==1==ERROR: TypeSanitizer: type-aliasing-violation on address 0x5d14c0846c5c (pc 0x5d14bfeefb4b bp 0x7ffd3e06b7a0 sp 0x7ffd3e06b730 tid 1)
WRITE of size 4 at 0x5d14c0846c5c with type int accesses an existing object of type array_type
#0 0x5d14bfeefb4a (/app/output.s+0x2ab4a)
```





[llvm-bugs] [Bug 123001] Request Commit Access For sebpop

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

123001




Summary

Request Commit Access For sebpop




  Labels
  
infra:commit-access-request
  



  Assignees
  
  



  Reporter
  
  sebpop
  




I am requesting commit access to be able to merge my own approved patches: https://github.com/llvm/llvm-project/pull/116628 https://github.com/llvm/llvm-project/pull/116631

I am fixing the data dependence analysis and working on the loop optimizers.




[llvm-bugs] [Bug 123021] Failed to eliminate unreachable code

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

123021




Summary

Failed to eliminate unreachable code




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  CrazyboyQCD
  




C: https://godbolt.org/z/8Y5eTreoE
Rust: https://godbolt.org/z/vqv7MsnPK
Under condition `x <= y` there should be no codegen for the `else` branch.
```c
void f1(float x, float y, float* z) {
    if (x <= y) return;
    if (x > y)
        *z = 0.0;
    else
        *z = x - y;
}
```
```c
void f2(float x, float y, float* z) {
    if (x > y) *z = 0.0;
}
```
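
When comparing `f1` and `f2`, NaN inputs may be worth checking. Under IEEE 754 comparison semantics, both `x <= y` and `x > y` are false when either operand is NaN, so the `else` branch is reachable for such inputs unless the code is built with something like `-ffast-math`. The sketch below assumes default floating-point semantics; `same_result`, `check_examples`, and the sentinel value are hypothetical helpers that are not part of the report.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// f1 and f2 as in the report.
void f1(float x, float y, float* z) {
    if (x <= y) return;
    if (x > y)
        *z = 0.0f;
    else
        *z = x - y;  // reached only when neither x <= y nor x > y holds
}

void f2(float x, float y, float* z) {
    if (x > y) *z = 0.0f;
}

// Hypothetical helper: true when f1 and f2 leave *z in the same state
// for the given inputs, starting from the same sentinel value.
bool same_result(float x, float y) {
    const float sentinel = 42.0f;
    float z1 = sentinel, z2 = sentinel;
    f1(x, y, &z1);
    f2(x, y, &z2);
    // NaN never compares equal to itself, so handle it explicitly.
    if (std::isnan(z1) || std::isnan(z2))
        return std::isnan(z1) && std::isnan(z2);
    return z1 == z2;
}

// Illustrative checks; call from main() to run them.
void check_examples() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(same_result(1.0f, 2.0f));   // early return in f1, no store in f2
    assert(same_result(2.0f, 1.0f));   // both store 0
    assert(!same_result(nan, 1.0f));   // f1 stores x - y, f2 stores nothing
}
```

If the two functions are meant to be equivalent, the NaN case is the one input class where they differ under these assumptions.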




[llvm-bugs] [Bug 122985] [clang-tidy] Check request: detect saving stack addresses beyond their lifetime

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

122985




Summary

[clang-tidy] Check request: detect saving stack addresses beyond their lifetime




  Labels
  
clang-tidy
  



  Assignees
  
  



  Reporter
  
  asund
  




This seems to be missed by the existing stack-address checks because the address doesn't escape the scope of the function directly; instead it is preserved between calls in a static variable.
```
auto f() {
  process stack_array[] = { method1, method2, method3 };
  static process *process_to_use = nullptr;

  if (!process_to_use) {
    // some expensive init later...
    process_to_use = &stack_array[n];  // stack address saved in a static
  }

  if (process_to_use) {
    process_to_use->do_processing();  // segfault on later calls
  }
}
```
`process_to_use` holds a stale address when the function is called again. `stack_array` needs to have static lifetime in this case.
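
A minimal sketch of that fix, assuming a hypothetical `process` type and table contents (neither is defined in the report): giving the array static storage duration keeps the saved address valid across calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the report's `process` type.
struct process {
    int id;
    int do_processing() const { return id; }
};

// Returns the id of the cached entry; `n` selects the entry on the first call only.
int f(std::size_t n) {
    // Static storage duration: the array outlives every call to f().
    static process table[] = { {1}, {2}, {3} };
    static process* process_to_use = nullptr;

    if (!process_to_use) {
        assert(n < sizeof(table) / sizeof(table[0]));
        process_to_use = &table[n];  // safe: points into a static object
    }
    return process_to_use->do_processing();
}
```

With this version, calling `f(1)` and then `f(0)` returns 2 both times, since the pointer cached on the first call remains valid.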




[llvm-bugs] [Bug 122959] Clang failed 100 tests on llvmorg-19.1.7

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

122959




Summary

Clang failed 100 tests on llvmorg-19.1.7




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  leecommamichael
  




**The Windows binaries do not contain a target for wasm, so I decided to build it myself.** This is a fresh install of Windows 11 using Visual Studio 2022. I compiled with CMake and Ninja as advised on https://clang.llvm.org/get_started.html

I ran `cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang ..\llvm`
then `ninja check` as advised on the "get started" page.

```
Testing Time: 893.79s

Total Discovered Tests: 98695
  Skipped          :    32 (0.03%)
  Unsupported      :  2898 (2.94%)
  Passed           : 95462 (96.72%)
  Expectedly Failed:   203 (0.21%)
  Failed           :   100 (0.10%)
```

This is on tag `llvmorg-19.1.7` commit `cd708029e0b2869e80abe31ddb175f7c35361f90`

Is this to be expected?




[llvm-bugs] [Bug 123023] Sanitizer test regressions with CLANG_CONFIG_FILE_SYSTEM_DIR building with GCC after #60394 fix

2025-01-14 Thread LLVM Bugs via llvm-bugs


Issue

123023




Summary

Sanitizer test regressions with CLANG_CONFIG_FILE_SYSTEM_DIR building with GCC after #60394 fix




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  xry111
  




The fix for #60394 sets `CLANG_NO_DEFAULT_CONFIG=1` only when the build compiler supports `--no-default-config`, but when the LLVM suite is built with GCC, the build compiler obviously does not support this option. Thus, when clang is configured with `CLANG_CONFIG_FILE_SYSTEM_DIR` and the config file contains compiler options, some tests start to fail:

```
Failed Tests (2):
  DataFlowSanitizer-x86_64 :: custom.cpp
  DataFlowSanitizer-x86_64 :: origin_unaligned_memtrans.c
```

