[llvm-bugs] [Bug 136277] [OpenMP][Flang] ‘Target update’ not working

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136277




Summary

[OpenMP][Flang] ‘Target update’ not working




  Labels
  
flang
  



  Assignees
  
  



  Reporter
  
  Yan-44
  




Hi,
The simple program below reproduces one of the ‘target update’ issues occurring in a large Fortran code compiled with rocm-6.4.0/flang. It shows that ‘target update’ fails to update variable ‘val’. Why?
```
$ cat test39_target_update.f90

PROGRAM TARGET_UPDATE 
 use omp_lib
 implicit none
 integer, parameter :: Nmax=250
 integer :: i, nteams, nthreads
 logical :: initial_device, is_val_assigned_on_device
 double precision :: val
 double precision, dimension(:),allocatable :: X
 double precision,dimension(Nmax) :: expected
 
 !allocating and initialising
 allocate(X(Nmax))
 initial_device=.true.
 is_val_assigned_on_device=.false.
 do i = 1, Nmax
  expected(i) = -2._8
  X(i)=-2._8
 end do
 
 !$omp target data map(alloc:val) map(to:X)
 !checking correct assignment of 'X' on host for j=1058000 !randomly selected j value
 write(*,'(A,F6.1,A,F6.1)')"On host : X(1058000) = ",X(1058000),", expected X(1058000) = ",expected(1058000)
 !
 !$omp target map(tofrom:initial_device,is_val_assigned_on_device) map(from:nteams,nthreads)
 !Assigning 'val' on the device
 val=X(1058000)
 if (val==-2._8) is_val_assigned_on_device=.true.
 !$OMP teams distribute parallel do
 do i = 1, Nmax
   initial_device = omp_is_initial_device()
   X(i)=X(i)-val !hence, X(i) should be '0' if 'val' was correctly assigned
   !$ if (i==1) then
   !$  nteams= omp_get_num_teams()
   !$  nthreads= omp_get_num_threads()
   !$ end if
 end do
 !$omp end teams distribute parallel do
 !$OMP END TARGET

 if (initial_device) then
write(*,*)"running on host"
write(*,'(A,F6.1)')"val = ",val
 else
write(*,'(A,I3,A,I3,A)')"running on device with ",nteams," teams and ",nthreads," threads"
write(*,'(A,L1)')"is_val_assigned_on_device = ",is_val_assigned_on_device
write(*,*)"-"
write(*,'(A)')"before update:"
write(*,'(A,E13.2,A)')" val = ",val,", expected val = undetermined"
write(*,'(A,F6.1,A,F6.1)')"X(1058000) =",X(1058000),",expected X(1058000) = ",expected(1058000)
write(*,*)"---"
!$OMP target update from(val)
write(*,'(A,F6.1,A,F6.1)')"after update, val = ",val,", expected val = ",-2._8
!$OMP target update from(X)
write(*,'(A,F6.1,A,F6.1)')"X(1058000) =",X(1058000),", expected X(1058000) = ",0._8
 end if
 !$omp end target data
deallocate(X)
END PROGRAM
```
```
$ ./test39_target_update 
On host : X(1058000) =   -2.0, expected X(1058000) =   -2.0
running on device with 360 teams and 256 threads
is_val_assigned_on_device = T
 -
before update:
 val =  0.00E+00, expected val = undetermined
X(1058000) =  -2.0,expected X(1058000) =   -2.0
 ---
after update, val =0.0, expected val =   -2.0 <- 'TARGET UPDATE' failed
X(1058000) =   0.0, expected X(1058000) =0.0
```


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 136294] OpenMP array shaping failing test case

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136294




Summary

OpenMP array shaping failing test case




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  ravurvi20
  




The test case fails to compile due to unsupported array shaping operations in #pragma omp target update.

Testcase:

```
#pragma omp begin declare target
int do_work(double *a, int nx, int ny);
int other_work(double *a, int nx, int ny);
#pragma omp end declare target
void exch_data(double *a, int nx, int ny);
void array_ing(double *a, int nx, int ny)
{
 // map data to device and do work
 #pragma omp target data map(a[0:nx*(ny+2)])
 {
 // do work on the device
 #pragma omp target // map(a[0:nx*(ny+2)]) is optional here

 do_work(a, nx, ny);

 // update boundary points (two column 2D array) on the host
 // pointer is shaped to 2D array using the shape-operator
 #pragma omp target update from( (([nx][ny+2])a)[0:nx][1], \
 (([nx][ny+2])a)[0:nx][ny] )

 // exchange ghost points with neighbors 
 exch_data(a, nx, ny);

 // update ghost points (two column 2D array) on the device
 // pointer is shaped to 2D array using the shape-operator
 #pragma omp target update to( (([nx][ny+2])a)[0:nx][0], \
 (([nx][ny+2])a)[0:nx][ny+1] )

 // perform other work on the device
 #pragma omp target // map(a[0:nx*(ny+2)]) is optional here
 other_work(a, nx, ny);
 }
 }
```

```
array.c:18:34: error: OpenMP array shaping operation is not allowed here
   18 |  #pragma omp target update from( (([nx][ny+2])a)[0:nx][1], \
  | ^
array.c:19:34: error: OpenMP array shaping operation is not allowed here
   19 | (([nx][ny+2])a)[0:nx][ny] )
  | ^
array.c:18:2: error: expected at least one 'to' clause or 'from' clause specified to '#pragma omp target update'
   18 |  #pragma omp target update from( (([nx][ny+2])a)[0:nx][1], \
  |  ^
array.c:26:32: error: OpenMP array shaping operation is not allowed here
   26 |  #pragma omp target update to( (([nx][ny+2])a)[0:nx][0], \
  | ^
array.c:27:32: error: OpenMP array shaping operation is not allowed here
   27 |(([nx][ny+2])a)[0:nx][ny+1] )
 |^
array.c:26:2: error: expected at least one 'to' clause or 'from' clause specified to '#pragma omp target update'
   26 |  #pragma omp target update to( (([nx][ny+2])a)[0:nx][0], \
  |  ^
6 errors generated.
```
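
For readers unfamiliar with the shape-operator, the following C++ sketch spells out the indexing that the comments in the test case describe: `(([nx][ny+2])a)[i][j]` views the flat buffer `a` as `nx` rows of `ny+2` elements, so element `[i][j]` sits at `a[i*(ny+2) + j]`. The helper names `flat_index` and `column_indices` are hypothetical and are not part of OpenMP or of the test case.

```cpp
#include <cstddef>
#include <vector>

// Flat offset of element [i][j] when a buffer is viewed as rows of (ny + 2) elements.
// Hypothetical helper for illustration only.
std::size_t flat_index(std::size_t i, std::size_t j, std::size_t ny) {
  return i * (ny + 2) + j;
}

// Flat offsets touched by the section [0:nx][col], i.e. one column of the 2D view.
std::vector<std::size_t> column_indices(std::size_t nx, std::size_t ny, std::size_t col) {
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < nx; ++i)
    out.push_back(flat_index(i, col, ny));
  return out;
}
```

With `nx = 3` and `ny = 2`, column 1 covers flat offsets 1, 5 and 9. The section is strided, so it cannot be written as a single contiguous `a[lo:len]` array section.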

Environment Info:
OpenMP version - 5.0/5.1/5.2
Compiler - clang version 21.0.0
[Godbolt](https://godbolt.org/z/5MKcKMePb)




[llvm-bugs] [Bug 136275] Trivial comparison of pointer defeats store->load forwarding

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136275




Summary

Trivial comparison of pointer defeats store->load forwarding




  Labels
  
llvm:analysis
  



  Assignees
  
  



  Reporter
  
  mhjacobson
  




LLVM version 20.1.1
target: x86_64-apple-darwin20.6.0

Given the following IR:

```llvm
declare ptr @calloc(i64, i64)
declare void @free(ptr)
declare void @bar(ptr noalias nocapture readonly %0, i1 %info)

define void @foo(i32 %argc, ptr nocapture readnone %argv) local_unnamed_addr {
entry:
  %0 = tail call ptr @calloc(i64 1, i64 64)
 %1 = icmp ult ptr %0, inttoptr (i64 u0x to ptr)
  store i64 0, ptr %0, align 4

  ; NOTE: the %1 here causes %0 to be considered captured and therefore potentially aliased.
  ; So the load and branch below the call can't be optimized away.
  tail call void @bar(ptr %0, i1 %1)
  %2 = load i64, ptr %0, align 4
  %3 = add nsw i64 %2, -1
  store i64 %3, ptr %0, align 4
  %4 = icmp eq i64 %2, 0
  br i1 %4, label %free_it, label %end

free_it:
  tail call void @free(ptr %0)
  ret void

end:
  ret void
}
```

I would expect the load at `%2`, the add at `%3`, the icmp at `%4`, and the branch to be optimized away.  But they aren't, because `%0` is considered captured by `llvm::DetermineUseCaptureKind()`:

```c++
  case Instruction::ICmp: {
    unsigned Idx = U.getOperandNo();
    unsigned OtherIdx = 1 - Idx;
    if (auto *CPN = dyn_cast<ConstantPointerNull>(I->getOperand(OtherIdx))) { ... }

    // Otherwise, be conservative. There are crazy ways to capture pointers
    // using comparisons.
    return UseCaptureKind::MAY_CAPTURE;
```

And therefore `%0` is assumed potentially aliased.

Even passing `%1` to `@llvm.assume` (which is how I initially hit this condition) prevents the optimization.




[llvm-bugs] [Bug 136298] Copying of std::vector filled with neither copy constructible nor copy assignable elements

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136298




Summary

Copying of std::vector filled with neither copy constructible nor copy assignable elements




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  Fedr
  




This program
```c++
#include <concepts>
#include <vector>

struct A {
    A() {}
    A(const A&) = delete;
    A(auto &&) {}
    A & operator=(const A&) = delete;
};
static_assert(!std::copy_constructible<A>);

const std::vector<A> v(3);
// auto w{ v }; // fails both in libstdc++ and in libc++
auto w( v ); // ok in libc++
is rejected with `libstdc++`, but accepted with `libc++`, which looks incorrect.

Online demo: https://gcc.godbolt.org/z/v1MdbzMT1
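
As a side note, the sketch below shows which constructions of `A` the language itself permits. It uses a spelled-out template constructor in place of `A(auto &&)` so that it also compiles as C++17; the two forms are assumed to be equivalent for this purpose. It does not claim to explain which code path either standard library takes when copying the vector.

```cpp
#include <type_traits>

struct A {
  A() {}
  A(const A&) = delete;
  template <class T> A(T&&) {}  // spelled-out form of A(auto&&)
  A& operator=(const A&) = delete;
};

// A const lvalue is an equally good match for both constructors; the non-template
// one wins the tie and it is deleted, so this construction is rejected.
static_assert(!std::is_constructible_v<A, const A&>);
// A non-const lvalue binds better to the template constructor, so this is allowed.
static_assert(std::is_constructible_v<A, A&>);
// An rvalue also goes through the template constructor.
static_assert(std::is_constructible_v<A, A&&>);
```

In other words, whether an element can be "copied" depends on whether the source is accessed as `const A&` or as `A&`.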




[llvm-bugs] [Bug 136276] [BOLT] Inaccurate profile data check

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136276




Summary

[BOLT] Inaccurate profile data check




  Labels
  
BOLT
  



  Assignees
  
  



  Reporter
  
  WangJee
  




I was trying to use llvm-bolt to optimize MySQL on RISCV64 and got the following warnings:
BOLT-WARNING: 4411 (38.5% of all profiled) functions have invalid (possibly stale) profile. Use -report-stale to see the list.
BOLT-WARNING: 2811567572 out of 5787240884 samples in the binary (48.6%) belong to functions with invalid (possibly stale) profile.

The profile data:
1 _ZL18fsp_fill_free_listbP11fil_space_tPhP5mtr_t/1  326  1 _Z23fsp_is_system_temporaryj 0 0 11

The basic block:
.Ltmp531:
0322:   lw      a0, 0x18(s4)
0326:   auipc   ra, _Z23fsp_is_system_temporaryj
032a:   jalr    -0x160(ra) # handler: 0; action: 0 # Offset: 810
032e:   bnez    a0, .Ltmp530 # Offset: 814

The error is caused by the fact that when acquiring the profile data, the instruction at offset 0x326 is PseudoCALL, but when performing profile verification, PseudoCALL is converted to AUIPC and jalr instructions, and the offset obtained is 0x32a; therefore, the profile data is considered invalid.
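
The offset arithmetic behind this mismatch can be sketched as follows. The function names are hypothetical and are not BOLT APIs; the only assumption is that AUIPC and JALR are each 4 bytes wide, as in the listing above.

```cpp
#include <cstdint>

// Size in bytes of one uncompressed RISC-V instruction (AUIPC and JALR are both 4 bytes).
constexpr std::uint64_t kInsnSize = 4;

// Offset seen while the call is still a single PseudoCALL: the start of the pseudo.
constexpr std::uint64_t pseudo_call_offset(std::uint64_t start) { return start; }

// Offset of the JALR once the pseudo has been expanded to AUIPC + JALR.
constexpr std::uint64_t expanded_jalr_offset(std::uint64_t start) {
  return start + kInsnSize;
}

static_assert(pseudo_call_offset(0x326) == 0x326);    // offset in the profile data
static_assert(expanded_jalr_offset(0x326) == 0x32a);  // 810, the "# Offset" on the jalr
```

A lookup keyed on 0x32a therefore never finds the profile entry recorded at 0x326.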






[llvm-bugs] [Bug 136292] analyzer: Thinks result of `__builtin_mul_overflow` can be uninitialized

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136292




Summary

analyzer: Thinks result of `__builtin_mul_overflow` can be uninitialized




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  Zentrik
  




```c++
#include <cstddef>

int test(size_t nel, size_t elsz) {
  size_t nbytes;
  int overflow = __builtin_mul_overflow(nel, elsz, &nbytes);
  int overflow2 = __builtin_add_overflow(nel, nbytes, &nbytes);
  return overflow * overflow2;
}
```

```
> clang++ --analyze -Xclang -analyzer-output=text -std=c++20
clang++: warning: argument unused during compilation: '-S' [-Wunused-command-line-argument]
:6:19: warning: 2nd function call argument is an uninitialized value [core.CallAndMessage]
6 |   int overflow2 = __builtin_add_overflow(nel, nbytes, &nbytes);
  | ^   ~~
:4:3: note: 'nbytes' declared without an initial value
4 |   size_t nbytes;
  | ^
:5:18: note: Assuming overflow
5 |   int overflow = __builtin_mul_overflow(nel, elsz, &nbytes);
  | ^~
:6:19: note: 2nd function call argument is an uninitialized value
6 |   int overflow2 = __builtin_add_overflow(nel, nbytes, &nbytes);
  |   ^ ~~
1 warning generated.
Compiler returned: 0
```

Godbolt: https://godbolt.org/z/4q4Efcax8
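
For context, the GCC/Clang overflow builtins store the wrapped result through the output pointer whether or not the operation overflows. The sketch below demonstrates this; `mul_result` is a hypothetical wrapper used only for illustration.

```cpp
#include <cstddef>
#include <limits>

// Returns the value stored through the result pointer; `overflowed` receives the
// builtin's return value. Hypothetical helper for illustration only.
std::size_t mul_result(std::size_t a, std::size_t b, bool& overflowed) {
  std::size_t out;  // deliberately left without an initializer, as in the report
  overflowed = __builtin_mul_overflow(a, b, &out);
  return out;  // written by the builtin in both the overflow and no-overflow case
}
```

For example, multiplying the maximum `size_t` value by 2 reports overflow and still stores the product reduced modulo 2^N, which is the maximum value minus 1.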




[llvm-bugs] [Bug 136315] [LLVM] llvm.fptosi.sat.* and llvm.fptoui.sat.* generate suboptimal code on PowerPC targets

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136315




Summary

[LLVM] llvm.fptosi.sat.* and llvm.fptoui.sat.* generate suboptimal code on PowerPC targets




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  johnplatts
  




The @llvm.fptosi.sat.* and @llvm.fptoui.sat.* intrinsics generate suboptimal code on PowerPC targets as demonstrated in a snippet over at https://godbolt.org/z/r6ajjTM3v.

Equivalent, more optimal alternatives on 64-bit PowerPC, POWER8, and POWER9 can be found at https://godbolt.org/z/4rezzhGna.




[llvm-bugs] [Bug 136305] [clang] Show warnings for Objective-C pointers/blocks with an `assign` attribute

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136305




Summary

[clang] Show warnings for Objective-C pointers/blocks with an `assign` attribute




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  d0iasm
  




We would like to propose a warning for properties that apply the `assign` attribute to an Objective-C object pointer or block when ARC is enabled with `-fobjc-arc`. Clang treats an `assign` property as `__unsafe_unretained`, which a developer may not expect.

The fact that `assign` corresponds to `__unsafe_unretained` is clearly documented in [property declarations](https://clang.llvm.org/docs/AutomaticReferenceCounting.html#property-declarations). Even so, it can be surprising in a code base written in Objective-C++ (such as Chromium): `assign` is the only qualifier available for pointers to non-reference-counted objects, so its use on reference-counted objects may be accidental, especially when the code is modified by a developer less familiar with Objective-C.

Finally, this warning would not prevent declaring properties that keep a non-zeroing weak pointer: `unsafe_unretained` can still be used to declare properties that are `__unsafe_unretained`, and it is clearer than the overloaded `assign` attribute.


Consider the following code:

```
@interface Bar : NSObject
  @property(nonatomic, assign) NSObject* foo1;
  @property(nonatomic, strong) NSObject* foo2;
@end

@implementation Bar
@end
```

It generates the following assembly with the command `$ clang -Wall -Wextra -fobjc-arc -o foo.s -S foo.m`:

```
"-[Bar setFoo1:]":  ## @"\01-[Bar setFoo1:]"
        .cfi_startproc
## %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        movq    %rdi, -8(%rbp)
        movq    %rsi, -16(%rbp)
        movq    %rdx, -24(%rbp)
        movq    -24(%rbp), %rcx
        movq    -8(%rbp), %rax
        movq    %rcx, 8(%rax)
        popq    %rbp
        retq
        .cfi_endproc

"-[Bar setFoo2:]":  ## @"\01-[Bar setFoo2:]"
        .cfi_startproc
## %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        subq    $32, %rsp
        movq    %rdi, -8(%rbp)
        movq    %rsi, -16(%rbp)
        movq    %rdx, -24(%rbp)
        movq    -24(%rbp), %rsi
        movq    -8(%rbp), %rdi
        addq    $16, %rdi
        callq   _objc_storeStrong
        addq    $32, %rsp
        popq    %rbp
        retq
        .cfi_endproc
```




[llvm-bugs] [Bug 136377] [clang-tidy] bugprone-crtp-constructor-accessibility for deleted constructor

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136377




Summary

[clang-tidy] bugprone-crtp-constructor-accessibility for deleted constructor




  Labels
  
clang-tidy
  



  Assignees
  
  



  Reporter
  
  DimitrijeDobrota
  




In this basic example of CRTP pattern:

```c++
template <typename D>
class base
{
  friend D;

  base() = default;
  base(const base&) = default;

  base(base&&) = delete;
};

class derived : public base<derived>
{
};
```

I get the following warning, which is expected and reasonable:
```
project_root/test.cpp:9:3: warning: deleted member function should be public [hicpp-use-equals-delete,modernize-use-equals-delete]
9 |   base(base&&) = delete;
  |   ^
```

while on the other hand, if I were to move the constructor to the public area: 

```c++
template <typename D>
class base
{
  friend D;

  base() = default;
  base(const base&) = default;

public:
  base(base&&) = delete;
};

class derived : public base<derived>
{
};
```

the following warning is not needed since the constructor has been deleted:

```
project_root/test.cpp:10:3: warning: public contructor allows the CRTP to be constructed as a regular template class; consider making it private [bugprone-crtp-constructor-accessibility]
 10 |   base(base&&) = delete;

```
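
The claim that a public deleted constructor does not open up the CRTP base can be checked with standard type traits. The sketch below is self-contained and uses the same class shapes as the second example; the `static_assert`s are additions for illustration and are not part of the original report.

```cpp
#include <type_traits>

template <typename D>
class base {
  friend D;

  base() = default;
  base(const base&) = default;

public:
  base(base&&) = delete;
};

class derived : public base<derived> {};

// The public move constructor is deleted, so it cannot construct a base<derived>.
static_assert(!std::is_move_constructible_v<base<derived>>);
// The remaining constructors are private, so unrelated code cannot use them either.
static_assert(!std::is_default_constructible_v<base<derived>>);
static_assert(!std::is_copy_constructible_v<base<derived>>);
// The befriended derived class is still constructible as intended.
static_assert(std::is_default_constructible_v<derived>);
```

No constructor of `base<derived>` is usable from outside the befriended class, which is the property the check is meant to enforce.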




[llvm-bugs] [Bug 136379] -Warray-bounds misses unsafe pointer arithmetic

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136379




Summary

-Warray-bounds misses unsafe pointer arithmetic




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  shuffle2
  




I would expect the following to issue a warning:
```c
#include <stdint.h>
#include <stdio.h>

void g(uint64_t a, uint64_t b) {
    printf("%lx %lx\n", a, b);
}

int main(int argc, char **argv) {
    uint8_t a;

    // one-past the end is valid (as long as not deref'd)
    g((uint64_t)&a, (uint64_t)(&a + 1));
    // >1 past end is invalid
    // XXX clang has -Warray-bounds, but it does not warn on the below.
    // clang's -Wunsafe-buffer-usage *does* warn on it, though.
    // -Wunsafe-buffer-usage doesn't seem usable in real world tho for C code. (lots of false positives).
    g((uint64_t)&a, (uint64_t)(&a + 2));

    return 0;
}
```

gcc detects this as I'd expect, clang does not: https://godbolt.org/z/WEYTzMGGb

It's unclear to me if -Wunsafe-buffer-usage is the expected solution here - this flag seems unhelpful for plain C code. https://clang.llvm.org/docs/SafeBuffers.html makes it sound like the flag is mainly for use in C++ code, to detect locations that should be converted to c++-specific code patterns.




[llvm-bugs] [Bug 136368] Inefficient codegen for `copysign(known_zero_sign_bit, x)`

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136368




Summary

Inefficient codegen for `copysign(known_zero_sign_bit, x)`




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  dzaima
  




The code:
```c
#include <math.h>

double native(double x) {
    double bound = fabs(x) > M_PI/2 ? M_PI/2 : 0;
    return copysign(bound, x);
}
```
via `-O3 -march=haswell` generates:
```asm
.LCPI0_1:
        .quad   0x3ff921fb54442d18
.LCPI0_2:
        .quad   0x7fff
native:
        vmovddup xmm1, qword ptr [rip + .LCPI0_2] ; 0x7fff
        vandpd  xmm2, xmm0, xmm1
        vmovsd  xmm3, qword ptr [rip + .LCPI0_1] ; PI/2
        vcmpltsd xmm2, xmm3, xmm2
        vandpd  xmm2, xmm2, xmm3 ; xmm2 == bound
        vandnpd xmm0, xmm1, xmm0
        vandpd  xmm1, xmm2, xmm1 ; unnecessary! could be just xmm2
        vorpd   xmm0, xmm1, xmm0
        ret
```
which has an extraneous `vandpd` masking out the sign bit of `bound`, even though that's always 0. Moreover, manually doing the more efficient bitwise arith still results in the suboptimal code.

The better assembly would be:
```asm
        vandpd  xmm1, xmm0, xmmword ptr [rip + .LCPI2_0] ; extract sign
        vandpd  xmm0, xmm0, xmmword ptr [rip + .LCPI2_1] ; mask out sign
        vmovsd  xmm2, qword ptr [rip + .LCPI2_2] ; PI/2
        vcmpltsd xmm0, xmm2, xmm0
        vandpd  xmm0, xmm0, xmm2
        vorpd   xmm0, xmm1, xmm0
        ret
```

Compiler explorer link, plus the manual impl: https://godbolt.org/z/dv14nn39x (as an aside, there's an easily-avoidable `vmovq xmm1, xmm0` there; perhaps from the inline assembly workaround messing with register allocation?)
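
The bitwise identity the report relies on can be sketched as follows: when the magnitude operand is known to have a zero sign bit, `copysign(magnitude, x)` reduces to OR-ing in the sign bit of `x`, with no need to mask the magnitude first. The function names are hypothetical, the sketch assumes IEEE-754 binary64 `double`, and it is not the manual implementation from the Compiler Explorer link.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// copysign for a magnitude whose sign bit is already known to be zero.
// Precondition (assumed, not checked): std::signbit(magnitude) == false.
double copysign_nonneg(double magnitude, double sign_source) {
  std::uint64_t m, s;
  std::memcpy(&m, &magnitude, sizeof m);
  std::memcpy(&s, &sign_source, sizeof s);
  const std::uint64_t sign_mask = 0x8000000000000000ULL;
  const std::uint64_t r = m | (s & sign_mask);  // no AND on the magnitude needed
  double out;
  std::memcpy(&out, &r, sizeof out);
  return out;
}

// Same computation as native() in the report, with the constant pi/2 written out.
double native_bitwise(double x) {
  const double half_pi = 1.5707963267948966;  // bit pattern 0x3ff921fb54442d18
  const double bound = std::fabs(x) > half_pi ? half_pi : 0.0;
  return copysign_nonneg(bound, x);
}
```

Because `bound` is either pi/2 or +0.0, the precondition always holds in `native_bitwise`.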




[llvm-bugs] [Bug 136299] Windows Release

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136299




Summary

Windows Release




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  retro-dwsz
  




Why is this kind of release missing in Clang 20?

![Image](https://github.com/user-attachments/assets/ec4ef941-1d4d-4998-bbba-4fffae372e7e)

Or is it fine to use distribution from https://github.com/mstorsjo/llvm-mingw/releases ?




[llvm-bugs] [Bug 136028] Using basic loop unrolling concept

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136028




Summary

Using basic loop unrolling concept




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  mahmoodn
  




Hi
I have written a simple loop like this
```
#include <stdio.h>
int main() {
    int sum = 0;
    #pragma clang loop unroll(full)
    for (int i = 0; i < 4; i++) {
        sum += i;
    }
    printf("Sum is %d\n", sum);
    return 0;
}
```
and used 
```
clang -O0 -emit-llvm -S -Xclang -disable-O0-optnone loop.c -o loop.ll
```
to generate the IR format. The content of loop.ll file is
```
; ModuleID = 'loop.c'
source_filename = "loop.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [11 x i8] c"Sum is %d\0A\00", align 1

; Function Attrs: noinline nounwind uwtable
define dso_local i32 @main() #0 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 0, ptr %1, align 4
  store i32 0, ptr %2, align 4
  store i32 0, ptr %3, align 4
  br label %4

4: ; preds = %11, %0
  %5 = load i32, ptr %3, align 4
  %6 = icmp slt i32 %5, 4
  br i1 %6, label %7, label %14

7: ; preds = %4
  %8 = load i32, ptr %3, align 4
  %9 = load i32, ptr %2, align 4
  %10 = add nsw i32 %9, %8
  store i32 %10, ptr %2, align 4
  br label %11

11: ; preds = %7
  %12 = load i32, ptr %3, align 4
  %13 = add nsw i32 %12, 1
  store i32 %13, ptr %3, align 4
  br label %4, !llvm.loop !6

14: ; preds = %4
  %15 = load i32, ptr %2, align 4
  %16 = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %15)
  ret i32 0
}

declare i32 @printf(ptr noundef, ...) #1

attributes #0 = { noinline nounwind uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 2}
!5 = !{!"clang version 21.0.0git (https://github.com/llvm/llvm-project d0c973a7a0149db3b71767d4c5a20a31e6a8ed5b)"}
!6 = distinct !{!6, !7, !8}
!7 = !{!"llvm.loop.mustprogress"}
!8 = !{!"llvm.loop.unroll.full"}

``` 

Then I used
```
opt -passes='loop-unroll' -S loop.ll -o loop_unrolled.ll
```
to run the unroll pass. I expected to see four add instructions and no `br`, since the loop should have been unrolled. However, the contents of loop.ll and loop_unrolled.ll are exactly the same, and there is no sign that the loop was unrolled. I understand that the loop may be too small to meet LLVM's unrolling requirements. On the other hand, I specified the `#pragma` directive, so I expected LLVM to be forced to unroll the loop regardless.

I am using clang version 21.0.0git. Any idea about that?
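
To make the expectation concrete, here is a source-level sketch of what "fully unrolled" means for this loop. It is an illustration written by hand, not output from `opt`, and the function names are hypothetical.

```cpp
// The loop as written in the report.
int sum_loop() {
  int sum = 0;
  for (int i = 0; i < 4; i++) {
    sum += i;
  }
  return sum;
}

// Hand-unrolled equivalent: four additions and no loop-back branch.
int sum_unrolled() {
  int sum = 0;
  sum += 0;
  sum += 1;
  sum += 2;
  sum += 3;
  return sum;
}
```

Both functions compute the same value; the second form is the shape expected in loop_unrolled.ll.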




[llvm-bugs] [Bug 136199] Missing `strlen` in libc for offloaded target nvptx64

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136199




Summary

Missing `strlen` in libc for offloaded target nvptx64




  Labels
  
libc
  



  Assignees
  
  



  Reporter
  
  KaruroChori
  




This code https://godbolt.org/z/5oqE4Es6q fails at link time because `strlen` cannot be resolved, even though it is a basic libc function.

```cpp
#include <stdio.h>
#include <string.h>
int main(int argc, const char* argv[]){
    char str[256];
    #pragma omp target map(tofrom:str)
    {
        printf("%ld\n",strlen(str));
    }
    return 0;
}
```

Is there any good reason why strlen is not available?




[llvm-bugs] [Bug 136245] Missing offloaded libraries in `deb` distribution of llvm

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136245




Summary

Missing offloaded libraries in `deb` distribution of llvm




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  KaruroChori
  




Initially discussed in https://github.com/llvm/llvm-project/issues/136199, where this was identified as the underlying issue.
While the LLVM packages distributed via https://apt.llvm.org/ support OpenMP and offloading for several architectures, basic libraries like libc are not properly distributed for the offloaded targets.





[llvm-bugs] [Bug 136369] [clang-format] AlignAfterOpenBracket and Cpp11BracedListStyle option combination leads to inconsistent formatting

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136369




Summary

[clang-format] AlignAfterOpenBracket and Cpp11BracedListStyle option combination leads to inconsistent formatting




  Labels
  
clang-format
  



  Assignees
  
  



  Reporter
  
  mrizaln
  




Given a C++ file with following content

```cpp
struct S {
  int a;
  int b;
};

int aaa(S s);
int (S s);

namespace f {
int a(S s);
int bb(S s);
} // namespace f

int main() {
  auto a = S{ 1, 2 };
  aaa({ 1, 2 });
  ({ 1, 2 });
  f::a({ 1, 2 });
  f::bb({ 1, 2 });

  aaa({
 1,
  2,
  });

  auto i = aaa({
  .a = 1,
  .b = 2,
 });

  ({
  1,
  2,
  });

  auto j = ({
  .a = 1,
  .b = 2,
  });

  f::a({
  1,
  2,
  });

  auto k = f::a({
  .a = 1,
  .b = 2,
  });

  f::bb({
  1,
 2,
  });

  auto l = f::bb({
  .a = 1,
  .b = 2,
 });
}
```

when it is formatted using clang-format (20.1.2) with the following config file,

```yml
AlignAfterOpenBracket: BlockIndent
Cpp11BracedListStyle: false
```

the result of the formatting is inconsistent:

```cpp
struct S {
  int a;
  int b;
};

int aaa(S s);
int (S s);

namespace f {
int a(S s);
int bb(S s);
} // namespace f

int main() {
  auto a = S{ 1, 2 };
  aaa({ 1, 2 });
 ({ 1, 2 });
  f::a({ 1, 2 });
  f::bb({ 1, 2 });

  aaa({
  1,
 2,
  });

  auto i =
  aaa({
  .a = 1,
  .b = 2,
  });

  (
  {
  1,
  2,
  }
 );

  auto j = (
  {
  .a = 1,
  .b = 2,
 }
  );

  f::a(
  {
  1,
  2,
  }
  );

 auto k = f::a(
  {
  .a = 1,
  .b = 2,
  }
 );

  f::bb(
  {
  1,
  2,
  }
  );

  auto l = f::bb(
  {
  .a = 1,
  .b = 2,
  }
 );
}
```

The bug happens specifically when `AlignAfterOpenBracket` is set to `BlockIndent` and `Cpp11BracedListStyle` is set to false. It also happens when `AlignAfterOpenBracket` is set to `AlwaysBreak` and `Cpp11BracedListStyle` is set to false, though the result is different:

```cpp
struct S {
  int a;
  int b;
};

int aa(S s);
int (S s);

namespace f {
int a(S s);
int bb(S s);
} // namespace f

int main() {
  auto a = S{ 1, 2 };
  aa({ 1, 2 });
  ({ 1, 2 });
  f::a({ 1, 2 });
  f::bb({ 1, 2 });

  aa({
  1,
  2,
 });

  auto i =
  aa({
  .a = 1,
  .b = 2,
 });

  (
  {
  1,
  2,
  });

  auto j = (
  {
  .a = 1,
  .b = 2,
  });

  f::a(
 {
  1,
  2,
  });

  auto k = f::a(
  {
 .a = 1,
  .b = 2,
  });

  f::bb(
  {
 1,
  2,
  });

  auto l = f::bb(
  {
  .a = 1,
  .b = 2,
  });
}
```

The expected result should be the same as the input.
Tested on clang-format 20.1.2 (Fedora 20.1.2-3.fc42)




[llvm-bugs] [Bug 136289] [libc++] __bit reference: error: cannot add 'abi_tag' attribute in a redeclaration

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136289




Summary

[libc++] __bit reference: error: cannot add 'abi_tag' attribute in a redeclaration




  Labels
  
libc++
  



  Assignees
  
  



  Reporter
  
  T-Maxxx
  




- Clang: 20.1.3 (built from sources on tag)
- Using libc++
- C++20
- Affected by the Modules feature: the error appeared when implementing one of the base classes as a module.

```
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/chrono:1009:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/__chrono/formatter.h:29:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/__chrono/ostream.h:40:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/sstream:323:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/__ostream/basic_ostream.h:27:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/bitset:137:
/home/tmax/programs/clang-20/bin/../include/c++/v1/__bit_reference:189:31: error: cannot add 'abi_tag' attribute in a redeclaration
  189 | _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI __bit_iterator<_Cp, false> __copy_aligned(
  | ^
/home/tmax/programs/clang-20/bin/../include/c++/v1/__config:536:22: note: expanded from macro '_LIBCPP_HIDE_FROM_ABI'
  536 | __attribute__((__abi_tag__(_LIBCPP_TOSTRING(_LIBCPP_ODR_SIGNATURE
 | ^
/home/tmax/programs/clang-20/bin/../include/c++/v1/__bit_reference:986:67: note: previous declaration is here
  986 |   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend __bit_iterator<_Dp, false> __copy_aligned(
  | ^
```
Another one:
```
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/vector:315:
In file included from /home/tmax/programs/clang-20/bin/../include/c++/v1/__vector/vector_bool.h:17:
/home/tmax/programs/clang-20/bin/../include/c++/v1/__bit_reference:189:31: error: cannot add 'abi_tag' attribute in a redeclaration
  189 | _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI __bit_iterator<_Cp, false> __copy_aligned(
  | ^
/home/tmax/programs/clang-20/bin/../include/c++/v1/__config:536:22: note: expanded from macro '_LIBCPP_HIDE_FROM_ABI'
  536 | __attribute__((__abi_tag__(_LIBCPP_TOSTRING(_LIBCPP_ODR_SIGNATURE
 | ^
/home/tmax/programs/clang-20/bin/../include/c++/v1/__bit_reference:986:67: note: previous declaration is here
  986 |   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend __bit_iterator<_Dp, false> __copy_aligned(
```




[llvm-bugs] [Bug 136302] Basic-aa incorrectly reports NoAlias for equivalent constant pointers

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136302




Summary

Basic-aa incorrectly reports NoAlias for equivalent constant pointers




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  GINN-Imp
  




Testcase: https://godbolt.org/z/5Ghx98qdY
Since both `@ptr1` and `@ptr2` point to the same constant (`@global_var`), the result should be MustAlias or at least MayAlias—not NoAlias.
```llvm
@global_var = constant i32 0
@ptr1 = constant ptr @global_var
@ptr2 = constant ptr @global_var

define void @foo() {
entry:
  %p = load ptr, ptr @ptr1, align 8
  %q = load ptr, ptr @ptr2, align 8
  ret void
}
```
opt (trunk) -aa-pipeline=basic-aa -passes='aa-eval' -print-all-alias-modref-info:
```
Function: foo: 2 pointers, 0 call sites
  NoAlias:	ptr* @ptr1, ptr* @ptr2
```
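
As a reading aid, the two levels of indirection in the test case can be mirrored in C++. In this hand-written analogue (an assumption for illustration, not derived from LLVM), the addresses of the pointer variables correspond to `@ptr1` and `@ptr2` as listed in the output above, while the values stored in them correspond to `%p` and `%q`.

```cpp
const int global_var = 0;
const int* const ptr1 = &global_var;
const int* const ptr2 = &global_var;

// Addresses of the two pointer variables themselves (what @ptr1 / @ptr2 name in the IR).
bool pointer_objects_are_distinct() {
  return static_cast<const void*>(&ptr1) != static_cast<const void*>(&ptr2);
}

// Values loaded from the two variables (what %p / %q hold in the IR).
bool loaded_values_are_equal() {
  return ptr1 == ptr2;
}
```

The sketch only separates the two questions; it does not settle which pair of pointers the reported query is meant to cover.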

We've done our best to ensure this IR contains no undefined behavior. If there is any UB we're missing, we’d appreciate clarification.

Thanks!




[llvm-bugs] [Bug 136395] New RISC-V failure on SPEC CPU 2017 523.xalancbmk_r

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136395




Summary

New RISC-V failure on SPEC CPU 2017 523.xalancbmk_r




  Labels
  
miscompilation
  



  Assignees
  
  



  Reporter
  
  lukel97
  




This was detected on rva22u64 -O3 -flto on https://lnt.lukelau.me/db_default/v4/nts/447

Introduced somewhere before 62b9cbd8782b2ded15efed67ae10419e75ea0fa7 and after f8ea2ed59820a0bef3f23638ce7a5d10165f7109




[llvm-bugs] [Bug 136374] Arm Neoverse scheduling models have a way to large decode bandwidth (about 2x of the actual)

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136374




Summary

Arm Neoverse scheduling models have a way to large decode bandwidth (about 2x of the actual)




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  camel-cdr
  




I noticed that the Arm Neoverse scheduling models have a decode bandwidth that is far too large: https://godbolt.org/z/54hPqeqdK

I tested how many independent adds llvm-mca thinks the cores can decode per cycle and compared it with the actual decode with:

* CPU: llvm-mca vs Arm-Software-Optimization-Guide "4.1 Dispatch constraints"
* Neoverse-V1: 15 vs 8
* Neoverse-V2: 16 vs 8
* Neoverse-V3: 16 vs 10
* Neoverse-N1: 8 vs 4
* Neoverse-N2: 10 vs 5
* Neoverse-N3: 10 vs 5

The decode/issue width currently used in the scheduling models seems to correspond to the number of uops that can be processed, not the number of MOPs that are decoded or read from the opcache.
Still, unless the cores are capable of fusing independent additions, they shouldn't be able to decode the instructions this quickly.

Here is a code snippet where the additional decode capabilities cause an impossible result: https://godbolt.org/z/GbGrKWxsq
Here llvm-mca reports that the V1 executes a 13-instruction loop at 13 IPC, even though it should only be able to decode up to 8 instructions per cycle.
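The mismatch can be checked with simple arithmetic: a loop cannot sustain more instructions per cycle than the front end decodes. Below is a minimal C++ sketch using the figures from the list above. The struct and function names are hypothetical and are not part of llvm-mca or the scheduling models.

```cpp
#include <array>
#include <string_view>

// Figures copied from the list above. Names are illustrative only.
struct DecodeWidth {
  std::string_view core;
  int modeled;     // independent adds per cycle according to llvm-mca
  int documented;  // "4.1 Dispatch constraints" in the optimization guide
};

constexpr std::array<DecodeWidth, 6> kNeoverse{{
    {"Neoverse-V1", 15, 8},
    {"Neoverse-V2", 16, 8},
    {"Neoverse-V3", 16, 10},
    {"Neoverse-N1", 8, 4},
    {"Neoverse-N2", 10, 5},
    {"Neoverse-N3", 10, 5},
}};

// Lower bound on steady-state cycles per loop iteration imposed by decode.
constexpr double minCyclesPerIteration(int instructions, int decodeWidth) {
  return static_cast<double>(instructions) / decodeWidth;
}

// A reported IPC is only plausible if it does not exceed the decode width.
constexpr bool isFeasibleIpc(double ipc, int decodeWidth) {
  return ipc <= decodeWidth;
}
```

With these figures, a 13-instruction loop on the V1 needs at least 13/8 = 1.625 cycles per iteration under the documented width of 8, so 13 IPC is only reachable under the modeled width of 15.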




[llvm-bugs] [Bug 136406] [Flang][OpenACC] Assertion `mlir::isa(baseAddr.getType()) && "expected pointer-like"' failed.

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136406




Summary

[Flang][OpenACC] Assertion `mlir::isa(baseAddr.getType()) && "expected pointer-like"' failed.




  Labels
  
mlir,
flang
  



  Assignees
  
  



  Reporter
  
  k-arrows
  




The crash is reproducible on Godbolt. The reproducer here is reduced from https://github.com/gcc-mirror/gcc/blob/master/libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90.
See https://godbolt.org/z/cxzK8Y3vs
```f90
program main
  implicit none (type, external)
  character(len=:), allocatable :: my_str

  my_str = "1234567890"
  call foo_str(my_str)
  deallocate (my_str)
contains
  subroutine foo_str(str)
    integer :: i
    character(len=*) :: str

    !$acc parallel copyout(str)
    str = "abcdefghij"
    !$acc end parallel
  end
end
```

With assertion-enabled flang, the reproducer hits the following assertion:
```txt
/path_to_project/llvm-project/flang/lib/Lower/OpenACC.cpp:164: Op createDataEntryOp(fir::FirOpBuilder &, mlir::Location, mlir::Value, std::stringstream &, mlir::SmallVector, bool, bool, mlir::acc::DataClause, mlir::Type, llvm::ArrayRef, llvm::ArrayRef, llvm::ArrayRef, bool, mlir::Value) [Op = mlir::acc::CreateOp]: Assertion `mlir::isa(baseAddr.getType()) && "expected pointer-like"' failed.
```




[llvm-bugs] [Bug 136408] [DirectX] Initializing an HLSL vector with a function call results in an assert

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136408




Summary

[DirectX] Initializing an HLSL vector with a function call results in an assert




  Labels
  
new issue,
HLSL
  



  Assignees
  
  



  Reporter
  
  farzonl
  




The assert is in CodeGenFunction::getOrCreateOpaqueRValueMapping
https://github.com/llvm/llvm-project/blob/ee4c8b556c5cf42c55ce9540bbb0e29c11894a71/clang/lib/CodeGen/CGExpr.cpp#L5645

The code to trigger the assert is below.
```hlsl
uint GetInputState(uint x) {
  return x;
}

export uint4 fn() {
  uint4 counter = { GetInputState(0), GetInputState(1), GetInputState(2), GetInputState(3) };
  return counter;
}
```

If you instead initialize `counter` with constants,
`uint4 counter = { 0, 1, 2, 3 };`

codegen succeeds:
https://hlsl.godbolt.org/z/1TYo61nxE

## Crash dump
```gdb
  * frame #0: 0x000103eeb77c clang-dxc`clang::CodeGen::CodeGenFunction::getOrCreateOpaqueRValueMapping(this=0x00016fdf1298, e=0x00012c91a6f8) at CGExpr.cpp:5664:3
frame #1: 0x000103f87a80 clang-dxc`(anonymous namespace)::ScalarExprEmitter::VisitOpaqueValueExpr(this=0x00016fdee048, E=0x00012c91a6f8) at CGExprScalar.cpp:545:16
frame #2: 0x000103f814cc clang-dxc`clang::StmtVisitorBase::Visit(this=0x00016fdee048, S=0x00012c91a6f8) at StmtNodes.inc:196:1
frame #3: 0x000103f77b3c clang-dxc`(anonymous namespace)::ScalarExprEmitter::Visit(this=0x00016fdee048, E=0x00012c91a6f8) at CGExprScalar.cpp:449:52
frame #4: 0x000103f88fb8 clang-dxc`(anonymous namespace)::ScalarExprEmitter::VisitInitListExpr(this=0x00016fdee048, E=0x00012c91a7b8) at CGExprScalar.cpp:2143:19
frame #5: 0x000103f816e8 clang-dxc`clang::StmtVisitorBase::Visit(this=0x00016fdee048, S=0x00012c91a7b8) at StmtNodes.inc:358:1
frame #6: 0x000103f77b3c clang-dxc`(anonymous namespace)::ScalarExprEmitter::Visit(this=0x00016fdee048, E=0x00012c91a7b8) at CGExprScalar.cpp:449:52
frame #7: 0x000103f77920 clang-dxc`clang::CodeGen::CodeGenFunction::EmitScalarExpr(this=0x00016fdf1298, E=0x00012c91a7b8, IgnoreResultAssign=false) at CGExprScalar.cpp:5748:8
 frame #8: 0x000103e89e18 clang-dxc`clang::CodeGen::CodeGenFunction::EmitScalarInit(this=0x00016fdf1298, init=0x00012c91a7b8, D=0x00012c91a2b0, lvalue=LValue @ 0x00016fdeeb28, capturedByInit=false) at CGDecl.cpp:784:15
frame #9: 0x000103e91518 clang-dxc`clang::CodeGen::CodeGenFunction::EmitExprAsInit(this=0x00016fdf1298, init=0x00012c91a7b8, D=0x00012c91a2b0, lvalue=LValue @ 0x00016fdef020, capturedByInit=false) at CGDecl.cpp:2093:5
frame #10: 0x000103e8dd30 clang-dxc`clang::CodeGen::CodeGenFunction::EmitAutoVarInit(this=0x00016fdf1298, emission=0x00016fdef410) at CGDecl.cpp:2045:12
frame #11: 0x000103e885a4 clang-dxc`clang::CodeGen::CodeGenFunction::EmitAutoVarDecl(this=0x00016fdf1298, D=0x00012c91a2b0) at CGDecl.cpp:1333:3
frame #12: 0x000103e8794c clang-dxc`clang::CodeGen::CodeGenFunction::EmitVarDecl(this=0x00016fdf1298, D=0x00012c91a2b0) at CGDecl.cpp:225:10
frame #13: 0x000103e870bc clang-dxc`clang::CodeGen::CodeGenFunction::EmitDecl(this=0x00016fdf1298, D=0x00012c91a2b0, EvaluateConditionDecl=true) at CGDecl.cpp:166:5
 frame #14: 0x0001041f0304 clang-dxc`clang::CodeGen::CodeGenFunction::EmitDeclStmt(this=0x00016fdf1298, S=0x00012c91a830) at CGStmt.cpp:1674:5
frame #15: 0x0001041e6bbc clang-dxc`clang::CodeGen::CodeGenFunction::EmitSimpleStmt(this=0x00016fdf1298, S=0x00012c91a830, Attrs=ArrayRef @ 0x00016fdef6f0) at CGStmt.cpp:515:5
frame #16: 0x0001041e5874 clang-dxc`clang::CodeGen::CodeGenFunction::EmitStmt(this=0x00016fdf1298, S=0x00012c91a830, Attrs=ArrayRef @ 0x00016fdef860) at CGStmt.cpp:65:7
frame #17: 0x0001041f1958 clang-dxc`clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(this=0x00016fdf1298, S=0x00012c91c2e0, GetLast=false, AggSlot=AggValueSlot @ 0x00016fdefb38) at CGStmt.cpp:622:7
frame #18: 0x0001041f021c clang-dxc`clang::CodeGen::CodeGenFunction::EmitCompoundStmt(this=0x00016fdf1298, S=0x00012c91c2e0, GetLast=false, AggSlot=AggValueSlot @ 0x00016fdefcc8) at CGStmt.cpp:573:10
frame #19: 0x0001041e6ba4 clang-dxc`clang::CodeGen::CodeGenFunction::EmitSimpleStmt(this=0x00016fdf1298, S=0x00012c91c2e0, Attrs=ArrayRef @ 0x00016fdefc80) at CGStmt.cpp:512:5
frame #20: 0x0001041e5874 clang-dxc`clang::CodeGen::CodeGenFunction::EmitStmt(this=0x00016fdf1298, S=0x00012c91c2e0, Attrs=ArrayRef @ 0x00016fdefdf0) at CGStmt.cpp:65:7
frame #21: 0x0001041e73f8 clang-dxc`clang::CodeGen::CodeGenFunction::EmitIfStmt(this=0x00016fdf1298, S=0x00012c91c340) at CGStmt.cpp:974:5
frame #22: 0x0001041e5b64 cl
```

[llvm-bugs] [Bug 136409] [DirectX] `createTypedBufferLoad` trys to replace a vector load with a scalar float extractValue insttruction in DXILResourceAccess.cpp

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136409




Summary

[DirectX] `createTypedBufferLoad` trys to replace a vector load with a scalar float extractValue insttruction in DXILResourceAccess.cpp




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  farzonl
  




If we need to replace the load and the types differ, we can't use `replaceAllUsesWith`.

## Location of Assert
https://github.com/llvm/llvm-project/blob/ee4c8b556c5cf42c55ce9540bbb0e29c11894a71/llvm/lib/Target/DirectX/DXILResourceAccess.cpp#L146

## Type differences
```gdb
expr V->dump()
  %32 = extractvalue { float, i1 } %31, 0
(lldb) expr *V->getType()
(llvm::Type) $1 = {
  Context = 0x00013f8229c0
  ID = FloatTyID
  SubclassData = 0
  NumContainedTys = 0
  ContainedTys = nullptr
}

(lldb) expr LI->dump()
  %33 = load <1 x float>, ptr %30, align 4
(lldb) expr *LI->getType()
(llvm::Type) $2 = {
 Context = 0x00013f8229c0
  ID = FixedVectorTyID
  SubclassData = 0
 NumContainedTys = 1
  ContainedTys = 0x00012904ad10
```

## Crash Dump
```gdb
* thread #1, queue = 'com.apple.main-thread', stop reason = instruction step into
  * frame #0: 0x000102054088 clang-dxc`llvm::Value::replaceAllUsesWith(this=0x000128850150, New=0x00016fdf3ea8) at Value.cpp:533
frame #1: 0x000100110174 clang-dxc`createTypedBufferLoad(II=0x000128850150, LI=0x00012883b2d0, Offset=0x, RTI=0x000129013408) at DXILResourceAccess.cpp:146:7
frame #2: 0x00010010ef38 clang-dxc`createLoadIntrinsic(II=0x000128850150, LI=0x00012883b2d0, Offset=0x, RTI=0x000129013408) at DXILResourceAccess.cpp:167:12
frame #3: 0x00010010e0a4 clang-dxc`replaceAccess(II=0x000128850150, RTI=0x000129013408) at DXILResourceAccess.cpp:224:7
frame #4: 0x00010010d9c8 clang-dxc`transformResourcePointers(F=0x000128804ac8, DRTM=0x000128814740) at DXILResourceAccess.cpp:249:5
frame #5: 0x000100110b04 clang-dxc`(anonymous namespace)::DXILResourceAccessLegacy::runOnFunction(this=0x00012883e150, F=0x000128804ac8) at DXILResourceAccess.cpp:278:12
frame #6: 0x000101f5feb4 clang-dxc`llvm::FPPassManager::runOnFunction(this=0x000128859e70, F=0x000128804ac8) at LegacyPassManager.cpp:1406:27
frame #7: 0x000101f66a48 clang-dxc`llvm::FPPassManager::runOnModule(this=0x000128859e70, M=0x00012ee0d2e0) at LegacyPassManager.cpp:1452:16
frame #8: 0x000101f60774 clang-dxc`(anonymous namespace)::MPPassManager::runOnModule(this=0x00012880e360, M=0x00012ee0d2e0) at LegacyPassManager.cpp:1521:27
frame #9: 0x000101f602e8 clang-dxc`llvm::legacy::PassManagerImpl::run(this=0x000129010600, M=0x00012ee0d2e0) at LegacyPassManager.cpp:539:44
frame #10: 0x000101f66e50 clang-dxc`llvm::legacy::PassManager::run(this=0x00016fdf47e8, M=0x00012ee0d2e0) at LegacyPassManager.cpp:1648:14
frame #11: 0x000103b41860 clang-dxc`(anonymous namespace)::EmitAssemblyHelper::RunCodegenPipeline(this=0x00016fdf4c60, Action="" OS=llvm::raw_pwrite_stream @ 0x00012ee0cfb0, DwoOS=nullptr) at BackendUtil.cpp:1244:19
frame #12: 0x000103b32000 clang-dxc`(anonymous namespace)::EmitAssemblyHelper::emitAssembly(this=0x00016fdf4c60, Action="" OS=llvm::raw_pwrite_stream @ 0x00012ee0cfb0, BC=0x00012ee0d160) at BackendUtil.cpp:1268:3
frame #13: 0x000103b3151c clang-dxc`clang::emitBackendOutput(CI=0x00013f8235b0, CGOpts=0x00014002be18, TDesc=(Data = "" Length = 78), M=0x00012ee0d2e0, Action="" VFS=IntrusiveRefCntPtr @ 0x00016fdf4f68, OS=nullptr, BC=0x00012ee0d160) at BackendUtil.cpp:1433:13
frame #14: 0x0001042aac7c clang-dxc`clang::BackendConsumer::HandleTranslationUnit(this=0x00012ee0d160, C=0x00012f01c600) at CodeGenAction.cpp:316:3
frame #15: 0x000106aca0b8 clang-dxc`clang::ParseAST(S=0x00012f03d800, PrintStats=false, SkipFunctionBodies=false) at ParseAST.cpp:184:13
frame #16: 0x000104e2175c clang-dxc`clang::ASTFrontendAction::ExecuteAction(this=0x00013f829060) at FrontendAction.cpp:1345:3
frame #17: 0x0001042b06f0 clang-dxc`clang::CodeGenAction::ExecuteAction(this=0x00013f829060) at CodeGenAction.cpp::30
frame #18: 0x000104e20fd4 clang-dxc`clang::FrontendAction::Execute(this=0x00013f829060) at FrontendAction.cpp:1227:3
frame #19: 0x000104d3ac18 clang-dxc`clang::CompilerInstance::ExecuteAction(this=0x00013f8235b0, Act=0x00013f829060) at CompilerInstance.cpp:1056:33
frame #20: 0x000104f5dbf4 clang-dxc`clang::ExecuteCompilerInvocation(Clang=0x00013f8235b0) at ExecuteCompilerInvocation.cpp:300:25
frame #21: 0x000100014a60 clang-dxc`cc1_main(Argv=ArrayRef @ 0x00016fdf5ea8, Argv0="/Users/farzonlotfi/Projects/llvm_debug_build/bin/clang-20", M
```

[llvm-bugs] [Bug 136410] [LLDB] An inconsistency between step-by-step debugging and breakpoint debugging at O2

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136410




Summary

[LLDB] An inconsistency between step-by-step debugging and breakpoint debugging at O2




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  Apochens
  




Clang version:

```
Ubuntu clang version 21.0.0 (++20250415033808+d0e4af8a88dc-1~exp1~20250415153924.2354)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-21/bin
```

LLDB version: `lldb version 21.0.0`

Inconsistency-triggering program `test.c`:
```c
int printf(const char *, ...);
int a[2][9] = {
  {0, 1, 2, 3, 4, 5, 6, 7, 8},
  {0, 1, 2, 3, 4, 5, 6, 7, 8}
};
int main(int argc, char* argv[]) {
  int b, c, d = 0;
  if (argc == 0)
    d = 1;
  for (b = 0; b < 2; b++) {
    for (c = 0; c < 9; c++) {
      printf("%d\n", a[b][c]);   // set breakpoint at this line
      if (d)
        printf("index\n");
    }
  }
}
```
Compilation command: `clang -g -O2 test.c -o test`

When stepping through the program (i.e., `s`), the debugger stops at the marked line, but it never stops there when a breakpoint is set on that line directly (i.e., `b 12`).

@jimingham Apologies for bothering you again. Could the root cause of this inconsistency be the same as in #136089?




[llvm-bugs] [Bug 136344] [HLSL] Fix failing SPIR-V backend tests that specify --target-env vulkan1.3

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136344




Summary

[HLSL] Fix failing SPIR-V backend tests that specify --target-env vulkan1.3




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  kmpeng
  




## Problem
When SPIRV-Tools is enabled, tests that specify the target environment `vulkan1.3` in the validation step fail.

## Temporary fix
This PR is a stopgap to get the pipeline green again:
https://github.com/llvm/llvm-project/pull/136343

## Long term fix
The current suggestion from @s-perron is to change the tests to not have external functions or variables, e.g. making the functions shader entry points. More discussion is likely needed.




[llvm-bugs] [Bug 136342] [LLVM] llvm.fptosi.sat.* and llvm.fptoui.sat.* generate suboptimal code in some cases on x86

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136342




Summary

[LLVM] llvm.fptosi.sat.* and llvm.fptoui.sat.* generate suboptimal code in some cases on x86




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  johnplatts
  




Here is a snippet that demonstrates suboptimal code generation for llvm.fptosi.sat.* and llvm.fptoui.sat.* on x86 with SSE2 or AVX512F enabled (but without AVX10.2), in the case where floating-point exceptions are masked and ignored:
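For reference, the semantics these intrinsics must implement, as described in the LLVM Language Reference, can be modeled in C++ as follows. This is a sketch of the scalar `f32` to `i32` signed case only. It is not the reporter's snippet and not LLVM's lowering, and the function name is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Reference model of llvm.fptosi.sat.i32.f32: NaN maps to 0, out-of-range
// values clamp to the nearest representable integer, and everything else
// is rounded toward zero.
std::int32_t fptosiSatI32(float x) {
  if (std::isnan(x))
    return 0;
  // 2^31 is exactly representable as a float; INT32_MAX is not, so the
  // comparison is made against 2^31.
  if (x >= 2147483648.0f)
    return std::numeric_limits<std::int32_t>::max();
  if (x <= -2147483648.0f)
    return std::numeric_limits<std::int32_t>::min();
  return static_cast<std::int32_t>(x);  // in range, so the cast is defined
}
```

The NaN check and the two range checks are the work a plain `cvttss2si` does not do by itself, which is why the quality of the surrounding fix-up code matters.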




[llvm-bugs] [Bug 136324] [mlir] LinalgOps::regionBuilder function should return ParseResult

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136324




Summary

[mlir] LinalgOps::regionBuilder function should return ParseResult




  Labels
  
mlir
  



  Assignees
  
  



  Reporter
  
  hiraditya
  




The current signature of regionBuilder functions is 
```
using RegionBuilderFn = llvm::function_ref<void(ImplicitLocOpBuilder &, Block &, ArrayRef<NamedAttribute>)>;
```

This makes it impossible to notify the caller if anything goes wrong (e.g., https://github.com/llvm/llvm-project/issues/132740). I'm curious to know if there are any objections to modifying this signature.
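To illustrate the difference, here is a minimal, self-contained C++ sketch of a callback that reports failure to its caller. It does not use MLIR: `Block`, `BuildResult`, `CheckedRegionBuilderFn`, and `fillRegion` are hypothetical stand-ins, and `std::function` replaces `llvm::function_ref` so the example compiles without LLVM headers.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the MLIR types involved.
struct Block {
  std::vector<std::string> ops;
};
enum class BuildResult { Success, Failure };

// Shape analogous to the current signature: nothing can be reported back.
using VoidRegionBuilderFn = std::function<void(Block &)>;

// Shape analogous to the proposal: the builder returns a status.
using CheckedRegionBuilderFn = std::function<BuildResult(Block &)>;

// The caller can now react to a failed build instead of continuing with a
// half-populated region.
BuildResult fillRegion(Block &block, const CheckedRegionBuilderFn &build) {
  if (build(block) == BuildResult::Failure) {
    block.ops.clear();  // discard whatever was partially built
    return BuildResult::Failure;
  }
  return BuildResult::Success;
}
```

With the `void` shape, the only options inside the builder are to assert or to leave the region in an inconsistent state. A returned status lets the parser surface a diagnostic instead.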

cc: @banach-space @javedabsar 




[llvm-bugs] [Bug 136347] `IRNormalizer`'s registry name `normalize` mismatches documentation

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136347




Summary

`IRNormalizer`'s registry name `normalize` mismatches documentation




  Labels
  
  



  Assignees
  
  



  Reporter
  
  mhjacobson
  




Commit 2e9f8696e9533fdd464e025bd504302fa1a22f14 introduced a pass `IRNormalizer`, whose name in PassRegistry.def is `normalize`. However, the documentation was updated to state:

> ir-normalizer: Transforms IR into a normal form that’s easier to diff

The docs should match the implementation.




[llvm-bugs] [Bug 136353] [WebAssembly] Incorrect stackification of effectful instructions

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136353




Summary

[WebAssembly] Incorrect stackification of effectful instructions




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  SingleAccretion
  




Reading the stackification code, I came across this in `WebAssemblyRegStackify.cpp / static void query`:
```
// These instructions have hasUnmodeledSideEffects() returning true
// because they trap on overflow and invalid so they can't be arbitrarily
// moved, however in the specific case of register stackifying, it is safe
// to move them because overflow and invalid are Undefined Behavior.
```
This prompted me to think about the following example:
```llvm
target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
target triple = "wasm32-unknown-wasi"

declare i32 @extern_func(i32, i32)

define i32 @func(i32 %0, i32 %1) {
  %call_value = call i32 @extern_func(i32 %0, i32 %1)
  %div = udiv i32 %0, %1
  %sub = sub i32 %div, %call_value
  ret i32 %sub
}
```
Running this through `llc` we get:
```
> llc test.ll -o test.o --filetype=obj -O1 && wasm-objdump -d test.o

test.o: file format wasm 0x1

Code Disassembly:

55 func[1] :
 56: 20 00  | local.get 0
 58: 20 01 | local.get 1
 5a: 6e | i32.div_u
 5b: 20 00  | local.get 0
 5d: 20 01 | local.get 1
 5f: 10 80 80 80 80 00  | call 0 
 65: 6b | i32.sub
 66: 0b | end
```
The `udiv` and the call get swapped. This does not look correct, since we have now _introduced UB_ (a trap) when both of the following conditions hold:
1) The divisor (`%1`) is zero.
2) `extern_func` throws (either a WASM exception or a JS exception).
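The observable difference between the two orderings can be modeled in plain C++. This is only a sketch under stated assumptions: `divU` stands in for `i32.div_u` and models the trap as a C++ exception, `externFunc` is a hypothetical callee that always throws, and all names are illustrative.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Models a WebAssembly trap and an exception thrown by the callee.
struct Trap : std::runtime_error {
  using std::runtime_error::runtime_error;
};
struct HostException : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Stand-in for i32.div_u, which traps on a zero divisor.
std::uint32_t divU(std::uint32_t a, std::uint32_t b) {
  if (b == 0)
    throw Trap("integer divide by zero");
  return a / b;
}

// Hypothetical stand-in for @extern_func: always throws.
std::uint32_t externFunc(std::uint32_t, std::uint32_t) {
  throw HostException("extern_func threw");
}

// Order written in the LLVM IR: call first, then udiv.
std::uint32_t irOrder(std::uint32_t a, std::uint32_t b) {
  std::uint32_t callValue = externFunc(a, b);
  std::uint32_t div = divU(a, b);
  return div - callValue;
}

// Order in the emitted WebAssembly: div first, then call.
std::uint32_t emittedOrder(std::uint32_t a, std::uint32_t b) {
  std::uint32_t div = divU(a, b);
  std::uint32_t callValue = externFunc(a, b);
  return div - callValue;
}

// Reports which outcome a caller would observe.
std::string observe(std::uint32_t (*f)(std::uint32_t, std::uint32_t),
                    std::uint32_t a, std::uint32_t b) {
  try {
    f(a, b);
    return "returned";
  } catch (const Trap &) {
    return "trap";
  } catch (const HostException &) {
    return "exception";
  }
}
```

With a zero divisor, `observe(irOrder, 1, 0)` yields `"exception"` because the division is never reached, while `observe(emittedOrder, 1, 0)` yields `"trap"`.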




[llvm-bugs] [Bug 136357] [flang][OpenMP] flaky firstprivate/lastprivate behavior due to misplaced barriers

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136357




Summary

[flang][OpenMP] flaky firstprivate/lastprivate behavior due to misplaced barriers




  Labels
  
flang
  



  Assignees
  
  



  Reporter
  
  eugeneepshteyn
  




(Thanks go to @ebaskakov for doing the actual investigation.)

The variables in firstprivate/lastprivate don't get set properly in some cases, resulting in flaky behavior. 

Consider the following test:
```
  implicit none
  integer :: i, n
  logical :: first
  first = .true.
  n = 42
  !$omp parallel do firstprivate(n) firstprivate(first) lastprivate(n)
  do i=1,10
    if (first) then
      if (n/=42) stop -1*n
      first = .false.
    end if
    n = 100
  end do
  !$omp end parallel do
  if (n/=100) stop -2*n
  print *,'passed'
end
```

Using the following flang on x86_64:
```
$ flang --version
flang version 21.0.0git (https://github.com/eugeneepshteyn/llvm-project.git 31ddaef8d18d643ff4c343d03ddfe2edae7d22a2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Build config: +unoptimized, +assertions
```
... results in the following behavior:
```
$ ./a.out 
Fortran STOP: code -100

Fortran STOP: code -100

Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

$ ./a.out 
Fortran STOP: code -100

Fortran STOP: code -100

$ ./a.out 
 passed
```
It seems that `n` is already set to 100, while it should still have the value of 42.

Looking at LLVM IR output of `flang -fopenmp -g -O0 -S -emit-llvm firstprivate.f90`:

Pointers to `first` and `n` are passed in a two-field structure and loaded at entry:
```
define internal void @_QQmain..omp_par(ptr noalias %tid.addr, ptr noalias %zero.addr, ptr %0) #1 !dbg !26 {
omp.par.entry:
  %gep_ = getelementptr { ptr, ptr }, ptr %0, i32 0, i32 0
  %loadgep_ = load ptr, ptr %gep_, align 8, !align !29
  %gep_1 = getelementptr { ptr, ptr }, ptr %0, i32 0, i32 1
  %loadgep_2 = load ptr, ptr %gep_1, align 8, !align !29
...
```
Barrier:
```
omp.par.region1: ; preds = %omp.par.region
  %omp_global_thread_num2 = call i32 @__kmpc_global_thread_num(ptr @4)
  call void @__kmpc_barrier(ptr @3, i32 %omp_global_thread_num2)
  br label %omp.private.init, !dbg !30

omp.private.init: ; preds = %omp.par.region1
  br label %omp.private.copy, !dbg !30
```
... but then the values for `first` and `n` are loaded via `loadgep_*` pointers:
```
omp.private.copy: ; preds = %omp.private.init
  %2 = load i32, ptr %loadgep_, align 4, !dbg !31
  store i32 %2, ptr %omp.private.alloc, align 4, !dbg !31
  %3 = load i32, ptr %loadgep_2, align 4, !dbg !32
  store i32 %3, ptr %omp.private.alloc4, align 4, !dbg !32
  br label %omp.wsloop.region, !dbg !32

```
Since the load happens after the barrier, some threads could have already set the value of `n` to 100.

It seems that the loads of the values from the passed structure should happen before the barrier.
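The intended ordering (copy in, then barrier, then write back) can be sketched with plain C++ threads. This models the synchronization pattern only, not flang's actual lowering. The function name, the spin barrier, and the choice of the last thread as the one that runs the final iteration are all assumptions made for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Returns the firstprivate copy of n that each thread observed.
std::vector<int> runCopyInBeforeBarrier(int numThreads) {
  std::atomic<int> sharedN{42};  // the original variable n
  std::atomic<int> arrived{0};   // single-use spin barrier counter
  std::vector<int> observed(numThreads, 0);
  std::vector<std::thread> workers;

  for (int tid = 0; tid < numThreads; ++tid) {
    workers.emplace_back([&, tid] {
      // firstprivate copy-in: read the original BEFORE synchronizing.
      int privateN = sharedN.load();
      observed[tid] = privateN;

      // Barrier: wait until every thread has completed its copy-in.
      arrived.fetch_add(1);
      while (arrived.load() < numThreads)
        std::this_thread::yield();

      // Loop body assigns n = 100; the thread that runs the last
      // iteration performs the lastprivate write-back.
      privateN = 100;
      if (tid == numThreads - 1)
        sharedN.store(privateN);
    });
  }
  for (auto &worker : workers)
    worker.join();
  return observed;
}
```

Because no thread can reach the write-back until all threads have passed the barrier, every entry of the returned vector is 42. Moving the `sharedN.load()` after the barrier, as in the IR above, reintroduces the race.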




[llvm-bugs] [Bug 136358] Implement `sigsetjmp` and `siglongjmp`

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136358




Summary

Implement `sigsetjmp` and `siglongjmp`




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  SchrodingerZhu
  




- linux
  - [ ] x86-64
  - [ ] aarch64
  - [ ] riscv
- macos
  - [ ] x86-64
  - [ ] aarch64




[llvm-bugs] [Bug 136105] [clang-tidy] False positive bugprone-use-after-move with std::tie

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136105




Summary

[clang-tidy] False positive bugprone-use-after-move with std::tie




  Labels
  
clang-tidy,
false-positive
  



  Assignees
  
  



  Reporter
  
  dmpolukhin
  




Minimal reproducer https://godbolt.org/z/PoboWj4PM:
```c++
#include <string>
#include <tuple>
#include <utility>

std::pair<std::string, std::string> foo(std::string a, std::string b) {
  return std::make_pair(std::move(a), std::move(b));
}

void bar(std::string a, std::string b) {
  std::tie(a, b) = foo(std::move(a), std::move(b));
  std::tie(a, b) = foo(std::move(a), std::move(b));
}

/**
test.cpp:11:34: warning: 'a' used after it was moved [bugprone-use-after-move]
   11 |   std::tie(a, b) = foo(std::move(a), std::move(b));
      |                                  ^
test.cpp:10:24: note: move occurred here
   10 |   std::tie(a, b) = foo(std::move(a), std::move(b));
      |                        ^
test.cpp:11:48: warning: 'b' used after it was moved [bugprone-use-after-move]
   11 |   std::tie(a, b) = foo(std::move(a), std::move(b));
      |                                                ^
test.cpp:10:38: note: move occurred here
   10 |   std::tie(a, b) = foo(std::move(a), std::move(b));
      |                                      ^
**/
```






[llvm-bugs] [Bug 136373] Placeholder used in parameter name should conflict with placeholder used for automatic variable

2025-04-18 Thread LLVM Bugs via llvm-bugs


Issue

136373




Summary

Placeholder used in parameter name should conflict with placeholder used for automatic variable




  Labels
  
  



  Assignees
  
cor3ntin
  



  Reporter
  
  shafik
  




Given [CWG3005](http://wg21.link/cwg3005) we should reject the following:

```cpp
void f(int _) {
  int _; // error
}
```

godbolt: https://godbolt.org/z/1fa4njchc

