[llvm-bugs] [Bug 143472] BOLT instrument failed on Aarch64

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143472




Summary

BOLT instrument failed on Aarch64




  Labels
  
BOLT
  



  Assignees
  
  



  Reporter
  
  qianwindz
  




![Image](https://github.com/user-attachments/assets/2de973b4-a0d6-41a2-8ce0-e7847d9bc44f)
I built BOLT on AArch64 to optimize AArch64 shared libraries. I am currently unable to use perf to collect profile data, so I used `llvm-bolt -instrument -o` instead, but the error above occurred. What is the reason for this, and is there a solution?


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


[llvm-bugs] [Bug 143469] Compiler extension for Rust-like using syntax

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143469




Summary

Compiler extension for Rust-like using syntax




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  mikomikotaishi
  




I do not know if requests/suggestions for compiler extensions are accepted, but I would like to suggest adding Rust-style `using` statements as a language extension for Clang. What I mean by that is, in Rust the `use` statement has much more versatility and ergonomics than in C++. Example:
```rust
use std::env::Args;
use std::fs::{File, FileType};
use std::io; 
use std::os::{linux::{fs::MetadataExt, process::PidFd}, unix::net::Incoming, windows::{io::BorrowedSocket, ffi::EncodeWide}};
use std::thread::*;
```
If we wanted to load the same symbols into scope in C++, the equivalent code in ISO C++ (if C++ had the same standard library structure as Rust), would be:
```cpp
using std::env::Args;
using std::fs::File;
using std::fs::FileType;
namespace io = std::io;
using std::os::linux::fs::MetadataExt;
using std::os::linux::process::PidFd;
using std::os::unix::net::Incoming;
using std::os::windows::io::BorrowedSocket;
using std::os::windows::ffi::EncodeWide;
using namespace std::thread;
```
The proposed syntax in C++ (again assuming the C++ standard library was equivalent to the Rust standard library), would instead be:
```rust
using std::env::Args;
using std::fs::{File, FileType};
using std::io; 
using std::os::{linux::{fs::MetadataExt, process::PidFd}, unix::net::Incoming, windows::{io::BorrowedSocket, ffi::EncodeWide}};
using std::thread::*;
```
For instance, with the actual C++ standard library, we could write `using std::chrono::*;` instead of `using namespace std::chrono;`. Or we could write `using std::ranges;` and then do:
```cpp
using std::ranges;

// We can write ranges::min() instead of std::ranges::min()
int minimum = ranges::min({33, 54, 13, 802, 7, 61});
```
Again, I do not know if this is an appropriate place to request/suggest compiler/language extensions, but such a feature would be greatly beneficial for improving ergonomics especially as `using` statements are much more viable with the addition of modules to C++, which do not export `using` statements by default unlike headers.
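For comparison, here is a minimal sketch of what ISO C++ already offers for the two `std::chrono` cases mentioned above: a namespace alias standing in for the proposed `using std::chrono;`, and a using-directive standing in for the proposed `using std::chrono::*;`. The helper names `whole_minutes` and `to_millis` are hypothetical and exist only for illustration.

```cpp
#include <chrono>

// Namespace alias: the closest standard spelling of the proposed `using std::chrono;`.
namespace chrono = std::chrono;

// Hypothetical helper: whole minutes contained in a number of seconds.
long whole_minutes(long secs) {
    // Using-directive: the standard spelling of the proposed `using std::chrono::*;`.
    using namespace std::chrono;
    return static_cast<long>(duration_cast<minutes>(seconds(secs)).count());
}

// Hypothetical helper: the alias lets us write chrono::seconds
// instead of std::chrono::seconds.
long to_millis(long secs) {
    return static_cast<long>(
        chrono::duration_cast<chrono::milliseconds>(chrono::seconds(secs)).count());
}
```

Both spellings work today, but each needs its own statement per namespace, which is the verbosity the nested-brace syntax is meant to remove.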




[llvm-bugs] [Bug 143473] [libcxxabi][llvm-cxxfilt] Demangler produces bad output for templated `operator<`

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143473




Summary

[libcxxabi][llvm-cxxfilt] Demangler produces bad output for templated `operator<`




  Labels
  
  



  Assignees
  
  



  Reporter
  
  jeremy-rifkin
  




llvm-cxxfilt and __cxa_demangle demangle `_ZltI1SEvT_S1_` as `void operator<<S>(S, S)`. Specifically, the `operator<<S>` part can be problematic for tooling attempting to parse these names, since this should be tokenized as `operator` `<<` `S` `>`. While it is possible to disambiguate some cases, I think it would be better to demangle this specific case as `void operator< <S>(S, S)`.




[llvm-bugs] [Bug 143454] [libc][errno] Deprecate `LIBC_ERRNO_MODE_SYSTEM` in favor of `LIBC_ERRNO_MODE_SYSTEM_INLINE`.

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143454




Summary

[libc][errno] Deprecate `LIBC_ERRNO_MODE_SYSTEM` in favor of `LIBC_ERRNO_MODE_SYSTEM_INLINE`.




  Labels
  
libc
  



  Assignees
  
  



  Reporter
  
  lntue
  




In `LIBC_ERRNO_MODE_SYSTEM`, we still create a temporary global object `libc_errno` and then point it to the system libc `errno`.  To use the system libc `errno`, it would be more efficient to make `libc_errno` a macro defined as `errno`, just as in `LIBC_ERRNO_MODE_SYSTEM_INLINE`.  It also allows us to skip linking against the `libc.src.errno.errno` target.
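The difference between the two modes can be sketched roughly as follows. This is an illustration under assumptions, not the actual LLVM libc code: `ErrnoProxy`, `proxy_errno`, `inline_errno`, and the two setter functions are hypothetical names standing in for the real `libc_errno` definitions.

```cpp
#include <cerrno>

// Indirect style (sketch of the LIBC_ERRNO_MODE_SYSTEM idea): a separate
// global object that forwards every access to the system errno.
struct ErrnoProxy {
    void operator=(int v) { errno = v; }
    operator int() const { return errno; }
};
static ErrnoProxy proxy_errno;  // hypothetical stand-in for `libc_errno`

// Inline style (sketch of the LIBC_ERRNO_MODE_SYSTEM_INLINE idea): the name
// is just a macro for errno, so no extra object has to be defined or linked.
#define inline_errno errno  // hypothetical stand-in for `libc_errno`

// Both spellings end up writing the same system errno.
int set_via_proxy(int v) { proxy_errno = v; return errno; }
int set_via_macro(int v) { inline_errno = v; return errno; }
```

With the macro form there is no object whose definition must come from a separate target, which is the linking benefit described above.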




[llvm-bugs] [Bug 143456] [AVX2] SAD pattern detection is too strict

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143456




Summary

[AVX2] SAD pattern detection is too strict




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  adworacz
  




Reference code: [Zig Godbolt](https://github.com/llvm/llvm-project/commit/6f879d9eb1a111a0c99f2a69e4ad30b220f4926a)

Some opportunities for producing optimized sum of absolute differences (SAD) calculations are being missed. It looks like [prior support for this was overly restrictive](https://github.com/llvm/llvm-project/commit/6f879d9eb1a111a0c99f2a69e4ad30b220f4926a).

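Essentially, the absolute difference is being calculated explicitly (`vpminub`/`vpmaxub`/`vpsubb`) and only then summed with `vpsadbw` against zero, when the whole operation should just be handled by the dedicated SAD instruction.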

Here's the code inline:

```zig
const block_width = 8;
const T = u8;
const VT = @Vector(block_width, T);

export fn sad(noalias srcp: [*]const u8, noalias refp: [*]const u8, height: usize, stride: usize) u32 {
    const src = srcp[0..height * stride];
    const ref = refp[0..height * stride];

    var sum: u32 = 0;

    const s: VT = src[0*stride..][0..block_width].*;
    const r: VT = ref[0*stride..][0..block_width].*;

    // Should work, but doesn't.
    //const absdiff = @max(s,r) - @min(s,r);
    //sum += @reduce(.Add, absdiff);

    // Should work, but doesn't
    //const VTI = @Vector(block_width, i16);
    //sum += @reduce(.Add, @abs(@as(VTI, s) - @as(VTI, r)));

    // Does work
    const VTI = @Vector(block_width, i32);
    sum += @reduce(.Add, @abs(@as(VTI, s) - @as(VTI, r)));

    return sum;
}
```

Which produces:

```asm
sad:
        push    rbp
        mov     rbp, rsp
        vmovq   xmm0, qword ptr [rdi]
        vmovq   xmm1, qword ptr [rsi]
        vpminub xmm2, xmm0, xmm1
        vpmaxub xmm0, xmm0, xmm1
        vpxor   xmm1, xmm1, xmm1
        vpsubb  xmm0, xmm0, xmm2
        vpsadbw xmm0, xmm0, xmm1
        vpextrb eax, xmm0, 0
        pop     rbp
        ret
```

But it should be:

```asm
sad:
        push    rbp
        mov     rbp, rsp
        vmovq   xmm0, qword ptr [rdi]
        vmovq   xmm1, qword ptr [rsi]
        vpsadbw xmm0, xmm0, xmm1
        vmovd   eax, xmm0
        pop     rbp
        ret
```
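As a scalar reference, the formulations in the Zig snippet can be sketched in C++ as below. This is only an illustration of why they are interchangeable; the names `kBlockWidth`, `sad_minmax`, and `sad_widen` are made up here and do not come from the report.

```cpp
#include <cstdint>
#include <cstdlib>

constexpr int kBlockWidth = 8;

// Form 1: per-byte max minus min, staying in u8
// (corresponds to the commented-out @max/@min variant).
uint32_t sad_minmax(const uint8_t* s, const uint8_t* r) {
    uint32_t sum = 0;
    for (int i = 0; i < kBlockWidth; ++i) {
        uint8_t hi = s[i] > r[i] ? s[i] : r[i];
        uint8_t lo = s[i] < r[i] ? s[i] : r[i];
        sum += static_cast<uint8_t>(hi - lo);
    }
    return sum;
}

// Form 2: widen to a signed type, subtract, take the absolute value
// (corresponds to the i16 and i32 variants).
uint32_t sad_widen(const uint8_t* s, const uint8_t* r) {
    uint32_t sum = 0;
    for (int i = 0; i < kBlockWidth; ++i)
        sum += static_cast<uint32_t>(std::abs(int(s[i]) - int(r[i])));
    return sum;
}
```

Since both forms compute the same sum for every input, one would hope that each of them could be recognized as a single SAD operation.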




[llvm-bugs] [Bug 143440] [analyzer] Static Analysis runs out of memory on a tiny test-case

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143440




Summary

[analyzer] Static Analysis runs out of memory on a tiny test-case




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  wjristow
  




The test-case shown below demonstrates a problem with Static Analysis running out of memory.  This is reduced from a large C++20 test-case reported to us, although the reduced test-case here is valid C++17 (or C++20) code.

I've tested with a modern compiler from `main`: 5d6218d311854a0b5d48ae19636f6abe1e67fc69 (llvmorg-21-init-14792-g5d6218d31185).

Setting a 60GB virtual memory limit:
```
$ ulimit -v 6000
$
```

shows the problem:
```
$ clang++ --version | grep ^clang
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 5d6218d311854a0b5d48ae19636f6abe1e67fc69)
$ time clang++ --analyze -std=c++17 test.cpp
LLVM ERROR: out of memory
Allocation failed
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.  Program arguments: clang++ --analyze -std=c++17 test.cpp
1.   parser at end of file
2.  While analyzing stack:
#0 Calling Class3::operator=(const Class3 &) at line 77
 #1 Calling Class4::Struct8::operator=(const Struct8 &) at line 89
 #2 Calling Class4::Struct8::m_f2(const Struct8 &) at line 90
#3 Calling Class4::Struct10::m_f3(const Struct10 &) at line 101
#4 Calling Class4::m_f4()
3.  test.cpp:56:7: Error evaluating statement
 #0 0x55b69b1027bf llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/warren/llvm/current/bin/clang+++0x47647bf)
 #1 0x55b69b1002b4 llvm::sys::CleanupOnSignal(unsigned long) (/home/warren/llvm/current/bin/clang+++0x47622b4)
 #2 0x55b69b03f538 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x7fccefa11420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
...
#35 0x55b697df4672 clang_main(int, char**, llvm::ToolContext const&) (/home/warren/llvm/current/bin/clang+++0x1456672)
#36 0x55b697c93e9b main (/home/warren/llvm/current/bin/clang+++0x12f5e9b)
#37 0x7fccef4dd083 __libc_start_main /build/glibc-B3wQXB/glibc-2.31/csu/../csu/libc-start.c:342:3
#38 0x55b697dee72e _start (/home/warren/llvm/current/bin/clang+++0x145072e)
clang++: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 21.0.0git (https://github.com/llvm/llvm-project.git 5d6218d311854a0b5d48ae19636f6abe1e67fc69)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/warren/llvm/current/bin
Build config: +assertions
clang++: note: diagnostic msg:


PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang++: note: diagnostic msg: /tmp/test-59716f.cpp
clang++: note: diagnostic msg: /tmp/test-59716f.sh
clang++: note: diagnostic msg:


1m11.19s real 0m52.79s user 0m17.48s system
$
```
As noted in the comments in the test-case, suppressing various (unused) members results in the static analysis completing using about 4 GB.  Suppressing one particular unused member makes it complete essentially immediately (not using much memory at all):
```
$ time clang++ --analyze -std=c++17 -DUSE_4GB test.cpp
0m57.16s real 0m54.30s user 0m02.84s system
$ time clang++ --analyze -std=c++17 -DAVOID_PROBLEM test.cpp
0m00.16s real 0m00.02s user 0m00.03s system
$

```

I've bisected the appearance of this issue for the reduced test-case to 6194229c6287fa5d8a9a30aa489bcbb8a9754ae0 (llvmorg-16-init-8141-g6194229c6287).
Pinging @tomasz-kaminski-sonarsource 

I've verified that if I revert that commit in `main` (and also the immediately preceding commit, which that commit depends on), then my reduced test-case here does work fine.  But the original full test-case still runs out of memory with that llvm16-era work reverted from the head of `main`.  (I can't easily try the original full test-case with a compiler from that llvm16-era, because that test-case is C++20 code, and it uses C++20 constructs that weren't supported by Clang back then.)  And of course I'm not suggesting that reverting that work is the right thing to do -- I'm just pointing this out for reference.

In any case, here is the reduced test-case:

```
// When running the Static Analyzer on this test-case, in C++17 mode, it runs
// out of memory in a few minutes:
//  clang++ --analyze -std=c++17 test.cpp
// As an aside, in C++20 mode, it takes a very long time (more than an hour),
// and uses a massive amount of memory (around 55GB), but it does eventually
// complete.
//
// If any of the lines declaring `x1`, `x2`, `x3`, ... `x9` are commented out,
// then the `--analyze` run completes.  For most of them, t
```

[llvm-bugs] [Bug 143425] [flang] Implicit ASYNCHRONOUS attribute constraint is not diagnosed.

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143425




Summary

[flang] Implicit ASYNCHRONOUS attribute constraint is not diagnosed.




  Labels
  
flang:frontend
  



  Assignees
  
  



  Reporter
  
  DanielCChen
  




Consider the following code:
```

INTERFACE
SUBROUTINE ExplicitShapeArray(ioUnit, arrayShapeExplicit)
integer :: ioUnit
complex, asynchronous, dimension( 10 ) :: arrayShapeExplicit
END SUBROUTINE ExplicitShapeArray
END INTERFACE

complex, pointer, dimension( : ) :: ptrComplexArray
 complex, target, dimension( 10 ) :: complexArray


ptrComplexArray => complexArray
open(1226, asynchronous='yes', action=""
 access='stream', form='unformatted')


do i = 1, 10
 read(1226, asynchronous='yes') complexArray
end do


call ExplicitShapeArray(1226, ptrComplexArray)


close( 1226 )

END


SUBROUTINE ExplicitShapeArray(ioUnit, arrayShapeExplicit)
 integer :: ioUnit
complex, asynchronous, dimension( 10 ) :: arrayShapeExplicit


END SUBROUTINE ExplicitShapeArray
```

This code violates Constraint C1549
```
C1549 (R1524) If an actual argument is an array pointer that has the ASYNCHRONOUS or VOLATILE
attribute but does not have the CONTIGUOUS attribute, and the corresponding dummy argument
has either the ASYNCHRONOUS or VOLATILE attribute, but does not have the VALUE attribute,
that dummy argument shall be an array pointer, an assumed-shape array without the CONTIGUOUS
attribute, or an assumed-rank entity without the CONTIGUOUS attribute.
```

`ptrComplexArray` is implicitly given the `asynchronous` attribute, so the call to `ExplicitShapeArray` should be diagnosed.




[llvm-bugs] [Bug 143397] Incorrect optimization of multiple __builtin_unreachable() conditions leads to logic errors in control flow

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143397




Summary

Incorrect optimization of multiple __builtin_unreachable() conditions leads to logic errors in control flow




  Labels
  
  



  Assignees
  
  



  Reporter
  
  hutuhutong
  




This problem exists in x86_64 Clang 18 and 20.
Clang exhibits incorrect optimization behavior when handling multiple (two or more) `__builtin_unreachable()` statements.

When only a single `__builtin_unreachable()` is used, the code executes correctly. However, when two or more `__builtin_unreachable()` statements are present, the program behaves correctly under -O0, but under -O1/-O2/-O3/-Os, Clang incorrectly folds the entire `test_builtin_unreachable()` function, leading to an infinite loop at runtime.

This suggests that the optimizer does not correctly account for the interactions of multiple unreachable paths during optimization.

The code:
#include 
#include 

void test_output() {
printf("the code is executing\n");
}

void test_builtin_unreachable() {
int bb = 2;
if ((bb & ~3) != 0)
__builtin_unreachable();
if ((bb & 1) == 0)
__builtin_unreachable();
if (bb == 2)
printf("the value of bb is: %d\n", bb);
}

int main() {
test_output();
test_builtin_unreachable();
return 0;
}

The output:
$ clang -O0 test.c -o test
$ ./test
the code is executing
the value of bb is: 2

$ clang -O1 test.c -o test
$ ./test
the code is executing
the code is executing
Segmentation fault (core dumped)

The assembly code:
With -O1, the function `test_builtin_unreachable` is empty, so the message from `test_output` keeps being printed:

test_output:
lea rdi, [rip + .Lstr]
jmp puts@PLT

test_builtin_unreachable:

main:
push rax
lea rdi, [rip + .Lstr]
call puts@PLT

.Lstr:
.asciz  "the code is executing"




[llvm-bugs] [Bug 143378] [lldb] Extend lldb-test to emit Symtab data as well.

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143378




Summary

[lldb] Extend lldb-test to emit Symtab data as well.




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  DhruvSrivastavaX
  




Right now, lldb-test can only emit data up to the binary file sections.
We can extend it to emit symbol table data as well, preferably behind an option.





[llvm-bugs] [Bug 143414] [IndVarSimplify] LFTR Narrows IV to an Odd Sized Integer

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143414




Summary

[IndVarSimplify] LFTR Narrows IV to an Odd Sized Integer




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  veera-efficient
  




Godbolt: https://c.godbolt.org/z/MejP96PP9

Discourse: https://discourse.llvm.org/t/why-does-indvarsimplify-narrow-induction-variable-to-an-odd-sized-integer/86753

`linearFunctionTestReplace()` narrows the induction variable in the given example to an `i5`. But this can be problematic if the architecture doesn't support arbitrary sized integers.




[llvm-bugs] [Bug 143386] [AArch64] Expanding reductions for scalable vectors is undefined.

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143386




Summary

[AArch64]  Expanding reductions for scalable vectors is undefined.




  Labels
  
backend:AArch64
  



  Assignees
  
  



  Reporter
  
  banach-space
  




**To reproduce:**
```bash
bin/llc -mtriple=aarch64-linux-gnu -mattr=+sve,+bf16 bug.ll
LLVM ERROR: Expanding reductions for scalable vectors is undefined.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
(...)
```

**Input IR**
```llvm
; ModuleID = 'bug.ll'
source_filename = "LLVMDialectModule"

; Function Attrs: nofree norecurse nosync nounwind memory(argmem: readwrite)
define { ptr, ptr, i64 } @kernel_sum_reduce(ptr readnone captures(none) %0, ptr readonly captures(none) %1, i64 %2, i64 %3, i64 %4, ptr readnone captures(none) %5, ptr readnone captures(none) %6, i64 %7, i64 %8, i64 %9, ptr readnone captures(none) %10, ptr readonly captures(none) %11, i64 %12, i64 %13, i64 %14, ptr readnone captures(none) %15, ptr readnone captures(none) %16, i64 %17, i64 %18, i64 %19, ptr readnone captures(none) %20, ptr readonly captures(none) %21, i64 %22, i64 %23, i64 %24, { [2 x i64], [5 x i64] } %25, ptr %26, ptr %27, i64 %28) local_unnamed_addr #2 {
  %30 = load bfloat, ptr %27, align 2
  %31 = load i64, ptr %1, align 4
  %32 = getelementptr inbounds nuw i8, ptr %1, i64 8
  %33 = load i64, ptr %32, align 4
  %34 = icmp slt i64 %31, %33
  br i1 %34, label %.lr.ph5, label %._crit_edge6

.lr.ph5:  ; preds = %29
  %35 = tail call i64 @llvm.vscale.i64()
  %36 = shl i64 %35, 1
  %.phi.trans.insert = getelementptr inbounds nuw i64, ptr %11, i64 %31
  %.pre = load i64, ptr %.phi.trans.insert, align 4
  br label %37

37: ; preds = %.lr.ph5, %._crit_edge
  %38 = phi i64 [ %.pre, %.lr.ph5 ], [ %43, %._crit_edge ]
  %39 = phi bfloat [ %30, %.lr.ph5 ], [ %57, %._crit_edge ]
  %40 = phi i64 [ %31, %.lr.ph5 ], [ %41, %._crit_edge ]
  %41 = add nsw i64 %40, 1
  %42 = getelementptr inbounds nuw i64, ptr %11, i64 %41
  %43 = load i64, ptr %42, align 4
  %44 = insertelement <vscale x 2 x bfloat> zeroinitializer, bfloat %39, i64 0
  %45 = icmp slt i64 %38, %43
  br i1 %45, label %.lr.ph, label %._crit_edge

.lr.ph:   ; preds = %37, %.lr.ph
  %46 = phi <vscale x 2 x bfloat> [ %54, %.lr.ph ], [ %44, %37 ]
  %47 = phi i64 [ %55, %.lr.ph ], [ %38, %37 ]
  %48 = sub i64 %43, %47
  %49 = tail call i64 @llvm.smin.i64(i64 %36, i64 %48)
  %50 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.whilelt.nxv2i1.i64(i64 0, i64 %49)
  %51 = getelementptr bfloat, ptr %21, i64 %47
  %52 = tail call <vscale x 2 x bfloat> @llvm.masked.load.nxv2bf16.p0(ptr %51, i32 2, <vscale x 2 x i1> %50, <vscale x 2 x bfloat> zeroinitializer)
  %53 = fadd <vscale x 2 x bfloat> %46, %52
  %54 = select <vscale x 2 x i1> %50, <vscale x 2 x bfloat> %53, <vscale x 2 x bfloat> %46
  %55 = add i64 %47, %36
  %56 = icmp slt i64 %55, %43
  br i1 %56, label %.lr.ph, label %._crit_edge

._crit_edge: ; preds = %.lr.ph, %37
  %.lcssa = phi <vscale x 2 x bfloat> [ %44, %37 ], [ %54, %.lr.ph ]
  %57 = tail call reassoc bfloat @llvm.vector.reduce.fadd.nxv2bf16(bfloat 0xR, <vscale x 2 x bfloat> %.lcssa)
  %58 = icmp slt i64 %41, %33
  br i1 %58, label %37, label %._crit_edge6

._crit_edge6: ; preds = %._crit_edge, %29
  %.lcssa3 = phi bfloat [ %30, %29 ], [ %57, %._crit_edge ]
  %59 = insertvalue { ptr, ptr, i64 } poison, ptr %26, 0
  %60 = insertvalue { ptr, ptr, i64 } %59, ptr %27, 1
  %61 = insertvalue { ptr, ptr, i64 } %60, i64 %28, 2
  store bfloat %.lcssa3, ptr %27, align 2
  ret { ptr, ptr, i64 } %61
}


; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare bfloat @llvm.vector.reduce.fadd.nxv2bf16(bfloat, <vscale x 2 x bfloat>) #5

; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(none)
declare <vscale x 2 x i1> @llvm.aarch64.sve.whilelt.nxv2i1.i64(i64, i64) #4

; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(argmem: read)
declare <vscale x 2 x bfloat> @llvm.masked.load.nxv2bf16.p0(ptr captures(none), i32 immarg, <vscale x 2 x i1>, <vscale x 2 x bfloat>) #6

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i64 @llvm.smin.i64(i64, i64) #8

attributes #2 = { nofree norecurse nosync nounwind memory(argmem: readwrite) }
```





[llvm-bugs] [Bug 143363] [DAG] Convert foldMaskedMerge to SDPatternMatch to match (m & x) | (~m & y)

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143363




Summary

[DAG] Convert foldMaskedMerge to SDPatternMatch to match (m & x) | (~m & y)




  Labels
  
good first issue,
llvm:SelectionDAG
  



  Assignees
  
  



  Reporter
  
  RKSimon
  




Now that foldMaskedMerge has been moved to DAGCombine in #137641, it can be refactored to use SDPatternMatch matchers.

This should allow us to remove foldMaskedMergeImpl entirely and not match all the commutation permutations separately but rely on the commutative SDPatternMatch m_Or/m_And/m_Not/m_Deferred matchers to find the pattern.
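For readers unfamiliar with the pattern, here is a plain C++ sketch of the masked merge and one of its commuted spellings. It deliberately does not use the SelectionDAG or SDPatternMatch APIs; the function names are hypothetical and only illustrate the operand orderings that a commutative matcher would need to treat as the same pattern.

```cpp
#include <cstdint>

// The masked-merge pattern from the issue title: take bits of x where the
// mask m is set, and bits of y where it is clear.
uint32_t masked_merge(uint32_t m, uint32_t x, uint32_t y) {
    return (m & x) | (~m & y);
}

// A commuted spelling of the same expression: the operands of the OR and of
// both ANDs are swapped, but the computed value is identical.
uint32_t masked_merge_commuted(uint32_t m, uint32_t x, uint32_t y) {
    return (y & ~m) | (x & m);
}
```

Because the OR and each AND can have their operands in either order, matching every spelling by hand means enumerating the permutations, which is the duplication the commutative matchers are expected to remove.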




[llvm-bugs] [Bug 143361] [BPF] `BTFDebug` doesn't generate BTF for all structs, if BPF map type is wrapped in Rust wrapper types

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143361




Summary

[BPF] `BTFDebug` doesn't generate BTF for all structs, if BPF map type is wrapped in Rust wrapper types




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  vadorovsky
  




This issue was detected while attempting to support BTF maps in Aya (aya-rs/aya#1117).

BTF map definitions have the following format in C:

```c
struct my_key {
  int a;
};

struct my_value {
  int a;
};

struct {
  int (*type)[BPF_MAP_TYPE_HASH];
 typeof(struct my_key) *key;
  typeof(struct my_value) *value;
  int (*max_entries)[10];
} map_1 SEC(".maps");
```

The `map_1` instance is then used as `void *` in libbpf functions like `bpf_map_lookup_elem`, `bpf_map_update_elem`, etc.

The key and value structs can be anything as long as they hold primitive/POD types and as long as they are aligned.

The program above produces the following BTF:

```
#0: 
#1:  --> [3]
#2:  'int' bits:32 off:0 enc:signed
#3:  n:1 idx-->[4] val-->[2]
#4:  '__ARRAY_SIZE_TYPE__' bits:32 off:0
#5:  --> [6]
#6:  'my_key' sz:4 n:1
#00 'a' off:0 --> [2]
#7:  --> [8]
#8:  'my_value' sz:4 n:1
#00 'a' off:0 --> [2]
#9:  --> [10]
#10:  n:10 idx-->[4] val-->[2]
#11:  '' sz:32 n:4
#00 'type' off:0 --> [1]
#01 'key' off:64 --> [5]
#02 'value' off:128 --> [7]
#03 'max_entries' off:192 --> [9]
#12:  'map_1' kind:global-alloc --> [11]
#13:  '.maps' sz:0 n:1
#00 off:0 sz:32 --> [12]
```

We can see both the map struct (`#11`) and the types used as key (`#6`) and value (`#8`).

However, in Rust, we want to wrap such map definitions in two wrapper types:

* A wrapper type representing a specific map type (e.g. `HashMap`, `RingBuf`), which provides methods (`get`, `update`), so people interact with those wrapper types instead of working with void pointers.
* Another wrapper type, which wraps the type above in `UnsafeCell`, so the Rust compiler doesn't complain about concurrent mutability and doesn't consider such action unsafe. It's basically a way of telling the compiler that we (Aya) guarantee that this type provides thread-safe mutability (and it does out of the box, because of RCU in the Linux kernel).

This ends up looking like:


```rust
#![no_std]
#![no_main]

pub const BPF_MAP_TYPE_HASH: usize = 1;

// The real map definition.
pub struct HashMapDef {
r#type: *const [i32; BPF_MAP_TYPE_HASH],
 key: *const K,
value: *const V,
max_entries: *const [i32; M],
 map_flags: *const [i32; F],
}
impl HashMapDef {
pub const fn new() -> Self {
Self {
 r#type: &[0i32; BPF_MAP_TYPE_HASH],
key: ::core::ptr::null(),
value: ::core::ptr::null(),
 max_entries: &[0i32; M],
map_flags: &[0i32; F],
}
 }
}
// Use `UnsafeCell` to allow the mutability by multiple threads.
pub struct HashMap(
 core::cell::UnsafeCell>,
);
impl HashMap {
pub const fn new() -> Self {
Self(core::cell::UnsafeCell::new(HashMapDef::new()))
 }
}
/// Tell Rust that `HashMap` is thread-safe.
unsafe impl Sync for HashMap {}

// Define custom structs for key and values.
pub struct MyKey(u32);
pub struct MyValue(u32);

#[link_section = ".maps"]
#[export_name = "HASH_MAP"]
pub static HASH_MAP: HashMap = HashMap::new();
```

However, for this Rust program, the BTF for `MyKey` and `MyValue` is `` and does not contain the actual `` definition.

The problem is reproducible only if the key and/or value type are custom structs.

I'm working on the fix, which is mostly ready, apart from an llvm-lit test that will make sure it doesn't regress in the future.




[llvm-bugs] [Bug 143360] [RISCV] -ffast-math issue on llvm 20.1.0

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143360




Summary

[RISCV] -ffast-math issue on llvm 20.1.0




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  amitch1999
  




Hi, 
Compiling this simple test, which is taken from GCC's regression tests, works on LLVM 18.1.7, but gets a link error on LLVM 20.1.0 (undefined reference to 'link_error').

Compiled using `clang test.c -O2 -ffast-math -lm`.
On both LLVM 20 and 18, compilation passes without the `-ffast-math` flag.
(My default march is `rv32imc`, but I tried it with a few different march values (32/64-bit, with and without `f`, with and without `d`) and got the same result.)
```
extern void link_error(void);

extern double atan(double);

int main()
{
  if (atan(1.0) < 0.78 || atan(1.0) > 0.79)
    link_error(); /* expected to be optimized out */

  return 0;
}
```

Is this expected?
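For context, the test only links if the compiler evaluates `atan(1.0)` at compile time and removes the call to `link_error`. A small run-time sketch of the same condition (the function name `atan_out_of_range` is made up for this illustration) shows that the branch is indeed never taken:

```cpp
#include <cmath>

// Mirrors the condition from the test: true only if atan(1.0) falls
// outside the interval [0.78, 0.79].
bool atan_out_of_range() {
    double v = std::atan(1.0);  // pi/4, roughly 0.7853981634
    return v < 0.78 || v > 0.79;
}
```

So the question comes down to whether the constant folding of `atan` still happens under `-ffast-math`.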




[llvm-bugs] [Bug 143368] Missed deadcode elimination when using `phi`

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143368




Summary

Missed deadcode elimination when using `phi`




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  GINN-Imp
  




The following reduced IR is derived from https://github.com/clap-rs/clap/blob/1036060f1319412d3d50d821a7b39a0a0122f0f7/clap_builder/src/parser/parser.rs#L49

Godbolt: https://godbolt.org/z/aedE1YjfK
alive2 proof: https://alive2.llvm.org/ce/z/Df2FFT

`opt -O3` works as expected when `phi` is replaced with simpler instructions.

```llvm
define void @_ZN12clap_builder6parser6parser6Parser16get_matches_with17h7b0a6fdc0204a479E(i64 %0, ptr %p) personality ptr null {
  %2 = xor i64 %0, 1
  switch i64 %2, label %3 [
    i64 0, label %common.ret
    i64 3, label %4
  ]

common.ret:
  ret void

3:
  br label %4

4:
  %.0 = phi i8 [ 0, %3 ], [ 1, %1 ]
  %cond2 = icmp eq i64 %0, 1
  br i1 %cond2, label %5, label %common.ret

5:
  store i8 %.0, ptr %p, align 1
  br label %common.ret
}
```

expected:
```llvm
define void @tgt(i64 %0, ptr %p) personality ptr null {
  ret void
}
```
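The reasoning behind the expected output can be sketched with a C++ model of the control flow in the reduced IR. This is a hand-written approximation, not generated from the IR; `stored_value` is a hypothetical name, and it returns -1 whenever the `store` would not execute.

```cpp
#include <cstdint>

// Models the reduced function: returns the byte that would be stored to %p,
// or -1 if the store is never reached for this input.
int stored_value(uint64_t a) {
    uint64_t x = a ^ 1;           // %2 = xor i64 %0, 1
    int phi;
    if (x == 0) {
        return -1;                // switch case 0: common.ret, i.e. a == 1
    } else if (x == 3) {
        phi = 1;                  // switch case 3: edge from the entry block
    } else {
        phi = 0;                  // default: through block 3
    }
    if (a == 1) {                 // %cond2 = icmp eq i64 %0, 1
        return phi;               // block 5: the store
    }
    return -1;                    // common.ret
}
```

The only input with `a == 1` already left through the first switch case, so the store is unreachable and the whole body can fold to `ret void`.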




[llvm-bugs] [Bug 143379] [X86_64] Incorrect value of local variable after longjmp with optimization

2025-06-09 Thread LLVM Bugs via llvm-bugs


Issue

143379




Summary

[X86_64] Incorrect value of local variable after longjmp with optimization




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  hutuhutong
  




Summary

When compiling the following C code with optimization level -O1, the local variable x exhibits unexpected behavior. In principle, under optimization levels -O1, -O2, and -O3, Clang should treat the allocation and modification of the variable x in a consistent manner. However, when the statement printf("Address of x: %p\n", (void*)&x); is included, a different result is observed at -O1, indicating inconsistent optimization behavior. This suggests that Clang handles the optimization of the printf call involving the address of x in a non-uniform way.



Details

test.c
#include 
#include 
#include 


void test_gamma() {
int x=0;
jmp_buf env;
if (setjmp(env)) {
 printf("Jumped here\n");
} else {
x = 42;
 longjmp(env, 1);
}
printf("Value of x: %d   Address of x: %p\n", x, (void*)&x);
}

void test_gamma1() {
int x=0;
jmp_buf env;
 if (setjmp(env)) {
printf("Jumped here\n");
} else {
 x = 42;
longjmp(env, 1);
}
int a = x*1;
 printf("Value of x: %d %d\n", x, a);
}

int main() {
 test_gamma();
test_gamma1();
return 0;
}

output
$ clang -O0 test.c -o test
$ ./test
Jumped here
Value of x: 42   Address of x: 0x7ffc7979655c
Jumped here
Value of x: 42 42

$ clang -O1 test.c -o test
$ ./test
Jumped here
Value of x: 42   Address of x: 0x7ffd6dc09afc
Jumped here
Value of x: 0 0

$ clang -O2 test.c -o test
$ ./test
Jumped here
Value of x: 0   Address of x: 0x7ffceb8ec25c
Jumped here
Value of x: 0 0


