date:20250116

[llvm-bugs] [Bug 123175] Mul reassociation in instcombine does not maintain NSW (for example impacting alias analysis negatively when canonicalizing GEP)

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123175




Summary

Mul reassociation in instcombine does not maintain NSW (for example impacting alias analysis negatively when canonicalizing GEP)




  Labels
  
llvm:instcombine
  



  Assignees
  
  



  Reporter
  
  bjope
  




Consider IR like this (also in godbolt here https://godbolt.org/z/rW8K8zfnc):

```
; RUN: opt -passes='aa-eval,instcombine,aa-eval' -print-all-alias-modref-info 

target datalayout = "p:16:16:16:16"

define i16 @foo1(i16 %x) {
%a = mul nsw nuw i16 %x, 2
%b = mul nsw nuw i16 %a, 3
ret i16 %b
}

define i16 @foo2(i16 %x) {
%a = mul nsw nuw i16 %x, 3
%b = mul nsw nuw i16 %a, 2
ret i16 %b
}

define ptr @foo3(i16 noundef %x, ptr noundef %p) {
%cmp = icmp sgt i16 %x, 0
call void @llvm.assume(i1 %cmp)
%a = mul nsw nuw i16 %x, 3
%idxprom = sext i16 %a to i64
%b = getelementptr inbounds i16, ptr %p, i64 %idxprom
store i16 2, ptr %b
store i16 1, ptr %p
ret ptr %b
}
```

It seems a bit inconsistent that instcombine for foo1 is able to keep "nsw nuw" on the simplified mul
```
%b = mul nuw nsw i16 %x, 6
```
while for foo2 nsw is dropped
```
%b = mul nuw i16 %x, 6
```
and for foo3 both nuw and nsw are dropped on the mul
```
  %b.idx = mul i16 %x, 6
  %b = getelementptr inbounds i8, ptr %p, i16 %b.idx
```

The foo3 example also shows that dropping "nsw" on the mul may impact alias analysis, as it is no longer able to derive NoAlias after instcombine.


I think one problem here is that InstCombinerImpl::SimplifyAssociativeOrCommutative only deals with Add/Sub when using the maintainNoSignedWrap helper.

Here (https://alive2.llvm.org/ce/z/RqG2pz) is an alive2 proof showing that we at least should be able to keep nsw on the second mul when doing "(A mul B) mul C" ==> "A mul (B mul C)", as long as the associated Mul operations are both "nsw nuw":
```
define i8 @src(i8 %a, i8 %b, i8 %c) {
%x = mul nsw nuw i8 %a, %b
%y = mul nsw nuw i8 %x, %c
ret i8 %y
}

define i8 @tgt(i8 %a, i8 %b, i8 %c) {
%x = mul i8 %c, %b
%y = mul nsw nuw i8 %x, %a
ret i8 %y
}
```
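As a rough cross-check of the Alive2 claim above, the same property can be brute-forced at a small bit width. This is only a sketch: the helper names (`to_signed`, `mul_ok`, `counterexamples`) are made up for illustration, poison is modelled simply as "the flagged mul overflows", and a 4-bit check does not replace the Alive2 proof.

```python
def to_signed(v, bits):
    """Interpret an unsigned bit pattern as a two's-complement signed value."""
    return v - (1 << bits) if v >= 1 << (bits - 1) else v


def mul_ok(x, y, bits, nuw, nsw):
    """Return True if `mul x, y` does not overflow under the requested flags."""
    if nuw and x * y >= 1 << bits:
        return False
    if nsw:
        s = to_signed(x, bits) * to_signed(y, bits)
        if not -(1 << (bits - 1)) <= s < 1 << (bits - 1):
            return False
    return True


def counterexamples(bits, src_nuw):
    """List (a, b, c) where src is well defined but tgt overflows or differs.

    src: (a * b) * c with nsw on both muls, plus nuw if src_nuw is set.
    tgt: (c * b) * a with a flag-free inner mul and the same flags on the
         outer mul as in src.
    """
    mask = (1 << bits) - 1
    bad = []
    for a in range(1 << bits):
        for b in range(1 << bits):
            for c in range(1 << bits):
                ab = (a * b) & mask
                if not (mul_ok(a, b, bits, src_nuw, True)
                        and mul_ok(ab, c, bits, src_nuw, True)):
                    continue  # src is poison, so tgt may be anything
                cb = (c * b) & mask  # inner mul in tgt carries no flags
                same_value = (cb * a) & mask == (ab * c) & mask
                if not (same_value and mul_ok(cb, a, bits, src_nuw, True)):
                    bad.append((a, b, c))
    return bad
```

With `src_nuw=True` the list comes back empty for 4-bit values, matching the proof. With `src_nuw=False` (nsw only) it is not empty: for example a = -1, b = 2, c = 4 is fine in src, but the inner `c * b` wraps to -8 in tgt and `-8 * -1` overflows.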

Maybe there are more situations when "nsw" can be kept given that (B mul C) simplifies, e.g. when all involved values are known to be non-negative?

PS. When using the reassociate pass instead of instcombine the result is even worse, since it drops both "nuw nsw" even for foo1.


___
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

[llvm-bugs] [Bug 123179] [clang-format] Macro formatting regression 19.1.6 vs 19.1.7

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123179




Summary

[clang-format] Macro formatting regression 19.1.6 vs 19.1.7




  Labels
  
clang-format
  



  Assignees
  
  



  Reporter
  
  chegoryu
  




For the following code:
```cpp
template
void write_to(Writer& writer, const FieldHeader& field_header) {
#define WRITE_MESSAGE(type) \
 { \
case FieldType::type: { \
  writer.value(#type); \
 writer.key("Message").start_object(); \
  write_to(writer, cast_to(field_header)); \
  writer.finish_object(); \
 return; \
} \
  }
}
```

```
Ubuntu clang-format version 19.1.7 (++20250114103238+cd708029e0b2-1~exp1~20250114103342.77)
```

Produces
```cpp
template
void write_to(Writer& writer, const FieldHeader& field_header) {
#define WRITE_MESSAGE(type) \
  {case FieldType::type: {writer.value(#type); \
  writer.key("Message").start_object(); \
 write_to(writer, cast_to(field_header)); \
  writer.finish_object(); \
  return; \
  } \
  }
}
```

But
```
Ubuntu clang-format version 19.1.6 (++20241217110052+657e03f8625c-1~exp1~20241217110110.73)
```
Does not change the file


`clang-format-19 file.cpp --dump-config`: https://pastebin.com/7vebyX6j



[llvm-bugs] [Bug 123189] [MASM] SIGSEGV in `checkForValidSection` in MasmParser

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123189




Summary

[MASM] SIGSEGV in `checkForValidSection` in MasmParser




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  MisterDA
  




I'm trying to cross-compile the OCaml compiler with a Debian host, targeting `x86_64-pc-windows` with `clang-cl`. I'm running into a segfault from `llvm-ml` (the MASM assembler), a drop-in replacement for Microsoft's `ml64`. I hit the issue with LLVM 18 and LLVM 20 (ea14bdb0356cdda727ac032470f6a0a2102d1281 at the time of writing). Here is a reproducer, as a Dockerfile (build with `docker build --platform linux/amd64 .`), and the backtrace:

```Dockerfile
# syntax=docker/dockerfile:1
FROM debian:experimental
ARG LLVM_VERSION=20

ENV DEBUGINFOD_URLS="https://debuginfod.debian.net"
RUN cat <<'EOF' > /etc/apt/sources.list.d/debug.list
deb http://deb.debian.org/debian-debug/ experimental-debug main
EOF

RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt update && DEBIAN_FRONTEND=noninteractive apt upgrade -y && \
    DEBIAN_FRONTEND=noninteractive apt-get --no-install-recommends install -y \
    clang-$LLVM_VERSION clang-$LLVM_VERSION-dbgsym \
    clang-tools-$LLVM_VERSION clang-tools-$LLVM_VERSION-dbgsym \
    lld-$LLVM_VERSION lld-$LLVM_VERSION-dbgsym \
    llvm-$LLVM_VERSION llvm-$LLVM_VERSION-dbgsym \
    lldb-$LLVM_VERSION lldb-$LLVM_VERSION-dbgsym \
    make gdb
ADD --keep-git-dir --link https://github.com/ocaml/ocaml.git /root/ocaml
WORKDIR /root/ocaml

ENV LLVM_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-$LLVM_VERSION

RUN clang-cl-20 -nologo -EP -TC runtime/caml/domain_state.tbl > runtime/domain_state.inc

# llvm-ml-20 -m64 dislikes parentheses on macro calls
RUN sed -e 's/(//g' -e 's/)//g' -i runtime/domain_state.inc

# llvm-ml-20 doesn't understand NEAR
RUN sed -E -e 's/(EXTRN.*):.*NEAR/\1:PROC/g' -i runtime/amd64nt.asm

RUN llvm-ml-20 -m64 -nologo -Iruntime -c -Foruntime/amd64nt.obj runtime/amd64nt.asm
```

```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.  Program arguments: llvm-ml-20 -m64 -nologo -Iruntime -c -Foruntime/amd64nt.obj runtime/amd64nt.asm
 #0 0x77fa117a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) build-llvm/tools/clang/stage2-bins/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x77f9ed14 llvm::sys::RunSignalHandlers() build-llvm/tools/clang/stage2-bins/llvm/lib/Support/Signals.cpp:106:18
 #2 0x77fa182b SignalHandler build-llvm/tools/clang/stage2-bins/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x76ac0da0 (/lib/x86_64-linux-gnu/libc.so.6+0x3fda0)
 #4 0x798923a2 checkForValidSection build-llvm/tools/clang/stage2-bins/llvm/lib/MC/MCParser/MasmParser.cpp:1457:31
 #5 0x79895133 parseStatement build-llvm/tools/clang/stage2-bins/llvm/lib/MC/MCParser/MasmParser.cpp:0:7
 #6 0x7988d2a5 Run build-llvm/tools/clang/stage2-bins/llvm/lib/MC/MCParser/MasmParser.cpp:0:0
 #7 0xd0f0 AssembleInput build-llvm/tools/clang/stage2-bins/llvm/tools/llvm-ml/llvm-ml.cpp:186:13
 #8 0xbc9a llvm_ml_main build-llvm/tools/clang/stage2-bins/llvm/tools/llvm-ml/llvm-ml.cpp:0:11
 #9 0xe45a main build-llvm/tools/clang/stage2-bins/build-llvm/tools/clang/stage2-bins/tools/llvm-ml/llvm-ml-driver.cpp:17:10
#10 0x76aaad68 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#11 0x76aaae25 call_init ./csu/../csu/libc-start.c:128:20
#12 0x76aaae25 __libc_start_main ./csu/../csu/libc-start.c:347:5
#13 0x9d71 (/usr/lib/llvm-20/bin/llvm-ml+0x5d71)
Segmentation fault
```

https://github.com/llvm/llvm-project/blob/628976c8345e235d4f71a0715f1990ad8b5bbcf7/llvm/lib/MC/MCParser/MasmParser.cpp#L1456-L1463

Presumably `getStreamer()` returns a `nullptr`.

It's possibly similar to #97635, I'll ping the participants: @sivan-shani @MaskRay.
Thanks for any help!



[llvm-bugs] [Bug 123195] PointerIntPair.h:172:17: error: static assertion failed due to requirement

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123195




Summary

PointerIntPair.h:172:17: error: static assertion failed due to requirement




  Labels
  
build-problem,
mlir
  



  Assignees
  
  



  Reporter
  
  sylvestre
  




Recent regression on linux:

https://llvm-jenkins.debian.net/job/llvm-toolchain-bookworm-binaries/architecture=i386,distribution=bookworm,label=i386/1114/consoleFull

```
In file included from /build/source/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp:9:
In file included from /build/source/mlir/include/mlir/Bytecode/BytecodeWriter.h:16:
In file included from /build/source/mlir/include/mlir/IR/AsmState.h:18:
In file included from /build/source/mlir/include/mlir/IR/OperationSupport.h:17:
In file included from /build/source/mlir/include/mlir/IR/Attributes.h:12:
In file included from /build/source/mlir/include/mlir/IR/AttributeSupport.h:17:
In file included from /build/source/mlir/include/mlir/IR/StorageUniquerSupport.h:21:
In file included from /build/source/llvm/include/llvm/ADT/FunctionExtras.h:35:
/build/source/llvm/include/llvm/ADT/PointerIntPair.h:172:17: error: static assertion failed due to requirement '3U <= PointerUnionUIntTraits::NumLowBitsAvailable': PointerIntPair with integer size too large for pointer
  172 |   static_assert(IntBits <= PtrTraits::NumLowBitsAvailable,
  | ^
/build/source/llvm/include/llvm/ADT/PointerIntPair.h:111:13: note: in instantiation of template class 'llvm::PointerIntPairInfo>' requested here
  111 | Value = Info::updateInt(Info::updatePointer(0, PtrVal),
```



[llvm-bugs] [Bug 123198] False positive in bugprone-string-constructor

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123198




Summary

False positive in bugprone-string-constructor




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  JVApen
  





#include <string>

void f(std::string str)
{
  // Find the substring "FAMILY:" (copied from old code so still using C-style Char pointers)
  const char *ptr = str.c_str();
  std::string copy(ptr, 0, str.size()/2);
}

On Compiler Explorer: https://compiler-explorer.com/z/1o86onvGG

The result:

[:7:19: warning: constructor creating an empty string [bugprone-string-constructor]]
   7 |   std::string copy(ptr, 0, str.size()/2);
  |   ^
1 warning generated.


In this example, the following constructor of std::string should be called:

template< class StringViewLike >
basic_string( const StringViewLike& t,
  size_type pos, size_type count,
 const Allocator& alloc = Allocator() );


Since we provide a valid pointer and a valid non-0 count, the string isn't empty by default.
As such, the warning should not be given.




[llvm-bugs] [Bug 123203] [Clang][OpenCL] Compiler crash on __builtin_assume_aligned in OpenCL

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123203




Summary

[Clang][OpenCL] Compiler crash on __builtin_assume_aligned in OpenCL




  Labels
  
clang,
OpenCL,
crash
  



  Assignees
  
  



  Reporter
  
  ritter-x2a
  




Using the return value of `__builtin_assume_aligned` in OpenCL hits an assertion in clang. I don't have a strong opinion on whether the builtin should be supported in OpenCL since it's not part of the Khronos spec, but it shouldn't hit an assertion.
Observed with a RelWithDebInfo trunk build, on Ubuntu 22.04.

Reproducer:
```c
void f(__global int *g) {
  __global int *ag = __builtin_assume_aligned(g, 16);
}
```
When compiling this via `clang -c test.cl`, clang first reports unexpected diagnostics (I don't see how `bool`s are involved) and then hits an assertion:
```
test.cl:2:17: error: incompatible integer to pointer conversion initializing '__global int *__private' with an expression of type 'bool' [-Wint-conversion]
2 | __global int *ag = __builtin_assume_aligned(g, 16);
  | ^ ~~
clang: /home/faritter/playground/llvm/llvm-project/llvm/lib/IR/Instructions.cpp:2974: static llvm::CastInst* llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, const llvm::Twine&, llvm::InsertPosition): Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: ./build/bin/clang -c test.cl
1.	 parser at end of file
2.	test.cl:1:6: LLVM IR generation of declaration 'f'
3.	test.cl:1:6: Generating code for declaration 'f'
 #0 0x56e464acc7ff llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/faritter/playground/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:802:3
[...]
#13 0x56e4636a9852 llvm::IRBuilderBase::CreateCast(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::MDNode*, llvm::FMFSource) /home/faritter/playground/llvm/llvm-project/llvm/include/llvm/IR/IRBuilder.h:2193:0
#14 0x56e4636a9852 llvm::IRBuilderBase::CreateIntCast(llvm::Value*, llvm::Type*, bool, llvm::Twine const&) /home/faritter/playground/llvm/llvm-project/llvm/include/llvm/IR/IRBuilder.h:2231:0
#15 0x56e464e2f677 (anonymous namespace)::ScalarExprEmitter::VisitCastExpr(clang::CastExpr*) /home/faritter/playground/llvm/llvm-project/clang/lib/CodeGen/CGExprScalar.cpp:2574:44
#16 0x56e464e2c104 Visit /home/faritter/playground/llvm/llvm-project/clang/lib/CodeGen/CGExprScalar.cpp:449:3
#17 0x56e464e2c104 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) /home/faritter/playground/llvm/llvm-project/clang/lib/CodeGen/CGExprScalar.cpp:5591:13
[...]
```

The full backtrace, preprocessed source, and run script are attached:

[backtrace.txt](https://github.com/user-attachments/files/18440429/backtrace.txt)

[test-ff43fa.cl.txt](https://github.com/user-attachments/files/18440433/test-ff43fa.cl.txt)

[test-ff43fa.sh.txt](https://github.com/user-attachments/files/18440434/test-ff43fa.sh.txt)



[llvm-bugs] [Bug 123224] [libc++] regression: new/delete symbol overrides broken on macOS

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123224




Summary

[libc++] regression: new/delete symbol overrides broken on macOS




  Labels
  
libc++
  



  Assignees
  
  



  Reporter
  
  tycho
  




This is referring to commit https://github.com/llvm/llvm-project/commit/841895543edcf98bd16027c6b85fe7c6419a4566.

In a shared library which statically links libc++ (ANGLE's libEGL in this case), the symbols for `new` and `new[]` are, as of the above commit, suddenly exposed as global, but the corresponding `delete` and `delete[]` operators are not.

Before the above commit:
```
$ nm -g -C --defined-only contrib/angle/angle/out/macOS-Debug-arm64/libEGL.dylib | grep -e new -e delete
```

After:
```
$ nm -g -C --defined-only contrib/angle/angle/out/macOS-Debug-arm64/libEGL.dylib | grep -e new -e delete
001e9460 T operator new[](unsigned long)
001e9740 T operator new[](unsigned long, std::align_val_t)
001e9328 T operator new(unsigned long)
001e95e0 T operator new(unsigned long, std::align_val_t)
```

This causes applications like mine with custom allocators (mimalloc in this case) to provide the implementations for the operator `new` symbols, but not the `delete` symbols, which inevitably causes a crash within the shared library when it tries to free memory.
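The mismatch described above can be spotted mechanically in the `nm` output. The following is a minimal sketch with a hypothetical helper (`count_allocation_exports` is not part of any LLVM tool); it assumes demangled `nm -g -C --defined-only` output in the shape shown in this report.

```python
import re


def count_allocation_exports(nm_output):
    """Count exported `operator new` / `operator delete` definitions.

    Both the scalar and the array forms (`new[]`, `delete[]`) are counted
    under the same key.
    """
    counts = {"new": 0, "delete": 0}
    for line in nm_output.splitlines():
        match = re.search(r"\boperator (new|delete)(\[\])?\(", line)
        if match:
            counts[match.group(1)] += 1
    return counts


def has_unbalanced_exports(nm_output):
    """True if only one side of the new/delete pair is exported."""
    counts = count_allocation_exports(nm_output)
    return (counts["new"] == 0) != (counts["delete"] == 0)
```

Fed the four "After" lines from this report, the helper counts four `new` exports and zero `delete` exports, so `has_unbalanced_exports` returns `True`. On the empty "Before" output it returns `False`.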



[llvm-bugs] [Bug 123212] [AMDGPU][GISel] Missing (or not running) combine for `sra workitem.id.xx, 31`

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123212




Summary

[AMDGPU][GISel] Missing (or not running) combine for `sra workitem.id.xx, 31`




  Labels
  
  



  Assignees
  
  



  Reporter
  
  qcolombet
  




In the AMDGPU backend, GISel ends up with additional instructions because we are missing some simplification that could take advantage of the range of the `workitem.id.xx` values.

I am somewhat surprised because I see that the AMDGPU backend implements the `TargetLowering::computeKnownBitsForTargetInstr` method and has some logic to propagate the known bits for these intrinsics.
Bottom line: I haven't dug into why the simplification doesn't happen, so this may be an easy fix.

Anyhow, the issue at hand is that `sra workitem.id.xx, 31` could be simplified to `srl workitem.id.xx, 31` (the sign bit is known to be zero) and then further simplified to a plain `0`.
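The fold can be sanity-checked by enumerating the possible workitem IDs. The sketch below assumes IDs fit in 10 bits, a bound taken from the `v_and_b32 ... 0x3ff` mask in the assembly later in this report. `ashr32` and `fold_is_valid` are illustrative names, not LLVM APIs.

```python
def ashr32(x, amount):
    """Arithmetic shift right on a 32-bit pattern, returned as a 32-bit pattern."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> amount) & 0xFFFFFFFF


def fold_is_valid(id_bits=10):
    """Check `ashr x, 31 == 0` and `(ashr x, 31) xor x == x` for every ID."""
    for x in range(1 << id_bits):
        lobit = ashr32(x, 31)
        if lobit != 0 or (lobit ^ x) != x:
            return False
    return True
```

Since no value below 1024 has its sign bit set, `fold_is_valid()` holds, which is why the `sra`/`xor` pair is redundant here.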

# To Reproduce #

Download the attached reproducer or copy/paste the LLVM IR at the end of this issue.
[repro.ll.txt](https://github.com/user-attachments/files/18441382/repro.ll.txt)
Then run:
```bash
llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-amdhsa -global-isel=<0|1> reduced.ll -o -
```

# Result #

With GISel we have a `sra` and `xor` in the final assembly, whereas they could be eliminated.

With GISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_and_b32_e32 v2, 0x3ff, v31
	v_ashrrev_i32_e32 v3, 31, v2
	v_xor_b32_e32 v2, v3, v2
	flat_store_dword v[0:1], v2
	s_waitcnt vmcnt(0) lgkmcnt(0)
	s_setpc_b64 s[30:31]
```

With SDISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_and_b32_e32 v2, 0x3ff, v31
	flat_store_dword v[0:1], v2
	s_waitcnt vmcnt(0) lgkmcnt(0)
	s_setpc_b64 s[30:31]
```

# Note #
Input LLVM IR:
```llvm
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"

declare noundef i32 @llvm.amdgcn.workgroup.id.x()

define dso_local void @foo.bb.split(ptr %out) {
newFuncRoot:
  %i = tail call i32 @llvm.amdgcn.workitem.id.x()
  %.lobit = ashr i32 %i, 31
  %i32 = xor i32 %.lobit, %i
  store i32 %i32, ptr %out
  ret void
}
```



[llvm-bugs] [Bug 123208] AMDGPU silently converts incorrect physical register asm constraint to virtual register

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123208




Summary

AMDGPU silently converts incorrect physical register asm constraint to virtual register




  Labels
  
backend:AMDGPU,
accepts-invalid
  



  Assignees
  
  



  Reporter
  
  arsenm
  




```
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 < %s

define void @invalid_sgpr(<2 x i32> inreg %arg0) {
  call void asm sideeffect "; use $0", "{s[1:2]}"(<2 x i32> %arg0)
  ret void
}

```

s[1:2] is not a valid SGPR reference as 64-bit SGPRs require even alignment. This is silently accepted, and appears to be treated as a virtual register constraint. In -stop-after=finalize-isel, I see:

```
%10:sreg_64 = COPY %11
 INLINEASM &"; use $0", 1 /* sideeffect attdialect */, 3997705 /* reguse:SReg_64 */, %10
```
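The alignment rule stated above (a 64-bit SGPR pair must start at an even register) is simple enough to sketch as a check a diagnostic could perform. This is an illustration only: the parser and function names are hypothetical, and only the 64-bit case from this report is modelled.

```python
import re


def parse_sgpr_range(constraint):
    """Parse a physical-register constraint like '{s[1:2]}' into (lo, hi)."""
    match = re.fullmatch(r"\{s\[(\d+):(\d+)\]\}", constraint)
    if match is None:
        raise ValueError(f"not an SGPR range constraint: {constraint!r}")
    lo, hi = int(match.group(1)), int(match.group(2))
    if hi < lo:
        raise ValueError(f"reversed register range: {constraint!r}")
    return lo, hi


def is_valid_sgpr64(constraint):
    """True if the constraint names two consecutive SGPRs starting at an even index."""
    lo, hi = parse_sgpr_range(constraint)
    return hi - lo + 1 == 2 and lo % 2 == 0
```

Under this model `{s[1:2]}` is rejected and `{s[2:3]}` is accepted. The report asks for the former to produce an error rather than silently falling back to a virtual register.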



[llvm-bugs] [Bug 123214] [flang] The Fortran test cases for hdf5-1.10.6 cannot be built with Flang

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123214




Summary

[flang] The Fortran test cases for hdf5-1.10.6 cannot be built with Flang




  Labels
  
flang
  



  Assignees
  
  



  Reporter
  
  pawosm-arm
  




Although the hdf5 library can be built and installed when tests are explicitly disabled (with `--disable-tests` passed to the `configure` script), this is not an optimal situation.

I'm configuring hdf5-1.10.6 as such:

```
$ CC=mpicc CXX=mpic++ FC=mpifort F77=mpifort F90=mpifort ./configure --enable-shared --enable-static --enable-parallel --disable-cxx --enable-fortran --enable-hl --prefix=/some/prefix
$ sed -i -e 's#wl=""#wl="-Wl,"#g' libtool
$ sed -i -e 's#pic_flag=""#pic_flag=" -fPIC -DPIC"#g' libtool
```
(the `sed` lines are needed to make it build shared libs; this is a known flang issue)

Unfortunately, this fails when building the Fortran tests, as follows:

```
mpifort  -I. -I../../../fortran/test -I../../src -I../../fortran/src -I../../fortran/src -I../../fortran/src  -c -o tH5T.o ../../../fortran/test/tH5T.F90

error: Semantic errors in ../../../fortran/test/tH5T.F90
./../../../fortran/test/tH5T.F90:283:6: error: No specific subroutine of generic 'h5dwrite_f' matches the actual arguments
   CALL h5dwrite_f(dset_id, dt4_id, real_member, data_dims, error, xfer_prp = plist_id)
 
./../../../fortran/test/tH5T.F90:541:6: error: No specific subroutine of generic 'h5dread_f' matches the actual arguments
   CALL h5dread_f(dset_id, dt4_id, real_member_out, data_dims, error)
 ^^
./../../../fortran/test/tH5T.F90:544:9: error: Cannot use intrinsic function 'verify' as a subroutine
  CALL VERIFY("h5dread_f:Wrong double precision data is read back", real_member_out(i), real_member(i), total_error)
 ^^
./../../../fortran/test/tH5T.F90:544:9: error: No specific subroutine of generic 'verify' matches the actual arguments
  CALL VERIFY("h5dread_f:Wrong double precision data is read back", real_member_out(i), real_member(i), total_error)
 ^^
```




[llvm-bugs] [Bug 123201] [LV][EVL] Support interleaved accesses for EVL tail folding.

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123201




Summary

[LV][EVL] Support interleaved accesses for EVL tail folding.




  Labels
  
vectorizers
  



  Assignees
  
Mel-Chen
  



  Reporter
  
  Mel-Chen
  




The motivation for this issue is to provide better support for RVV unit-strided segment load/store. 
The following scenarios need to be supported: 
* Interleaved load (vp.load + deinterleave)
* Interleaved load with tail gaps (requires a scalar epilogue to run the last iteration)
* Fully interleaved store (interleave + vp.store)
* Interleaved store with gaps (this cannot emit a unit-strided segment store; we can only emit a wide masked store for that)

Due to the high complexity of `VPInterleaveRecipe::execute()`, creating a new recipe or converting it into `VPWidenIntrinsicRecipe` does not seem like a wise approach.
A tentative approach I have in mind is to first split `VPInterleaveRecipe` into `VPWidenLoad + VPDeinterleave` and `VPInterleave + VPWidenStore`. During the EVL lowering phase, we would only need to transform `VPWidenLoad/VPWidenStore` into `VPWidenLoadEVL/VPWidenStoreEVL`.
For now, the focus will be on supporting factor 2 (`interleave2/deinterleave2`) as the initial target, with support for factors 3 to 8 planned after test results are stable.
Related IAP support: #120490 .
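To make the proposed split concrete, here is a small scalar model of a factor-2 interleaved access with an explicit vector length. It is a sketch only: the function names are invented, plain lists stand in for vectors, and it assumes EVL counts interleave groups, so the wide access touches `2 * evl` elements.

```python
def deinterleave2(wide):
    """Split [a0, b0, a1, b1, ...] into ([a0, a1, ...], [b0, b1, ...])."""
    return wide[0::2], wide[1::2]


def interleave2(first, second):
    """Merge two equal-length lists into [a0, b0, a1, b1, ...]."""
    wide = []
    for a, b in zip(first, second):
        wide.extend((a, b))
    return wide


def evl_interleaved_load(memory, base, evl):
    """Wide load of 2 * evl elements followed by a deinterleave."""
    wide = memory[base:base + 2 * evl]
    return deinterleave2(wide)


def evl_interleaved_store(memory, base, first, second, evl):
    """Interleave the first evl lanes of each member, then do one wide store."""
    wide = interleave2(first[:evl], second[:evl])
    memory[base:base + 2 * evl] = wide
```

In this model only the wide load and store depend on `evl`; the (de)interleave steps are pure shuffles. That is the property the tentative approach relies on when it lowers just `VPWidenLoad/VPWidenStore` to their EVL forms.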



[llvm-bugs] [Bug 123227] Clang static analyzer false positive suppression does not suppress an issue report

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123227




Summary

Clang static analyzer false positive suppression does not suppress an issue report




  Labels
  
clang,
false-positive
  



  Assignees
  
  



  Reporter
  
  SergeySatskiy
  




We use the clang static analyzer for our C++ code as part of a workflow. Sometimes there are false positives and I have trouble suppressing them.
Here is an example of the code:

```c++
template
inline
typename CParam::TValueType
CParam::Get(void) const
{
    if ( !m_ValueSet ) {
        // The lock prevents multiple initializations with the default value
        // in Get(), but does not prevent Set() from modifying the value
        // while another thread is reading it.
        CMutexGuard guard(s_GetLock());
        if ( !m_ValueSet ) {
            m_Value = GetThreadDefault();
            if (GetState() >= eState_Config) {
                // All sources checked or the value is set by user.
                m_ValueSet = true;
            }
        }
    }
    return m_Value;
}
```
An issue is reported for the
```return m_Value;```
line as follows: "Undefined or garbage value returned to caller".
The developer of the code investigated this case, and it seems the false positive arises because the multithreaded nature of the code was not taken into consideration. That is understandable, so I tried to suppress the report. Following the documentation, I tried multiple options (adding each before the ```return ...``` line):
- ```__attribute__((suppress))```
- ```[[clang::suppress]]```
- ```[[gsl::suppress("lifetime")]]```
- ```[[gsl::suppress("bounds")]]```
None of these options suppressed the report.

Am I doing something wrong, or is there an issue with the clang analyzer such that the suppress attribute is not taken into consideration?

Note: the code is compiled with ```-std=gnu++17``` option. I tried ```-std=c++17``` option as well with the same outcome.



[llvm-bugs] [Bug 123296] Clang crash when trying to evaluate a constexpr with `auto` type, variadic template before type is fully defined

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123296




Summary

Clang crash when trying to evaluate a constexpr with `auto` type, variadic template before type is fully defined




  Labels
  
clang
  



  Assignees
  
  



  Reporter
  
  bricknerb
  




Example:

```c++
struct MyClass {
  template  static constexpr auto foo() { return 1;}
 static constexpr auto my_value = foo();
};
```

We get an "Unexpected undeduced type!" crash.

Compiler Explorer: https://godbolt.org/z/7rhT7qEeE

I believe this should be an error that says the function `foo()` isn't defined yet because the class isn't fully defined, but not a crash.



[llvm-bugs] [Bug 123278] Missed optimization between `range` parameter metadata and `assume`s

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123278




Summary

Missed optimization between `range` parameter metadata and `assume`s




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  scottmcm
  




Now that `range` parameter metadata exists (🎉) I'm trying to remove some of our `assume`s that `rustc` outputs which should no longer be necessary.

In https://github.com/rust-lang/rust/blob/master/tests/codegen/transmute-optimized.rs that works for all of our tests but one.

We end up, after optimizations, still getting 
```llvm
define noundef range(i8 1, 4) i8 @ordering_transmute_onetwothree(i8 noundef returned range(i8 -1, 2) %x) unnamed_addr #2 {
start:
  %0 = icmp ne i8 %x, 0
  tail call void @llvm.assume(i1 %0)
  %1 = icmp ult i8 %x, 4
  tail call void @llvm.assume(i1 %1)
  ret i8 %x
}
```

That input range is `[-1, 2)` and those assumes are a range `[1, 4)`, so it ought to simplify to just `ret i8 1`, but it doesn't.
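The intersection argument can be spelled out by enumerating i8 bit patterns. This is a small sketch with a made-up helper, `wrapped_range`, that mirrors the half-open, possibly wrapping ranges used by the `range` attribute.

```python
def wrapped_range(lo, hi, bits=8):
    """Bit patterns in the half-open range [lo, hi), wrapping modulo 2**bits."""
    mask = (1 << bits) - 1
    lo &= mask
    hi &= mask
    values = set()
    v = lo
    while v != hi:
        values.add(v)
        v = (v + 1) & mask
    return values


def values_allowed_by_assumes():
    """Patterns satisfying `icmp ne x, 0` and `icmp ult x, 4`."""
    return {x for x in range(256) if x != 0 and x < 4}


def feasible_values():
    """Values of %x consistent with both the parameter range and the assumes."""
    return wrapped_range(-1, 2) & values_allowed_by_assumes()
```

The parameter range contributes the patterns 255, 0 and 1, the assumes contribute 1, 2 and 3, and the only common element is 1.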

Alive2 proof that it would be legal:  -- and legal even without the `range` on the return value.

Maybe this is somehow related to the `x uge 1` being turned into `x ne 0`, and thus it not noticing there's a range?  Or maybe it's something about the wrap-around?

SEO: rust transmute range bounds

---

As an aside, I'd love to emit these as `range` [assume operand bundles](https://llvm.org/docs/LangRef.html#assume-operand-bundles) instead of `icmp`s, but AFAICT those don't exist yet, so I'm stuck with the `icmp`s for now 🙁 




[llvm-bugs] [Bug 123291] Conversion of Affine Loops to GPU Dialect Fails with 'Invalid Dimension or Symbol Identifier'

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue

123291




Summary

Conversion of Affine Loops to GPU Dialect Fails with 'Invalid Dimension or Symbol Identifier'




  Labels
  
new issue
  



  Assignees
  
  



  Reporter
  
  lhw414
  




## Description

While attempting to convert an affine loop-based MLIR to the GPU dialect using the `convert-affine-for-to-gpu` pass, I encounter the following error:

```plaintext
sample.mlir:8:14: error: 'affine.load' op index must be a valid dimension or symbol identifier
%0 = affine.load %arg0[%arg1, %arg2] : memref<6x12xf32>
 ^
sample.mlir:8:14: note: see current operation: %8 = "affine.load"(%arg0, %7, %arg13) <{map = affine_map<(d0, d1) -> (d0, d1)>}> : (memref<6x12xf32>, index, index) -> f32
```

The mlir-opt command used to reproduce this issue is:

```bash
../build/bin/mlir-opt sample.mlir -o sample_output.mlir \
-pass-pipeline="builtin.module(func.func(convert-affine-for-to-gpu{gpu-block-dims=1 gpu-thread-dims=0}))"
```

Here is the original input MLIR:

```mlir
module {
  memref.global "private" @global_seed : memref = dense<0>
  func.func @main(%arg0: memref<6x12xf32>) -> memref<6x12xf32> {
    %cst = arith.constant 0.00e+00 : f32
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<6x12xf32>
    affine.for %arg1 = 0 to 6 {
      affine.for %arg2 = 0 to 12 {
        %0 = affine.load %arg0[%arg1, %arg2] : memref<6x12xf32>
      }
    }
    return %alloc : memref<6x12xf32>
  }
}
```

The resulting intermediate IR dump after the failure is as follows:

```mlir
// -// IR Dump After ConvertAffineForToGPU Failed (convert-affine-for-to-gpu) //-
"func.func"() <{function_type = (memref<6x12xf32>) -> memref<6x12xf32>, sym_name = "main"}> ({
^bb0(%arg0: memref<6x12xf32>):
 %0 = "arith.constant"() <{value = 0.00e+00 : f32}> : () -> f32
  %1 = "memref.alloc"() <{alignment = 64 : i64, operandSegmentSizes = array}> : () -> memref<6x12xf32>
  %2 = "arith.constant"() <{value = 0 : index}> : () -> index
  %3 = "arith.constant"() <{value = 6 : index}> : () -> index
  %4 = "arith.subi"(%3, %2) <{overflowFlags = #arith.overflow}> : (index, index) -> index
  %5 = "arith.constant"() <{value = 1 : index}> : () -> index
  %6 = "arith.constant"() <{value = 1 : index}> : () -> index
 "gpu.launch"(%4, %6, %6, %6, %6, %6) <{operandSegmentSizes = array}> ({
  ^bb0(%arg1: index, %arg2: index, %arg3: index, %arg4: index, %arg5: index, %arg6: index, %arg7: index, %arg8: index, %arg9: index, %arg10: index, %arg11: index, %arg12: index):
%7 = "arith.addi"(%2, %arg1) <{overflowFlags = #arith.overflow}> : (index, index) -> index
"affine.for"() <{lowerBoundMap = affine_map<() -> (0)>, operandSegmentSizes = array, step = 1 : index, upperBoundMap = affine_map<() -> (12)>}> ({
^bb0(%arg13: index):
 %8 = "affine.load"(%arg0, %7, %arg13) <{map = affine_map<(d0, d1) -> (d0, d1)>}> : (memref<6x12xf32>, index, index) -> f32
  "affine.yield"() : () -> ()
}) : () -> ()
"gpu.terminator"() : () -> ()
  }) {workgroup_attributions = 0 : i64} : (index, index, index, index, index, index) -> ()
  "func.return"(%1) : (memref<6x12xf32>) -> ()
}) : () -> ()
```

## Questions

1. Is there an issue with the original MLIR input?

- Are there any preconditions or required passes that I missed before applying convert-affine-for-to-gpu?

2. Could this be a problem in the convert-affine-for-to-gpu implementation?

- The error suggests that the indices for affine.load are not considered valid dimensions or symbols. However, %arg1 and %arg2 are induction variables of affine.for loops, which are typically valid.

3. What additional passes should be applied before convert-affine-for-to-gpu?

- For instance, should I run canonicalize, lower-affine, or similar passes to simplify the IR and ensure compatibility?

4. Are there any reference backends or examples for converting MLIR with linalg or affine dialects to the GPU dialect?

- I am particularly interested in examples or documentation that describe the process and highlight best practices for such conversions.




[llvm-bugs] [Bug 123246] [mlir] -remove-dead-values crashes on scf.if for empty region

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123246
Summary: [mlir] -remove-dead-values crashes on scf.if for empty region
Labels: mlir, crash
Assignees: (none)
Reporter: python3kgae




Reproduce with: `mlir-opt -remove-dead-values a.mlir`

a.mlir:
```
func.func @nested_if(%cond0: i1, %cond1: i1, %cond2: i1, %p: memref<1xf32>) {
  %cst = arith.constant 1.00e+00 : f32
  scf.if %cond0 {
  } else {
    scf.if %cond1 {
    } else {
      scf.if %cond2 {
        affine.store %cst, %p[0] : memref<1xf32>
      }
    }
  }
  return
}
```

It crashes in `cleanRegionBranchOp` when accessing `region.front()` or `region.back()` on an empty region.

This can be worked around by adding
```
if (region.empty())
 continue;
```
to all of these accesses.

There is an assert at one of the accesses:
```
assert(!region.empty() && "expected a non-empty region in an op "
  "implementing `RegionBranchOpInterface`");
```

But `scf.if` appears to be allowed to return an empty region from `getElseRegion`.
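
The shape of the workaround can be sketched outside of MLIR with stand-in types. `MockRegion` and `countFrontOps` below are hypothetical names that only mimic iterating over an op's regions; they are not MLIR's `Region` API and not the code in `cleanRegionBranchOp`. The point is only that the emptiness check has to come before any `front()` access.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a region: a list of blocks, each a list of op names.
struct MockRegion {
  std::vector<std::vector<std::string>> blocks;

  bool empty() const { return blocks.empty(); }

  const std::vector<std::string> &front() const {
    // Mirrors the kind of assert quoted above: front() needs a block.
    assert(!blocks.empty() && "front() called on an empty region");
    return blocks.front();
  }
};

// Counts the ops in the first block of every region, skipping empty regions
// instead of calling front() on them.
std::size_t countFrontOps(const std::vector<MockRegion> &regions) {
  std::size_t total = 0;
  for (const MockRegion &region : regions) {
    if (region.empty())
      continue; // e.g. an omitted else-region
    total += region.front().size();
  }
  return total;
}
```

Without the `continue`, the second region in a then/else pair with an omitted else would hit the assert in `front()`.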





[llvm-bugs] [Bug 123263] `@llvm.minimumnum.f32` returns sNaN instead of qNaN on x86_64

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123263
Summary: `@llvm.minimumnum.f32` returns sNaN instead of qNaN on x86_64
Labels: new issue
Assignees: (none)
Reporter: sunfishcode




LLVM's documentation for `@llvm.minimumnum.f32` [says](https://llvm.org/docs/LangRef.html#llvm-minimumnum-intrinsic) "If both operands are NaNs (including sNaN), returns qNaN". However, on x86_64, it actually returns sNaN.

Specifically, with this test.c:
```c
#include <stdio.h>
#include <string.h>

float f32_minnumber(float x, float y);

int main() {
    float f = __builtin_nansf("");
    float g = f32_minnumber(f, f);
    float h = g + 0;

    unsigned uf, ug, uh;
    memcpy(&uf, &f, sizeof(f));
    memcpy(&ug, &g, sizeof(f));
    memcpy(&uh, &h, sizeof(f));

    printf("%x\n%x\n%x\n", uf, ug, uh);
    return 0;
}
```
and this minnumber.ll:
```llvmir
target triple = "x86_64-pc-linux-gnu"

define float @f32_minnumber(float %x, float %y) {
  %t = call float @llvm.minimumnum.f32(float %x, float %y)
  ret float %t
}
define double @f64_minnumber(double %x, double %y) {
  %t = call double @llvm.minimumnum.f64(double %x, double %y)
  ret double %t
}
define float @f32_maxnumber(float %x, float %y) {
  %t = call float @llvm.maximumnum.f32(float %x, float %y)
  ret float %t
}
define double @f64_maxnumber(double %x, double %y) {
  %t = call double @llvm.maximumnum.f64(double %x, double %y)
  ret double %t
}
```

Compiling for x86_64 gets this output:
```console
$ clang test.c minnumber.ll 
$ ./a.out 
7fa00000
7fa00000
7fe00000
$ 
```

This shows that the operands of the `f32.minimumnum` are sNaN and the result is incorrectly also sNaN.
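
For reading the hex values, it helps to recall the binary32 layout: a NaN has all eight exponent bits set and a nonzero fraction, and on x86_64 the top fraction bit (bit 22) is set for a quiet NaN and clear for a signaling NaN. The helpers below are a hypothetical sketch for classifying a raw bit pattern; they are not part of the reproducer.

```cpp
#include <cstdint>

// True if the binary32 pattern encodes any NaN:
// exponent bits all ones and a nonzero fraction.
constexpr bool isNaNBits(std::uint32_t bits) {
  return (bits & 0x7f800000u) == 0x7f800000u && (bits & 0x007fffffu) != 0;
}

// Quiet NaN: NaN with the most significant fraction bit (bit 22) set.
constexpr bool isQuietNaNBits(std::uint32_t bits) {
  return isNaNBits(bits) && (bits & 0x00400000u) != 0;
}

// Signaling NaN: NaN with bit 22 clear.
constexpr bool isSignalingNaNBits(std::uint32_t bits) {
  return isNaNBits(bits) && (bits & 0x00400000u) == 0;
}
```

With these, `0x7fa00000` classifies as signaling and `0x7fe00000` as quiet, while `0x7f800000` (infinity) is not a NaN at all.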

IEEE 754-2019 says of its corresponding `minimumNumber` operation "If both operands are NaNs, a quiet NaN is returned".

I have not tested similar variants for f64, maximumnum, or other architectures.




[llvm-bugs] [Bug 123239] Unnecessarily large constant created from reordering add and shift

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123239
Summary: Unnecessarily large constant created from reordering add and shift
Labels: new issue
Assignees: (none)
Reporter: dzaima




https://godbolt.org/z/xoKf6bnTb

The code:
```c
#include <stdint.h>
#include <stdbool.h>
bool foo(uint64_t x) {
  uint16_t tag = x>>48;
  return tag>=0b0010 && tag<=0b0100;
}
```
with `-O3` as of clang 19 (and still in trunk) compiles to:
```asm
foo:
        movabs  rax, 3940649673949184
        add     rax, rdi
        shr     rax, 48
        cmp     eax, 3
        setb    al
        ret
```
whereas 18.0 did this, which is strictly better (i.e. is the exact same set of instructions, just in a different order and without movabs):
```asm
foo:
        shr     rdi, 48
        add     edi, -65522
        cmp     edi, 3
        setb    al
        ret
```
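
The claim that the two listings differ only in instruction order can be checked with a small model. This is a sketch, with hypothetical function names: each function mirrors one listing using unsigned wrap-around arithmetic. The only numeric fact relied on is that the `movabs` immediate equals `14 << 48`, so adding it before the shift has the same effect on the top 16 bits as adding 14 (modulo 2^16) after the shift.

```cpp
#include <cstdint>

// The 64-bit immediate from the first listing is 14 shifted into the top 16 bits.
static_assert(3940649673949184ull == (14ull << 48),
              "movabs immediate is 14 << 48");

// Models: movabs; add rax, rdi; shr rax, 48; cmp eax, 3; setb al
inline bool addThenShift(std::uint64_t x) {
  const std::uint64_t sum = x + 3940649673949184ull; // wraps modulo 2^64
  return static_cast<std::uint32_t>(sum >> 48) < 3u;
}

// Models: shr rdi, 48; add edi, -65522; cmp edi, 3; setb al
inline bool shiftThenAdd(std::uint64_t x) {
  const std::uint32_t tag = static_cast<std::uint32_t>(x >> 48);
  // -65522 as a 32-bit value is 0xFFFF000E; the add wraps modulo 2^32.
  return tag + static_cast<std::uint32_t>(-65522) < 3u;
}
```

Iterating over all 65536 values of the top 16 bits and comparing the two functions is enough to confirm they agree, since the low 48 bits never carry into the result.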



[llvm-bugs] [Bug 123249] [clang] Compiler crash with "echo 'a; b() { __atomic_test_and_set(a, b); }' | ./clang -cc1 -emit-llvm -o -"

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123249
Summary: [clang] Compiler crash with "echo 'a; b() { __atomic_test_and_set(a, b); }' | ./clang -cc1 -emit-llvm -o -"
Labels: clang
Assignees: (none)
Reporter: thurstond




Using clang built from today's source:
```
commit a98df676140c9b3e44f6e094df40d49f53e9a89c (HEAD -> main, upstream/main, upstream/HEAD)
Date:   Thu Jan 16 14:00:42 2025 -0800
```

and running this command:
```
$ echo 'a; b() { __atomic_test_and_set(a, b); }' | ./clang -cc1 -emit-llvm -o -
```

crashes the compiler:
```
<stdin>:1:1: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
    1 | a; b() { __atomic_test_and_set(a, b); }
      | ^
      | int
<stdin>:1:4: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
    1 | a; b() { __atomic_test_and_set(a, b); }
      |    ^
      |    int
<stdin>:1:32: error: incompatible integer to pointer conversion passing 'int' to parameter of type 'volatile void *' [-Wint-conversion]
    1 | a; b() { __atomic_test_and_set(a, b); }
      |                                ^
<stdin>:1:35: error: incompatible pointer to integer conversion passing 'int ()' to parameter of type 'int' [-Wint-conversion]
    1 | a; b() { __atomic_test_and_set(a, b); }
      |                                   ^
clang: /usr/local/google/home/thurston/llvm-projectG/clang/include/clang/AST/Type.h:8810: const T *clang::Type::castAs() const [T = clang::PointerType]: Assertion `isa<T>(CanonicalType)' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /usr/local/google/home/thurston/llvm-projectG/build/bin/clang -cc1 -emit-llvm -o -
1.	<eof> parser at end of file
2.	<stdin>:1:4: LLVM IR generation of declaration 'b'
3.	<stdin>:1:4: Generating code for declaration 'b'
 #0 0x5608f980d9a1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/thurston/llvm-projectG/llvm/lib/Support/Unix/Signals.inc:798:11
 #1 0x5608f980de9b PrintStackTraceSignalHandler(void*) /usr/local/google/home/thurston/llvm-projectG/llvm/lib/Support/Unix/Signals.inc:874:1
 #2 0x5608f980be96 llvm::sys::RunSignalHandlers() /usr/local/google/home/thurston/llvm-projectG/llvm/lib/Support/Signals.cpp:105:5
 #3 0x5608f980e635 SignalHandler(int) /usr/local/google/home/thurston/llvm-projectG/llvm/lib/Support/Unix/Signals.inc:415:1
 #4 0x7f0859056590 (/lib/x86_64-linux-gnu/libc.so.6+0x3f590)
 #5 0x7f08590a53ac __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x7f08590564f2 raise ./signal/../sysdeps/posix/raise.c:27:6
 #7 0x7f085903f4ed abort ./stdlib/abort.c:81:7
 #8 0x7f085903f415 _nl_load_domain ./intl/loadmsgcat.c:1177:9
 #9 0x7f085904f012 (/lib/x86_64-linux-gnu/libc.so.6+0x38012)
#10 0x5608f9df5bb3 clang::PointerType const* clang::Type::castAs() const /usr/local/google/home/thurston/llvm-projectG/clang/include/clang/AST/Type.h:0:3
#11 0x5608fa372b7f clang::CodeGen::CodeGenFunction::EmitBuiltinExpr(clang::GlobalDecl, unsigned int, clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGBuiltin.cpp:5135:16
#12 0x5608f9de34d2 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot, llvm::CallBase**) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExpr.cpp:5607:12
#13 0x5608f9e97918 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExprScalar.cpp:627:36
#14 0x5608f9e8e871 clang::StmtVisitorBase::Visit(clang::Stmt*) /usr/local/google/home/thurston/llvm-projectG/build/tools/clang/include/clang/AST/StmtNodes.inc:614:1
#15 0x5608f9e83435 (anonymous namespace)::ScalarExprEmitter::Visit(clang::Expr*) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExprScalar.cpp:448:52
#16 0x5608f9e8326a clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExprScalar.cpp:5590:3
#17 0x5608f9dc03e9 clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExpr.cpp:242:24
#18 0x5608f9dc0299 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGExpr.cpp:217:5
#19 0x5608f9fc7540 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef) /usr/local/google/home/thurston/llvm-projectG/clang/lib/CodeGen/CGStmt.cpp:129:5
#20 0x5608f9fd1901 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWitho
```

[llvm-bugs] [Bug 123248] [RISCV64] ld.lld: error: relaxation not converged

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123248
Summary: [RISCV64] ld.lld: error: relaxation not converged
Labels: lld
Assignees: (none)
Reporter: appujee




Steps to reproduce:

```
clone aosp-main-with-phones
$ source build/envsetup.sh
$ lunch aosp_cf_riscv64_phone-trunk_staging-userdebug 
$ m m net_test_stack

FAILED: out/soong/.intermediates/packages/modules/Bluetooth/system/stack/net_test_stack/android_riscv64_cfi/unstripped/net_test_stack64
prebuilts/clang/host/linux-x86/clang-r536225/bin/clang++ out/soong/.intermediates/bionic/libc/crtbegin_dynamic/android_riscv64/crtbegin_dynamic.o @out/soong/.intermediates/packages/modules/Bluetooth/system/stack/net_test_stack/android_riscv64_cfi/unstripped/net_test_stack64.rsp out/soong/.intermediates/bionic/libc/crtend_android/android_riscv64/crtend_android.o -o out/soong/.intermediates/packages/modules/Bluetooth/system/stack/net_test_stack/android_riscv64_cfi/unstripped/net_test_stack64 -target riscv64-linux-android1 -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--build-id=md5 -Wl,--fatal-warnings -Wl,--no-undefined-version -Wl,--exclude-libs,libgcc.a -Wl,--exclude-libs,libgcc_stripped.a -Wl,--exclude-libs,libunwind_llvm.a -Wl,--exclude-libs,libunwind.a -fuse-ld=lld -Wl,--icf=safe -Wl,--no-demangle -Wl,--compress-debug-sections=zstd -Wl,--pack-dyn-relocs=android+relr -Wl,--no-undefined -march=rv64gcv_zba_zbb_zbs -Wl,-mllvm -Wl,-jump-is-expensive=false -Wl,-z,max-page-size=4096   -pie -nostdlib -Bdynamic -Wl,--gc-sections -Wl,-z,nocopyreloc -Wl,-rpath,\$ORIGIN -flto -fsanitize-cfi-cross-dso -fsanitize=cfi -Wl,-plugin-opt,O1 -fsanitize=bounds,cfi -fno-sanitize-link-runtime -Wl,--exclude-libs=libclang_rt.builtins-riscv64-android.a -Wl,--exclude-libs=libclang_rt.ubsan_minimal-riscv64-android.a -Wl,-dynamic-linker,/system/bin/linker64
```



[llvm-bugs] [Bug 123241] [Clang] Missing AddressSpaceCast on CXX PointerToMember global.

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123241
Summary: [Clang] Missing AddressSpaceCast on CXX PointerToMember global.
Labels: clang
Assignees: (none)
Reporter: jhuber6




The following code crashes when run on an NVIDIA or AMD GPU due to a missing address space cast (https://godbolt.org/z/3vx6avrT6):
```c++
struct S {
  int x;
};

[[clang::loader_uninitialized]] S [[clang::address_space(3)]] s;

int &lookup(int S::*in) {
  return s.*in;
}
```

The generated IR accesses the global `s` but does not emit an address space cast to the generic address space. The cast is not emitted because it is missing from the AST, where it would normally be applied prior to the `ReturnStmt`.



[llvm-bugs] [Bug 123231] Erroneous "use of infinity" warning

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123231
Summary: Erroneous "use of infinity" warning
Labels: clang:diagnostics
Assignees: (none)
Reporter: ahatanak




Clang incorrectly emits a warning when a function named `infinity` is called.

$ cat test.cpp
```
double infinity() { return 0; }

int main() {
   return infinity();
}
```

$ clang++ -ffast-math test.cpp -c
test.cpp:4:11: warning: use of infinity is undefined behavior due to the currently enabled floating-point options [-Wnan-infinity-disabled]
    4 |    return infinity();



[llvm-bugs] [Bug 123262] [clang++][aarch64] help optimize __builtin_mul_overflow performance

2025-01-16 Thread LLVM Bugs via llvm-bugs



Issue: 123262
Summary: [clang++][aarch64] help optimize __builtin_mul_overflow performance
Labels: clang
Assignees: (none)
Reporter: eric-yq




Hi team, I have sample code that runs about 10 times slower when compiled with clang++ than with g++.
The main performance issue is in `__builtin_mul_overflow` under clang++.
Can you give some suggestions? I do not want to use both g++ and clang++ in my CI/CD pipeline.

Compile commands and `Time taken` comparison: `( 0.22 seconds vs. 0.02 seconds. )`
```c
# Ubuntu 24.04, g++ 13.3 and clang++ 18.1.3
# Server: AWS c7g.xlarge (AWS Graviton3, Neoverse-V1)

# g++ -std=c++17 -O3 -march=armv8-a+crc testint.cpp -o testint-g++
# ./testint-g++ 
Time taken for 1000 iterations: 0.0208047 seconds
Sum of results: 9747553088193654009

# clang++ -std=c++17 -O3 -march=armv8-a+crc testint.cpp -o testint-clang++ --rtlib=compiler-rt
# ./testint-clang++ 
Time taken for 1000 iterations: 0.226598 seconds  ( 0.22 seconds vs. 0.02 seconds. )
Sum of results: 18269431752893742105
```

Sample code: testint.cpp
```cpp
#include <chrono>
#include <cstdint>
#include <iostream>
#include <limits>
#include <random>
#include <utility>
#include <vector>
// Define a 128-bit integer type (if the compiler supports it)
using int128_t = __int128;
// Function being benchmarked
inline bool int128_mul_overflow(int128_t a, int128_t b, volatile int128_t* c) {
    return __builtin_mul_overflow(a, b, c);
}
// Generate a random 128-bit integer
int128_t generate_random_int128() {
    static std::mt19937_64 rng(std::random_device{}());
    std::uniform_int_distribution<uint64_t> dist(0, std::numeric_limits<uint64_t>::max());
    // Generate two 64-bit integers and combine them into one 128-bit integer
    int128_t high = static_cast<int128_t>(dist(rng));
    int128_t low = static_cast<int128_t>(dist(rng));
    return (high << 64) | low;
}
// Generate random data and store it in a vector
std::vector<std::pair<int128_t, int128_t>> generate_random_data(int count) {
    std::vector<std::pair<int128_t, int128_t>> data;
    data.reserve(count);
    for (int i = 0; i < count; ++i) {
        int128_t a = generate_random_int128();
        int128_t b = generate_random_int128();
        data.emplace_back(a, b);
    }
    return data;
}
// Benchmark function
void benchmark_int128_mul_overflow(const std::vector<std::pair<int128_t, int128_t>>& data) {
    int128_t c = 0;
    int128_t sum = 0; // accumulates the results
    auto start = std::chrono::high_resolution_clock::now();
    for (const auto& pair : data) {
        if (int128_mul_overflow(pair.first, pair.second, &c)) {
            sum += c; // accumulate the result to prevent it being optimized away
        }
    }
    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> duration = end - start;
    std::cout << "Time taken for " << data.size() << " iterations: " << duration.count() << " seconds\n";
    std::cout << "Sum of results: " << static_cast<uint64_t>(sum) << "\n"; // print the accumulated result
}
int main() {
    int iterations = 1000; // adjust the iteration count as needed
    auto data = generate_random_data(iterations);
    benchmark_int128_mul_overflow(data);
    return 0;
}
```
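
For readers unfamiliar with the builtin, here is a minimal sketch of its contract on a smaller type. It assumes a GCC- or Clang-compatible compiler; `mulOverflows` is a hypothetical wrapper name, and 64-bit operands are used only to keep the example small. The builtin returns `true` when the mathematically exact product does not fit the result type, and stores the wrapped (truncated) product through the pointer in either case. That is why the benchmark above only adds `c` to `sum` on iterations where overflow was reported.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical wrapper around the GCC/Clang builtin.
// Returns true if a * b overflows int64_t; *out receives the wrapped product.
inline bool mulOverflows(std::int64_t a, std::int64_t b, std::int64_t *out) {
  return __builtin_mul_overflow(a, b, out);
}
```

For instance, `mulOverflows(3, 4, &r)` returns `false` and leaves `r == 12`, while multiplying `std::numeric_limits<std::int64_t>::max()` by 2 returns `true` and leaves the wrapped value `-2` in `r`.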


