[llvm-bugs] [Bug 122872] Add `vpavgb` and `vpavgw` patterns
Issue 122872 Summary Add `vpavgb` and `vpavgw` patterns Labels new issue Assignees Reporter Validark [Godbolt link](https://zig.godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:3,lang:zig,selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:'export+fn+foo(a:+@Vector(64,+u8),+b:+@Vector(64,+u8))+@Vector(64,+u8)+%7B%0Areturn+(a+%2B+b+%2B+@as(@TypeOf(a),+@splat(1)))+%3E%3E+@splat(1)%3B%0A%7D%0A'),l:'5',n:'0',o:'Zig+source+%233',t:'0')),header:(),k:53.18930041152264,l:'4',m:100,n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:ztrunk,filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'0',trim:'1',verboseDemangling:'0'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:1,lang:zig,libs:!(),options:'-O+ReleaseFast+-target+x86_64-linux+-mcpu%3Dznver5',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:3),l:'5',n:'0',o:'+zig+trunk+(Editor+%233)',t:'0')),header:(),k:46.81069958847738,l:'4',m:100,n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4) ```zig export fn foo(a: @Vector(64, u8), b: @Vector(64, u8)) @Vector(64, u8) { return (a + b + @as(@TypeOf(a), @splat(1))) >> @splat(1); } ``` Gives: ```asm .LCPI0_1: .byte 0 .byte 128 .byte 64 .byte 32 .byte 16 .byte 8 .byte 4 .byte 2 foo: vpaddb zmm0, zmm0, zmm1 vpternlogd zmm1, zmm1, zmm1, 255 vpsubb zmm0, zmm0, zmm1 vgf2p8affineqb zmm0, zmm0, qword ptr [rip + .LCPI0_1]{1to8}, 0 ret ``` Could probably give the `vpavgb` instruction? ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
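Regarding Bug 122872, a scalar sketch of the per-byte rounding average that `vpavgb` computes, next to the same expression evaluated in wrapping 8-bit arithmetic. The helper names are hypothetical. The sketch assumes the narrow add may be treated as non-wrapping (as Zig's plain `+` is in ReleaseFast); only under that assumption do the two forms agree for every input the program is allowed to see.

```cpp
#include <cassert>
#include <cstdint>

// Rounding average with the sum formed in a wider type, so the carry out
// of bit 7 survives until the shift. This is the PAVGB per-lane definition.
static std::uint8_t avg_wide(std::uint8_t a, std::uint8_t b) {
    unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b) + 1u;
    return static_cast<std::uint8_t>(sum >> 1);
}

// The same expression with the sum truncated to 8 bits before the shift.
static std::uint8_t avg_wrapping(std::uint8_t a, std::uint8_t b) {
    std::uint8_t sum = static_cast<std::uint8_t>(a + b + 1);
    return static_cast<std::uint8_t>(sum >> 1);
}

// Counts input pairs where the two forms disagree even though
// a + b + 1 still fits in 8 bits.
static int mismatches_without_overflow() {
    int bad = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            std::uint8_t x = static_cast<std::uint8_t>(a);
            std::uint8_t y = static_cast<std::uint8_t>(b);
            if (a + b + 1 <= 255 && avg_wide(x, y) != avg_wrapping(x, y))
                ++bad;
        }
    }
    return bad;
}
```

The two helpers differ once the sum wraps (for example at `a = b = 255`), so a pattern match to `vpavgb` would presumably need the no-unsigned-wrap information on both adds.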
[llvm-bugs] [Bug 122869] [MLIR] Bufferization behaving differently based on the position of extract_slice.
Issue 122869 Summary [MLIR] Bufferization behaving differently based on the position of extract_slice. Labels mlir:bufferization Assignees Reporter pashu123 **Input IR:** ``` func.func @test_one(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> { %c0 = arith.constant 0 : index %0 = tensor.empty() : tensor<64x64xf16> %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16> %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16> %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16> %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16> return %inserted_slice : tensor<1x64x1x64xf16> } func.func @test_two(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> { %c0 = arith.constant 0 : index %0 = tensor.empty() : tensor<64x64xf16> %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16> %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16> %2 = vector.transfer_write %1, %0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16> %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16> return %inserted_slice : tensor<1x64x1x64xf16> } ``` **Command**: `mlir-opt above.mlir -eliminate-empty-tensors -canonicalize` **Output** ``` module { func.func @test_one(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> { %c0 = arith.constant 0 : index %0 = tensor.empty() : tensor<64x64xf16> %1 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16> %2 = vector.transfer_write %1, %0[%c0, %c0] 
{in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16> %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16> %inserted_slice = tensor.insert_slice %2 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16> return %inserted_slice : tensor<1x64x1x64xf16> } func.func @test_two(%arg0: index, %arg1: vector<64x64xf32>, %arg2: tensor<2x4096x10x64xf16>) -> tensor<1x64x1x64xf16> { %c0 = arith.constant 0 : index %0 = arith.truncf %arg1 : vector<64x64xf32> to vector<64x64xf16> %extracted_slice = tensor.extract_slice %arg2[%arg0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<2x4096x10x64xf16> to tensor<1x64x1x64xf16> %extracted_slice_0 = tensor.extract_slice %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<1x64x1x64xf16> to tensor<64x64xf16> %1 = vector.transfer_write %0, %extracted_slice_0[%c0, %c0] {in_bounds = [true, true]} : vector<64x64xf16>, tensor<64x64xf16> %inserted_slice = tensor.insert_slice %1 into %extracted_slice[0, 0, 0, 0] [1, 64, 1, 64] [1, 1, 1, 1] : tensor<64x64xf16> into tensor<1x64x1x64xf16> return %inserted_slice : tensor<1x64x1x64xf16> } ``` The only difference between `test_one` and `test_two` is the placement of `tensor.extract_slice`. test_one doesn't get rid of the empty buffer, whereas test_two gets rid of the empty buffer and reuses the extracted slice. @matthias-springer Could you suggest what would be happening here? Do you know if it is the intended behaviour? ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122891] Running lowermodulelds multiple times needs to be thinlto only, breaking fortran at present
Issue 122891 Summary Running lowermodulelds multiple times needs to be thinlto only, breaking fortran at present Labels new issue Assignees arsenm, ergawy, JonChesterfield, Pierre-vh, jhuber6 Reporter JonChesterfield As of https://github.com/llvm/llvm-project/pull/85626 and https://github.com/llvm/llvm-project/pull/75333 the lowermodulelds codegen pass is run as part of LTO. That doesn't work - the pass was designed to run once as part of codegen where it can globally allocate variables to the LDS space. There is a narrow exemption carved out to unblock thinlto - provided the IR module is carved up into independent modules prior to codegen, no calls or references between them, running the allocator on each subgraph works ok and a second run during codegen over the entire module is a no-op. There's a partial check to notice when that invariant is breached - if the input IR has some allocated variables and some non-allocated variables, the pass aborts. That's the error message which @ergawy reported to me for Fortran. I think the pass should be added to the thinlto pipeline and removed from the full lto pipeline, and generally only run once during codegen except for the thinlto case. Arguments could be made that the pass should cope with being run multiple times on various bits of IR and spliced together. The problem with running on subgraphs is the reachable test can't be done - we don't know if a call to an external function accesses some visible LDS, or if a function can be called by an external kernel. A correct lowering then looks like a lot of table lookups and overallocation, at which point the user is likely to discover they've run out of LDS and/or have occupancy problems. Tagging Joseph as well since I think he moved the openmp pipeline to use LTO at some point, in which case that's also vulnerable to this pattern. I suspect we're getting by at present because O0 doesn't get much use and most compilation flows are straightforward, e.g. 
they don't run opt on bits of IR manually. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122892] cannot compile this l-value expression yet
Issue 122892 Summary cannot compile this l-value expression yet Labels new issue Assignees Reporter egorpugin

```
clang++ -v
clang version 20.0.0git (https://github.com/llvm/llvm-project c047a5b3f6e2295dd74f1e8f17f1a023150b246c)
Target: x86_64-pc-windows-msvc
Thread model: posix
```

`clang++ main.cpp -std=c++20`

```
struct A {
    void f() {
        l(r);
    }
    static void r() {}
    static auto l(auto &&f) {} // error
    //static void l(auto &&f) {} // ok
};

int main() {
    A k;
    k.f();
}
```
[llvm-bugs] [Bug 122929] [DependenceAnalysis] Missing dependency detection for instructions accessing same memory location on different iterations
Issue 122929 Summary [DependenceAnalysis] Missing dependency detection for instructions accessing same memory location on different iterations Labels new issue Assignees Reporter 1997alireza

Dependence analysis can check the dependency of an instruction with itself, but there is a bug: if an instruction accesses the same memory location in different loop iterations, DA does not detect the dependency. For example, the expression `A[i - j]` accesses the same memory location at `i = j = 0` and at `i = j = 1`, but the current analysis fails to detect this dependency, which leads to incorrect analysis in some optimization passes such as loop-interchange. Consider the following code:

```cpp
int main() {
  const int N = 21;
  int A[N] = {0};
  int *B = &A[10];
  for (int i = 1; i <= 10; i++)
    for (int j = 10; j >= 1; j--)
      B[i - j] = 2 * i - j;
  return 0;
}
```

This example passes the legality check in the loop interchange pass based on the information provided by DA, yet performing the interchange would give incorrect results. In the original code, the final value is stored in B[0] when `i = j = 10`, so B[0] is set to 10. After loop interchange, the last access to B[0] happens at `i = j = 1`, which incorrectly sets B[0] to 1.

The issue arises from incorrect reasoning in the depends() API. Two variables in depends() are involved. The first is `PossiblyLoopIndependent`, which is set to false if the src and dst cannot have a loop-independent dependency. The other is `AllEqual`, which is computed within the API and is set to true if the dependency vector elements are all `Equals`. In our case, when we analyse the dependency of the B[i-j] store instruction with itself, PossiblyLoopIndependent is set to false and AllEqual is true. According to [the API](https://github.com/llvm/llvm-project/blob/19557a4c8fab0dbfe9d9c53b99b7960ef211684e/llvm/lib/Analysis/DependenceAnalysis.cpp#L3970), if AllEqual == true and PossiblyLoopIndependent == false, DA concludes there is no dependency, which is incorrect and causes the bug.

Additionally, modifying a test case in DA exposes the same bug. In [this test case](https://github.com/llvm/llvm-project/blob/19557a4c8fab0dbfe9d9c53b99b7960ef211684e/llvm/test/Analysis/DependenceAnalysis/ExactRDIV.ll#L583), storing to `A[11*i - j]` has no dependency with itself. After changing the subscript to `A[10*i - j]`, the memory location `A[0]` is accessed in iterations `i = j = 0` and `i = 1, j = 10`. Still, DA does not detect any dependency.

In addition to this bug, it seems that the PossiblyLoopIndependent variable has lost its intended purpose. This flag is always passed as true whenever depends() is called, so we may want to reconsider the utility of this variable and possibly remove it from the function signature entirely. We can post a patch for that.
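Regarding Bug 122929, the miscompile described in the report can be checked directly by running the loop nest in both orders. This is a small sketch, with a hypothetical helper `final_b0` wrapped around the reporter's loop; the interchanged branch is written by hand to mimic what loop interchange would produce.

```cpp
#include <cassert>

// Runs the loop nest from the report and returns the final value of B[0].
// When `interchanged` is true, the j loop is the outer loop, mimicking the
// result of loop interchange.
static int final_b0(bool interchanged) {
    const int N = 21;
    int A[N] = {0};
    int *B = &A[10];  // i - j ranges over [-9, 9], so B[i - j] stays inside A
    if (!interchanged) {
        for (int i = 1; i <= 10; i++)
            for (int j = 10; j >= 1; j--)
                B[i - j] = 2 * i - j;
    } else {
        for (int j = 10; j >= 1; j--)
            for (int i = 1; i <= 10; i++)
                B[i - j] = 2 * i - j;
    }
    return B[0];
}
```

`final_b0(false)` returns 10 and `final_b0(true)` returns 1, matching the values given in the report, so the two orders are observably different and the store does depend on itself across iterations.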
[llvm-bugs] [Bug 122932] [HLSL] sizeof(bool) should always be 4
Issue 122932 Summary [HLSL] sizeof(bool) should always be 4 Labels new issue Assignees spall Reporter spall

Currently, HLSL functions that take or return a bool pass it as an i1, but bool should always be an i32. Relatedly, the i1 is extended to an i8 in the function body, which is also an error. https://hlsl.godbolt.org/z/Ksqsva9q3
[llvm-bugs] [Bug 122934] [TySan] possible false positive with memcpy()?
Issue 122934 Summary [TySan] possible false positive with memcpy()? Labels false-positive Assignees Reporter seanm I have a TySan report that I think/thought is a false positive. So I used creduce to try and minimize it. It reduced down to the following: ```c typedef struct { int a; int b; int c; int i; int d; int e; int f; int dim[8]; long g } h; int k, m = __builtin_object_size(&k, 0); h b; void j(); void n(void *o) { __builtin___memcpy_chk(&k, o, sizeof(0), m); k; // TySan complains here, line 17 } void calloc(); h *c = calloc; void l() { c->g; b = *c; j(0, b); } void j(int, void *p) { n(p + 64); } void main() { l(); } ``` TySan reports: ``` ==79752==ERROR: TypeSanitizer: type-aliasing-violation on address 0x000100c1c010 (pc 0x000100c168ec bp 0x00016f1eac30 sp 0x00016f1ea3b0 tid 32786625) READ of size 4 at 0x000100c1c010 with type int accesses an existing object of type long (in at offset 64) #0 0x000100c168e8 in n test-preprocessed.c:17 ``` Line 17 looks like it does nothing, but indeed if I comment it out, TySan no longer warns. This reduced code is pretty nonsensical of course, but in the real code, it's using memcpy() precisely to avoid strict aliasing violations. Isn't memcpy() supposed to be the kosher way to copy anything of any size/alignment to anything else? Also, and maybe I should make another ticket, but it's a shame the report says: - `(in at offset 64)` instead of: - `(in 'struct h' at offset 64, field 'g')` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
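Regarding Bug 122934, for readers unfamiliar with the idiom the reporter mentions: copying bytes with `memcpy` into an object of the desired type, instead of dereferencing a pointer cast to that type. A minimal sketch with a hypothetical helper `load_u32`; it is not the code from the reporter's project.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value out of arbitrary storage. The bytes are copied into
// a real std::uint32_t object, so no std::uint32_t lvalue ever aliases an
// object of another type, and the source does not have to be aligned.
static std::uint32_t load_u32(const void *src) {
    std::uint32_t out;
    std::memcpy(&out, src, sizeof out);
    return out;
}
```

Under the language rules this is a sanctioned way to reinterpret bytes, which is why a type-aliasing report on the value read back after the copy looks like a false positive.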
[llvm-bugs] [Bug 122937] llvm/lib/CodeGen/LiveRangeUtils.h:40: Iterators applied to wrong data structure ?
Issue 122937 Summary llvm/lib/CodeGen/LiveRangeUtils.h:40: Iterators applied to wrong data structure ? Labels llvm:codegen, code-quality Assignees Reporter dcb314

Static analyser cppcheck says:

llvm/lib/CodeGen/LiveRangeUtils.h:40:6: error: Same iterator is used with different containers 'LR' and 'LR.segments'. [iterators1]

Source code is

typename LiveRangeT::iterator J = LR.begin(), E = LR.end();
// ...
LR.segments.erase(J, E);

So J and E are iterators for LR, but get used on LR.segments. I am surprised this compiles.
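Regarding Bug 122937, a minimal sketch of how code of this shape can compile. It assumes the range type's `iterator` is an alias for its inner container's iterator and that `begin()`/`end()` simply forward to that container. `Range` below is a hypothetical stand-in, not the LLVM class, and `std::vector` stands in for whatever container `segments` really is.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a range type that wraps a container and
// re-exports the container's iterator type.
struct Range {
    using Segments = std::vector<int>;
    using iterator = Segments::iterator;

    Segments segments;

    iterator begin() { return segments.begin(); }
    iterator end() { return segments.end(); }
};

// Erases everything from position `from` onward and returns the new size.
static std::size_t erase_tail(Range &R, std::size_t from) {
    Range::iterator J = R.begin() + static_cast<std::ptrdiff_t>(from);
    Range::iterator E = R.end();
    R.segments.erase(J, E);  // J and E already point into R.segments
    return R.segments.size();
}
```

If the real class is laid out like this, `LR` and `LR.segments` are two names for the same sequence, and the cppcheck diagnostic would be a false positive.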
[llvm-bugs] [Bug 122905] [GISel] Missing combines to expose common subexpressions and eliminate them
Issue 122905 Summary [GISel] Missing combines to expose common subexpressions and eliminate them Labels llvm:globalisel Assignees Reporter qcolombet When running GISel on the given input IR, we end up generating essentially a one-to-one mapping with the input IR whereas most of the computation is just duplicated. SDISel performs a much better job by doing `instcombine`-like optimizations that allow it to simplify the IR and eventually exposes the CSE opportunity. This is not that surprising given that historically GISel has had a garbage in garbage out approach, but it may make sense to strengthen GISel combines for optimized builds. Note: I observed this with AMDGPU but I suspect it affects all backends. # To Reproduce # Download the attached IR or copy paste the snippet below and run ```bash llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-hmcsa -global-isel=<0|1> repro.ll -o isel.s ``` # Result # GISel ends up generating two `fdiv` instructions whereas SDISel is able to CSE the whole thing and produces just one. 
GISel: ```asm v_div_scale_f32 v1, s[0:1], v0, v0, 1.0 v_mov_b32_e32 v7, v2 v_mov_b32_e32 v2, v3 v_mov_b32_e32 v3, v4 v_rcp_f32_e32 v4, v1 v_div_scale_f32 v5, vcc, 1.0, v0, 1.0 v_fma_f32 v8, -v1, v4, 1.0 v_fmac_f32_e32 v4, v8, v4 v_mul_f32_e32 v8, v5, v4 v_fma_f32 v9, -v1, v8, v5 v_fmac_f32_e32 v8, v9, v4 v_fma_f32 v1, -v1, v8, v5 v_div_fmas_f32 v5, v1, v4, v8 v_div_fixup_f32 v5, v5, v0, 1.0 v_div_fmas_f32 v1, v1, v4, v8 v_div_fixup_f32 v0, v1, v0, 1.0 ``` SDISel: ```asm v_div_scale_f32 v1, s[0:1], v0, v0, 1.0 v_rcp_f32_e32 v6, v1 s_nop 0 v_fma_f32 v7, -v1, v6, 1.0 v_fmac_f32_e32 v6, v7, v6 v_div_scale_f32 v7, vcc, 1.0, v0, 1.0 v_mul_f32_e32 v8, v7, v6 v_fma_f32 v9, -v1, v8, v7 v_fmac_f32_e32 v8, v9, v6 v_fma_f32 v1, -v1, v8, v7 v_div_fmas_f32 v1, v1, v6, v8 v_div_fixup_f32 v0, v1, v0, 1.0 ``` # IR Snippet # ```llvm define void @foo(<1 x float> %in, ptr %out, ptr %out2) { %t1174 = insertvalue [4 x <1 x float>] zeroinitializer, <1 x float> %in, 0 %t1175 = insertvalue [4 x <1 x float>] %t1174, <1 x float> %in, 1 %t1176 = insertvalue [4 x <1 x float>] %t1175, <1 x float> %in, 2 %t1177 = insertvalue [4 x <1 x float>] %t1176, <1 x float> %in, 3 %t1178 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] zeroinitializer, [4 x <1 x float>] %t1177, 0, 0, 0, 0 %t1179 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1178, [4 x <1 x float>] %t1177, 0, 0, 1, 0 %t1180 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1179, [4 x <1 x float>] %t1177, 0, 0, 2, 0 %t1181 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1180, [4 x <1 x float>] %t1177, 0, 0, 3, 0 %t1182 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1181, [4 x <1 x float>] %t1177, 1, 0, 0, 0 %t1183 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1182, [4 x <1 x float>] %t1177, 1, 0, 1, 0 %t1184 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1183, [4 x <1 x float>] %t1177, 1, 0, 2, 0 %t1185 = insertvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1184, [4 x <1 x float>] %t1177, 1, 0, 3, 0 
%t1186 = extractvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1178, 0, 0, 0, 0, 0 %t1187 = fdiv <1 x float> splat (float 1.00e+00), %t1186 %t1188 = extractvalue [2 x [1 x [4 x [1 x [4 x <1 x float>] %t1178, 0, 0, 0, 0, 1 %t1189 = fdiv <1 x float> splat (float 1.00e+00), %t1188 store <1 x float> %t1187, ptr %out store <1 x float> %t1189, ptr %out2 ret void } ``` # Note # SDISel is essentially able to do the equivalent of: ```bash opt -passes=instcombine,early-cse -S repro.ll -o - ``` Which yields: ```llvm define void @foo(<1 x float> %in, ptr %out, ptr %out2) { %t1187 = fdiv <1 x float> splat (float 1.00e+00), %in store <1 x float> %t1187, ptr %out, align 4 store <1 x float> %t1187, ptr %out2, align 4 ret void } ``` [repro.ll.txt](https://github.com/user-attachments/files/18411275/repro.ll.txt) ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122913] [SCEV] Another SEGV/stack overflow in LoopGuards
Issue 122913 Summary [SCEV] Another SEGV/stack overflow in LoopGuards Labels llvm:SCEV, crash-on-valid Assignees juliannagele Reporter danilaml Similar to https://github.com/llvm/llvm-project/issues/120615. Looks like the fix wasn't a complete one. Here is an example: ```llvm target triple = "x86_64-unknown-linux-gnu" define ptr @f(i32 %0) { switch i32 0, label %bb4 [ i32 1, label %bb4 i32 2, label %bb4 i32 3, label %bb4 i32 4, label %bb1 i32 5, label %bb4 i32 6, label %bb4 ] bb: ; No predecessors! switch i32 0, label %bb4 [ i32 0, label %bb4 i32 1, label %bb1 ] bb1: ; preds = %bb2, %bb, %1 %2 = phi i32 [ %3, %bb2 ], [ 0, %bb ], [ 0, %1 ] switch i32 %0, label %bb3 [ i32 0, label %bb2 i32 1, label %bb2 i32 2, label %bb2 ] bb2: ; preds = %bb1, %bb1, %bb1 %3 = add i32 %2, 1 %4 = icmp ult i32 %0, 0 br i1 %4, label %bb1, label %bb4 bb3: ; preds = %bb1 unreachable bb4: ; preds = %bb2, %bb, %bb, %1, %1, %1, %1, %1, %1 ret ptr null } ``` Crashes with the same command line `opt -passes=nary-reassociate --scalar-evolution-use-expensive-range-sharpening` godbolt: https://godbolt.org/z/4d3jo8jTz ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122925] [libc++] Consider enabling fast hardening mode (or extensive) when optimizations are disabled
Issue 122925 Summary [libc++] Consider enabling fast hardening mode (or extensive) when optimizations are disabled Labels libc++, hardening Assignees ldionne Reporter ldionne We can take for granted that someone building with `-O0` is doing some kind of debug build and doesn't care too much about performance. So it would probably make sense to enable the fast hardening mode (at least) in that case. libstdc++ has done this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112808 ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122942] [mlir][presburger] "Isolated" local variables can produce crash or unexpected results in IntegerRelation/PresburgerRelation routines
Issue 122942 Summary [mlir][presburger] "Isolated" local variables can produce crash or unexpected results in IntegerRelation/PresburgerRelation routines Labels mlir:presburger Assignees christopherbate Reporter christopherbate

I've noticed an issue where "isolated" local variables can cause problems in the elimination of non-div locals (IntegerRelation -> PresburgerRelation) as well as in the set subtraction algorithm (`getSetDifference` in `PresburgerRelation.cpp`).

By "isolated" I mean that there is a local defined in the IntegerRelation system, but there may not be a constraint that links the local to other variables. This sounds contrived, but it can happen if a user constructs an IntegerRelation from an AffineMap where one of the domain variables is not used to calculate the range. Invoking `IntegerRelation::getRangeSet` would then turn all the domain variables into locals, resulting in a local which is not related to any other variable in the system.

If you then try to compute the "non-div local" representation, the number of disjuncts can become much larger than if the local had been eliminated (e.g. 2x for each unused local). Furthermore, I sporadically run into an unreachable here: https://github.com/llvm/llvm-project/blob/main/mlir/lib/Analysis/Presburger/Simplex.cpp#L1463. Will update this issue with a concise reproducer.

As a workaround, I added a routine to eliminate such locals whenever I construct an IntegerRelation. It doesn't look like any of the existing simplification routines in `IntegerRelation` know how to remove such locals.
[llvm-bugs] [Bug 122967] Request Commit Access For jadhbeika
Issue 122967 Summary Request Commit Access For jadhbeika Labels infra:commit-access-request Assignees Reporter jadhbeika

### Why are you requesting commit access?

I would like to implement the new version of the OpenMP "nowait" clause, which now takes an optional argument that is an expression of OpenMP logical type.
[llvm-bugs] [Bug 122970] [libc++] Provide an observe semantic in the hardening mode
Issue 122970 Summary [libc++] Provide an observe semantic in the hardening mode Labels libc++, hardening Assignees Reporter ldionne Many people would benefit from having a way to turn on hardening without crashing their application when they violate a precondition. Instead, they would want to log the failure and continue. That way, they can enable hardening and gradually fix issues that come up in production, and eventually flip the switch without endangering their stability in production. This is akin to the observe semantic in Contracts, so we need something like that eventually anyway. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122996] [MLIR][LLVM] Incorrect #llvm.constant_range import
Issue 122996 Summary [MLIR][LLVM] Incorrect #llvm.constant_range import Labels mlir Assignees Reporter Kuree See godbolt: https://godbolt.org/z/x34YarsrE `range(i32 0, -2147483648)` is imported as `#llvm.constant_range`, but `-2147483648` is internally stored as a 40-bit integer, causing the verifier to fail: https://github.com/llvm/llvm-project/blob/ebef44067bd0a2cd776b8baea39cffa7f602ce7b/mlir/lib/Dialect/LLVMIR/IR/LLVMAttrs.cpp#L290-L293 ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122974] [llvm-exegesis][RISCV] computeAliasingInstructions in SerialSnipperGenerate generates instructions that can't be assembled
Issue 122974 Summary [llvm-exegesis][RISCV] computeAliasingInstructions in SerialSnipperGenerate generates instructions that can't be assembled Labels backend:RISC-V, tools:llvm-exegesis Assignees Reporter topperc I tried to run through all RISC-V opcodes available on my SiFive P550 system using -opcode-index=-1. I got some crashes trying to assemble pseudo instructions. Should llvm-exegesis be filtering pseudos and custom insertion instructions in this function? CC: @boomanaiden154 @mshockwave ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 122978] [libc] Make malloc resistant to overflow
Issue 122978 Summary [libc] Make malloc resistant to overflow Labels libc Assignees mysterymath Reporter mysterymath

The malloc implementation in libc has only sporadically been careful to prevent overflow; it has not been systematically careful. It should be the case that no value provided to any surface area of the allocator (the allocation functions, `_end`, and `__llvm_libc_heap_limit`) can cause it to produce erroneous behavior due to overflow. Tests should be added for the various possible overflow corner cases, checks added to secure against this possibility, and any spurious checks removed.
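Regarding Bug 122978, a sketch of the kind of check the issue asks for, applied to one typical size computation (round up to an alignment, then add a header). The function `padded_size` and its parameters are hypothetical and do not come from the LLVM libc sources; the point is only that each addition is guarded before it is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rounds `size` up to a multiple of `align` (which is
// assumed to be a power of two) and adds `header` bytes. Returns false,
// leaving *out untouched, if any step would exceed SIZE_MAX.
static bool padded_size(std::size_t size, std::size_t align,
                        std::size_t header, std::size_t *out) {
    const std::size_t mask = align - 1;
    if (size > SIZE_MAX - mask)
        return false;  // size + mask would wrap
    const std::size_t rounded = (size + mask) & ~mask;
    if (rounded > SIZE_MAX - header)
        return false;  // rounded + header would wrap
    *out = rounded + header;
    return true;
}
```

A test for the corner cases mentioned in the issue would then feed values near `SIZE_MAX` and expect a clean failure instead of a small, wrapped result.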
[llvm-bugs] [Bug 123006] [bolt] Report error on .so instrumentation: unsupported CFI opcode
Issue 123006 Summary [bolt] Report error on .so instrumentation: unsupported CFI opcode Labels BOLT Assignees Reporter whousemyname

I want to use llvm-bolt to instrument a .so file on the Android platform, but it reported the following error:

`
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: 446a426436c0b7e457992981d3a1f2b4fda19992
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x1c0, offset 0x1c0
BOLT-WARNING: debug info will be stripped from the binary. Use -update-debug-sections to keep it.
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
unsupported CFI opcode
UNREACHABLE executed at D:\github-projects\llvm-project\bolt\lib\Core\BinaryFunction.cpp:2591!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Exception Code: 0x8003
unsupported CFI opcode
UNREACHABLE executed at D:\github-projects\llvm-project\bolt\lib\Core\BinaryFunction.cpp:2591!
`

Command to execute: ` \llvm-bolt.exe .\libzzz-debug.so -instrument -o .\libzzz-debug.so.instrumented`

What is the reason for this? Does it mean that BOLT cannot instrument .so files, or that the way my .so file was generated does not meet the requirements for instrumentation?
[llvm-bugs] [Bug 122971] wrong management of parameter substitution in a concept-id
Issue 122971 Summary wrong management of parameter substitution in a concept-id Labels new issue Assignees Reporter mrussoLuxoft In the code reported below: (https://godbolt.org/z/TYKhzqhEf) there are three couples of overloaded function templates, with different constraints. Let's say that in the couples of func1 and func2, the first overload is designed to work with std::set and the second with std::map. However, func1{}> leads to an error for Clang and gcc, because lambda expressions are not immediate context. This is correctly related to the following standard text (current draft): [expr.prim.req.general] - p5: "... can result in the formation of invalid types or expressions in the immediate context of its requirements ... In such cases, the requires-_expression_ evaluates to false; it does not cause the program to be ill-formed." [temp.constr.atomic] - p3: "To determine if an atomic constraint is satisfied, the parameter mapping and template arguments are first substituted into its _expression_. If substitution results in an invalid type or _expression_ in the immediate context of the atomic constraint, the constraint is not satisfied. ... " [temp.deduct.general] - p9: "When substituting into a lambda-_expression_, substitution into its body is not in the immediate context. ..." However, for func1b{}>, gcc and Clang behave differently. Indeed, Clang still considers the error in lambda _expression_ code, whereas gcc ignores it because this time the concept Cb does not use its parameters, relying on the following standard text: [temp.constr.normal] - p(1.4): "The normal form of a concept-id C is the normal form of the constraint-_expression_ of C, after substituting A1 , A2 , ..., An for C’s respective template parameters in the parameter mappings in each atomic constraint. ..." which means that the substitution leads to no error if a parameter is not used. 
I initially supposed this was an error for gcc, and posted a potential bug for gcc [here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118398), but that discussion led to understand that's a problem of Clang. gcc guys seemed to remember a known problem of Clang, but I tried to search with keywords 'concept', 'parameter', and others, and I could not find it. Hope I am not duplicating. ``` #include #include #include template concept C = requires(Container c, KeyExtractor&& keyExtractor){ c.lower_bound(keyExtractor(c.begin())); }; template concept Cb = true; template requires C void func1([[maybe_unused]] const Container& c){ std::cout << "func1 first overload\n"; } template requires Cfirst;})> void func1([[maybe_unused]] const Container& c){ std::cout << "func1 second overload\n"; } template requires Cb void func1b([[maybe_unused]] const Container& c){ std::cout << "func1b first overload\n"; } template requires Cbfirst;})> void func1b([[maybe_unused]] const Container& c){ std::cout << "func1b second overload\n"; } template requires requires(Container c){*c.begin();} && C void func2([[maybe_unused]] const Container& c) { std::cout << "func2 first overload\n"; } template requires requires(Container c){c.begin()->first;} && Cfirst;})> void func2([[maybe_unused]] const Container& c) { std::cout << "func2 second overload\n"; } int main(){ func1(std::set{}); // - gcc and clang manage lambda as non-immediate // context, so getting a compilation error. // - MVSC rejects second overload and selects the // first one, that is, it considers failed // constraints due to unfair "it->first" code // in the lambda _expression_. func1(std::map{}); // all compilers correctly select second overload. // This time, no reverse problem about "*it", // and then concept C fails for set iterators. func1b(std::set{}); // - gcc considers both overloads as eligible, // ignoring the part "it->first" for second overload, // because it is not used in the concept definition. 
// - clang consistently behaves instead as for func1. // - MVSC rejects instead second overload, exactly as // for func1. //func1b(std::map{}); // all compilers correctly consider ambiguous overloads. func2(std::set{}); // all compilers select first overload (i.e., gcc and clang // behaves differently about lambda type resolution, because // the result of the normal form for the constraints, is
[llvm-bugs] [Bug 122948] clang-format doesn't handle the spaceship "<=>" symbol correctly
Issue: 122948
Summary: clang-format doesn't handle the spaceship "<=>" symbol correctly
Labels: clang-format
Assignees:
Reporter: chriskot870

I was referred here to report a potential bug in clang-format. I am running Ubuntu 24.04. The version of clang-format is:

```
$ clang-format --version
Ubuntu clang-format version 18.1.3 (1ubuntu1)
```

I use this file:

```
#include
#include

int main() {
  int x = 5;
  int y = 6;
  if ((x <=> y) != 0) {
    printf("x is not equal too y\n");
  }
}
```

I can compile and run it.

```
$ g++ -std=c++23 spaceship_test.cpp
$ ./a.out
x is not equal too y
```

I then run clang-format and try to compile:

```
$ clang-format -i spaceship_test.cpp
$ g++ -std=c++23 spaceship_test.cpp
spaceship_test.cpp: In function ‘int main()’:
spaceship_test.cpp:11:13: error: expected primary-expression before ‘>’ token
   11 |   if ((x <= > y) != 0) {
      |             ^
```

Indeed, in the file the `<=>` has been changed to `<= >`. Notice the space after the `=`.

```
#include
#include

int main() {
  int x = 5;
  int y = 6;
  if ((x <= > y) != 0) {
    printf("x is not equal too y\n");
  }
}
```

Here is the clang-format file I am using: [clang-format.txt](https://github.com/user-attachments/files/18414723/clang-format.txt). I had to add a .txt extension in order for the tool to accept the .clang-format file.

I assume this is some sort of bug in clang-format.

Thanks,
Chris
[llvm-bugs] [Bug 122954] [TySan] False positive with global structs
Issue: 122954
Summary: [TySan] False positive with global structs
Labels: new issue
Assignees:
Reporter: TheLastRar

This was previously reported on Discourse; since TySan has now been merged, I'm reposting it on GitHub.

This is the reduced version from [gbMattN](https://discourse.llvm.org/t/reviving-typesanitizer-a-sanitizer-to-catch-type-based-aliasing-violations/66092/22); the original report used a `std::array`.

TySan reports a type violation when `arr` is in the global scope.

https://godbolt.org/z/TxeaaY5cc

```C
#include

struct array_type{
    int inner[1];
};

struct array_type arr;

int main() {
    arr.inner[0] = 5;
    return 0;
}
```

TySan currently reports

```
==1==ERROR: TypeSanitizer: type-aliasing-violation on address 0x5d14c0846c5c (pc 0x5d14bfeefb4b bp 0x7ffd3e06b7a0 sp 0x7ffd3e06b730 tid 1)
WRITE of size 4 at 0x5d14c0846c5c with type int accesses an existing object of type array_type
    #0 0x5d14bfeefb4a (/app/output.s+0x2ab4a)
```
[llvm-bugs] [Bug 123001] Request Commit Access For sebpop
Issue: 123001
Summary: Request Commit Access For sebpop
Labels: infra:commit-access-request
Assignees:
Reporter: sebpop

I am requesting commit access to be able to merge my own approved patches:

https://github.com/llvm/llvm-project/pull/116628
https://github.com/llvm/llvm-project/pull/116631

I am fixing the data dependence analysis and working on the loop optimizers.
[llvm-bugs] [Bug 123021] Failed to eliminate unreachable code
Issue: 123021
Summary: Failed to eliminate unreachable code
Labels: new issue
Assignees:
Reporter: CrazyboyQCD

C: https://godbolt.org/z/8Y5eTreoE
Rust: https://godbolt.org/z/vqv7MsnPK

After the early return for `x <= y`, there should be no codegen for the `else` branch.

```c
void f1(float x, float y, float* z) {
    if (x <= y) return;
    if (x > y)
        *z = 0.0;
    else
        *z = x - y;
}
```

```c
void f2(float x, float y, float* z) {
    if (x > y)
        *z = 0.0;
}
```
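One case worth checking before treating `f1` and `f2` as interchangeable is a NaN operand: under IEEE 754, both `x <= y` and `x > y` evaluate to false when either operand is NaN, so the `else` branch of `f1` can still be reached unless something like `-ffast-math` is in effect. A minimal sketch, with `f1` and `f2` copied from the report and a hypothetical helper `differs_on_nan` added purely for illustration:

```cpp
#include <cmath>
#include <limits>

void f1(float x, float y, float* z) {
    if (x <= y) return;
    if (x > y)
        *z = 0.0f;
    else
        *z = x - y;  // reached only when x or y is NaN
}

void f2(float x, float y, float* z) {
    if (x > y)
        *z = 0.0f;
}

// Hypothetical helper (not part of the report): returns true when f1 and f2
// leave different values in *z for a NaN first argument.
bool differs_on_nan() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    float z1 = 1.0f;
    float z2 = 1.0f;
    f1(nan, 2.0f, &z1);  // both comparisons are false, so the else branch stores NaN
    f2(nan, 2.0f, &z2);  // comparison is false, so z2 keeps its old value
    return std::isnan(z1) && z2 == 1.0f;
}
```

Compiled without fast-math style flags, `differs_on_nan()` is expected to return true, which would explain why the `else` branch is kept.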
[llvm-bugs] [Bug 122985] [clang-tidy] Check request: detect saving stack addresses beyond their lifetime
Issue: 122985
Summary: [clang-tidy] Check request: detect saving stack addresses beyond their lifetime
Labels: clang-tidy
Assignees:
Reporter: asund

This seems to be missed by the existing stack-address checks, because the address does not escape the function's scope but is preserved between calls in a static variable.

```
auto f() {
  process stack_array[] = { method1, method2, method3 };
  static process *process_to_use = nullptr;

  if (!process_to_use) {
    // some expensive init later...
    process_to_use = &stack_array[n];
  }

  if (process_to_use) {
    process_to_use->do_processing(); // segfault on later calls
  }
}
```

`process_to_use` has a stale value when the function is called again. `stack_array` needs to have static lifetime in this case.
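The fix suggested in the report (giving the array static lifetime) could look like the following sketch. `Process`, its `id` field, and the `n` parameter are hypothetical stand-ins, since the report does not define `process`, `method1`..`method3`, or `n`.

```cpp
#include <cstddef>

// Hypothetical stand-in for the report's `process` type.
struct Process {
    int id;
    int do_processing() const { return id; }
};

int f(std::size_t n) {
    // Static storage duration: the array outlives each call, so a pointer
    // into it stays valid when f() is called again.
    static const Process table[] = {{1}, {2}, {3}};
    static const Process* process_to_use = nullptr;

    if (!process_to_use) {
        // Stand-in for the expensive one-time selection.
        process_to_use = &table[n % 3];
    }
    return process_to_use->do_processing();
}
```

With this layout the cached pointer chosen on the first call is reused on every later call, so `f(1)` followed by `f(0)` runs the same entry both times.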
[llvm-bugs] [Bug 122959] Clang failed 100 tests on llvmorg-19.1.7
Issue: 122959
Summary: Clang failed 100 tests on llvmorg-19.1.7
Labels: clang
Assignees:
Reporter: leecommamichael

**The Windows binaries do not contain a target for wasm, so I decided to build it myself.**

This is a fresh install of Windows 11 using Visual Studio 2022. I compiled with CMake and Ninja as advised on https://clang.llvm.org/get_started.html

I ran `cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS=clang ..\llvm` then `ninja check` as advised on the "get started" page.

```
Testing Time: 893.79s

Total Discovered Tests: 98695
  Skipped          :    32 (0.03%)
  Unsupported      :  2898 (2.94%)
  Passed           : 95462 (96.72%)
  Expectedly Failed:   203 (0.21%)
  Failed           :   100 (0.10%)
```

This is on tag `llvmorg-19.1.7`, commit `cd708029e0b2869e80abe31ddb175f7c35361f90`.

Is this to be expected?
[llvm-bugs] [Bug 123023] Sanitizer test regressions with CLANG_CONFIG_FILE_SYSTEM_DIR building with GCC after #60394 fix
Issue: 123023
Summary: Sanitizer test regressions with CLANG_CONFIG_FILE_SYSTEM_DIR building with GCC after #60394 fix
Labels: new issue
Assignees:
Reporter: xry111

The fix for #60394 sets `CLANG_NO_DEFAULT_CONFIG=1` only when the build compiler supports `--no-default-config`, but when we build the LLVM suite with GCC, the build compiler obviously does not support this option. Thus, when clang is configured with `CLANG_CONFIG_FILE_SYSTEM_DIR` and the config file contains some compiler options, some tests start to fail:

```
Failed Tests (2):
  DataFlowSanitizer-x86_64 :: custom.cpp
  DataFlowSanitizer-x86_64 :: origin_unaligned_memtrans.c
```