[llvm-bugs] [Bug 120750] [libc++] Investigate the performance of lower_bound
Issue 120750 Summary [libc++] Investigate the performance of lower_bound Labels libc++, performance Assignees Reporter ldionne We seem to be doing consistently poorly. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 120763] [libc++] Investigate benchmark results for std::set_intersection
Issue 120763 Summary [libc++] Investigate benchmark results for std::set_intersection Labels libc++, performance Assignees Reporter ldionne We sometimes do really well and sometimes really badly. Also, the benchmarks are impossible to understand at a glance; that should be fixed.
[llvm-bugs] [Bug 120754] [libc++] Investigate the performance of std::map constructors and various operations
Issue 120754 Summary [libc++] Investigate the performance of std::map constructors and various operations Labels libc++, performance Assignees Reporter ldionne [benchmark.txt](https://github.com/user-attachments/files/18212597/benchmark.txt)
[llvm-bugs] [Bug 120756] [libc++] Optimize std::min / std::max / std::minmax for __int128
Issue 120756 Summary [libc++] Optimize std::min / std::max / std::minmax for __int128 Labels libc++, performance Assignees Reporter ldionne We seem to be missing a special case for `__int128` in these algorithms.
[llvm-bugs] [Bug 120757] [libc++] Ensure that benchmarks are run with user-defined types
Issue 120757 Summary [libc++] Ensure that benchmarks are run with user-defined types Labels libc++, performance Assignees Reporter ldionne We have a lot of benchmarks that run with integral types and other fundamental types, but we usually don't run benchmarks with user-defined types. That means that we don't have coverage for the "most general" versions of our algorithms.
[llvm-bugs] [Bug 120764] [VectorCombine] foldShuffleOfShuffles - failure to merge nested shuffles with common operands
Issue 120764 Summary [VectorCombine] foldShuffleOfShuffles - failure to merge nested shuffles with common operands Labels missed-optimization, llvm:transforms Assignees Reporter RKSimon

```ll
define <4 x double> @add_v4f64_u123(<4 x double> %a, <4 x double> %b) {
  %1 = shufflevector <4 x double> %b, <4 x double> %a, <4 x i32>
  %2 = shufflevector <4 x double> %b, <4 x double> %a, <4 x i32>
  %3 = shufflevector <4 x double> %1, <4 x double> %b, <4 x i32>
  %4 = shufflevector <4 x double> %2, <4 x double> %b, <4 x i32>
  %result = fadd <4 x double> %3, %4
  ret <4 x double> %result
}
```

All shuffles only have 2 operands, so we should be able to fold to:

```ll
define <4 x double> @add_v4f64_u123(<4 x double> %a, <4 x double> %b) {
  %1 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32>
  %2 = shufflevector <4 x double> %a, <4 x double> %b, <4 x i32>
  %result = fadd <4 x double> %1, %2
  ret <4 x double> %result
}
```

Pulled out of #34072
[llvm-bugs] [Bug 120765] [libc++] Refactor the std::variant benchmarks to be actually relevant but less exhaustive
Issue 120765 Summary [libc++] Refactor the std::variant benchmarks to be actually relevant but less exhaustive Labels libc++, performance Assignees Reporter ldionne
[llvm-bugs] [Bug 120772] [VectorCombine] foldInsExtVectorToShuffle can't handle length changing shuffles
Issue 120772 Summary [VectorCombine] foldInsExtVectorToShuffle can't handle length changing shuffles Labels new issue Assignees Reporter RKSimon

The VectorCombine::foldInsExtVectorToShuffle fold `insert (DstVec, (extract SrcVec, ExtIdx), InsIdx) --> shuffle (DstVec, SrcVec, Mask)` is limited to cases where DstVec and SrcVec have the same vector type, but it might still be cost-beneficial to fold these for non-matching types, assuming any shuffle narrowing/widening of SrcVec is cheap enough.

```ll
define <4 x double> @ins0_v4f64_ext1_v2f64(<4 x double> %a, <2 x double> %b) {
  %ext = extractelement <2 x double> %b, i32 1
  %ins = insertelement <4 x double> %a, double %ext, i32 0
  ret <4 x double> %ins
}

define <2 x double> @ins1_v2f64_ext1_v4f64(<2 x double> %a, <4 x double> %b) {
  %ext = extractelement <4 x double> %b, i32 1
  %ins = insertelement <2 x double> %a, double %ext, i32 1
  ret <2 x double> %ins
}
```

InstCombine will handle some 'easy' cases, but ideally we need VectorCombine to handle this more generally as a cost-driven combine. CC @ParkHanbum, who handled something similar recently for #115209
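To make the requested equivalence concrete, here is a scalar C++ model of the two forms. It is only a sketch: `insert_of_extract`, `shuffle_form` and the two check functions are hypothetical names, and the two-step decomposition (resize SrcVec first, then do a same-width shuffle) is an assumption about one way the fold could be expressed, not VectorCombine's actual implementation. Lanes that would be poison in IR are modelled as zero.

```cpp
#include <array>
#include <cstddef>

// Scalar model of insertelement(dst, extractelement(src, ext_idx), ins_idx).
template <std::size_t DstN, std::size_t SrcN>
std::array<double, DstN> insert_of_extract(std::array<double, DstN> dst,
                                           const std::array<double, SrcN>& src,
                                           std::size_t ext_idx,
                                           std::size_t ins_idx) {
  dst[ins_idx] = src[ext_idx];
  return dst;
}

// The same result expressed as two shuffles (an assumed decomposition):
//  1. a length-changing shuffle that resizes src to DstN lanes and places the
//     extracted lane at position ins_idx (all other lanes are left as zero),
//  2. a same-width two-operand shuffle of dst and the resized vector.
template <std::size_t DstN, std::size_t SrcN>
std::array<double, DstN> shuffle_form(const std::array<double, DstN>& dst,
                                      const std::array<double, SrcN>& src,
                                      std::size_t ext_idx,
                                      std::size_t ins_idx) {
  std::array<double, DstN> resized{};
  resized[ins_idx] = src[ext_idx];

  // Mask values in [0, DstN) pick from dst; values in [DstN, 2*DstN) pick
  // from the resized source.
  std::array<std::size_t, DstN> mask{};
  for (std::size_t i = 0; i < DstN; ++i) mask[i] = i;
  mask[ins_idx] = DstN + ins_idx;

  std::array<double, DstN> out{};
  for (std::size_t i = 0; i < DstN; ++i)
    out[i] = mask[i] < DstN ? dst[mask[i]] : resized[mask[i] - DstN];
  return out;
}

// Mirrors @ins0_v4f64_ext1_v2f64: the source is narrower than the destination.
bool widening_case_agrees() {
  const std::array<double, 4> a{1.0, 2.0, 3.0, 4.0};
  const std::array<double, 2> b{10.0, 20.0};
  return insert_of_extract(a, b, 1, 0) == shuffle_form(a, b, 1, 0);
}

// Mirrors @ins1_v2f64_ext1_v4f64: the source is wider than the destination.
bool narrowing_case_agrees() {
  const std::array<double, 2> a{1.0, 2.0};
  const std::array<double, 4> b{10.0, 20.0, 30.0, 40.0};
  return insert_of_extract(a, b, 1, 1) == shuffle_form(a, b, 1, 1);
}
```

Whether the extra resizing shuffle pays off is exactly the cost question the report raises; the model only shows that the two forms compute the same lanes.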
[llvm-bugs] [Bug 120758] [libc++] MinMaxElement benchmark is running on a lot of unnecessary patterns
Issue 120758 Summary [libc++] MinMaxElement benchmark is running on a lot of unnecessary patterns Labels libc++, performance Assignees Reporter ldionne The benchmark is being run on PipeOrgan, QuickSortAdversary, etc. patterns. That's not useful for that algorithm; we should rewrite that benchmark.
[llvm-bugs] [Bug 120759] Add support for `/Zc:enumTypes` compiler flag in MSVC mode
Issue 120759 Summary Add support for `/Zc:enumTypes` compiler flag in MSVC mode Labels new issue Assignees Reporter Saalvage MSVC used to have a [bug](https://developercommunity.visualstudio.com/t/underlying-type-of-an-unscoped-enum/524018) where enums always had the implicit underlying type of `int`, independently of the enum's items. This bug is [accommodated](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDecl.cpp#L17268-L17275) within clang's MSVC compatibility mode, while the fix for it (the `/Zc:enumTypes` flag) is not. The problem is that information is lost for the enum's items (as they are trimmed to fit the underlying type), so fixing this on the application side would likely require reparsing the items. I believe this might be a good fit for an option in `LangOptions.def`?
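As a small illustration of what is at stake, the following sketch declares an unscoped enum whose enumerator does not fit in a 32-bit `int`. The enum `Wide` and both helper functions are hypothetical. The checks assume a target with a 32-bit `int` and standard C++ deduction of the underlying type; under the MSVC-compatible behavior described above, the underlying type would stay `int` and the large value would be trimmed, so both checks would be expected to fail there.

```cpp
#include <type_traits>

// Hypothetical enum for illustration; kLarge needs more than 32 bits.
enum Wide { kSmall = 1, kLarge = 0x100000000 };

// With standard rules, the implicit underlying type must be able to
// represent every enumerator, so it has to be wider than a 32-bit int.
constexpr bool underlying_type_is_wider_than_int() {
  return sizeof(std::underlying_type_t<Wide>) > sizeof(int);
}

// The enumerator keeps its full value instead of being trimmed.
constexpr bool large_enumerator_is_preserved() {
  return static_cast<unsigned long long>(kLarge) == 0x100000000ULL;
}
```

A tool that consumes the AST and needs the original enumerator values can use checks like these to detect which of the two behaviors it is observing.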
[llvm-bugs] [Bug 120762] [libc++] Investigate the performance of `std::pop_heap` on unsigned integers
Issue 120762 Summary [libc++] Investigate the performance of `std::pop_heap` on unsigned integers Labels libc++, performance Assignees Reporter ldionne We seem to be doing consistently not great (but that's not the case for other types).
[llvm-bugs] [Bug 120775] [VectorCombine] Handle shuffle of selects
Issue 120775 Summary [VectorCombine] Handle shuffle of selects Labels missed-optimization, llvm:transforms Assignees Reporter RKSimon VectorCombine can't currently fold `shuffle(select(c0,t0,f0), select(c1,t1,f1)) -> select(shuffle(c0,c1),shuffle(t0,t1),shuffle(f0,f1))`, although foldShuffleToIdentity can fold some very basic cases. This is trickier than foldShuffleOfBinops etc., as the default increase in instruction count will likely prevent a basic fold from succeeding (on cost:benefit grounds), so we will need to handle only cases where at least one of the folded inner shuffles gets simplified away.
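The identity behind the proposed fold can be checked lane by lane with a scalar model. This is a sketch under stated assumptions: `shuffle_lanes`, `select_lanes` and `fold_preserves_lanes` are hypothetical helpers over fixed 4-lane `std::array` values, not LLVM APIs, and the input values and mask are arbitrary.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4;
template <typename T> using Vec = std::array<T, N>;
using Mask = std::array<std::size_t, N>;

// Two-operand shuffle: mask values in [0, N) pick from a, [N, 2N) from b.
template <typename T>
Vec<T> shuffle_lanes(const Vec<T>& a, const Vec<T>& b, const Mask& mask) {
  Vec<T> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = mask[i] < N ? a[mask[i]] : b[mask[i] - N];
  return out;
}

// Lane-wise select: out[i] = c[i] ? t[i] : f[i].
Vec<int> select_lanes(const Vec<bool>& c, const Vec<int>& t, const Vec<int>& f) {
  Vec<int> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = c[i] ? t[i] : f[i];
  return out;
}

// Compares shuffle(select, select) with select(shuffle, shuffle, shuffle).
bool fold_preserves_lanes() {
  const Vec<bool> c0{true, false, true, false};
  const Vec<bool> c1{false, true, true, false};
  const Vec<int> t0{1, 2, 3, 4}, f0{-1, -2, -3, -4};
  const Vec<int> t1{5, 6, 7, 8}, f1{-5, -6, -7, -8};
  const Mask mask{0, 5, 2, 7};

  // Before the fold: 2 selects + 1 shuffle.
  const Vec<int> before = shuffle_lanes(select_lanes(c0, t0, f0),
                                        select_lanes(c1, t1, f1), mask);
  // After the fold: 3 shuffles + 1 select.
  const Vec<int> after = select_lanes(shuffle_lanes(c0, c1, mask),
                                      shuffle_lanes(t0, t1, mask),
                                      shuffle_lanes(f0, f1, mask));
  return before == after;
}
```

The comments in `fold_preserves_lanes` show the count going from three operations to four, which is the default increase the report mentions; the fold only wins when some of the new shuffles simplify away.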
[llvm-bugs] [Bug 120724] [flang][openmp] crash in complex atomic
Issue 120724 Summary [flang][openmp] crash in complex atomic Labels flang Assignees Reporter jeanPerier

```
complex c
integer :: i
c=(1.0,1.0)
call omp_set_num_threads(4)
!$omp parallel do
do i = 1, 1
  !$omp atomic
  c = c + (1.0,1.0)
end do
print *, c
end
```

Segfaults in __atomic_compare_exchange

```
#0 0x77fa46c7 in __atomic_compare_exchange () from /lib/x86_64-linux-gnu/libatomic.so.1
#1 0x5418 in _QQmain..omp_par () at repro.f90:7
#2 0x762c3239 in __kmp_invoke_microtask () from lib/libomp.so
#3 0x7624041f in __kmp_invoke_task_func () from lib/libomp.so
#4 0x7623f07e in __kmp_launch_thread () from lib/libomp.so
#5 0x762a2f18 in __kmp_launch_worker(void*) () from lib/libomp.so
#6 0x75e94ac3 in start_thread (arg=) at ./nptl/pthread_create.c:442
#7 0x75f26660 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
```

If I lower the loop iteration from 1 to 100, I do not see such a crash and the expected number is printed.
[llvm-bugs] [Bug 120722] [Clang] __attribute__((nodebug)) is silently dropped on `using` aliases in some locations
Issue 120722 Summary [Clang] __attribute__((nodebug)) is silently dropped on `using` aliases in some locations Labels clang:frontend Assignees Reporter philnik777

```c++
using type1 = __attribute__((nodebug)) int;
using type2 = int __attribute__((nodebug));
using type3 __attribute__((nodebug)) = int;
```

results in this AST:

```
TranslationUnitDecl
|-TypeAliasDecl col:7 type1 'int'
| `-BuiltinType 'int'
|-TypeAliasDecl col:7 type2 'int'
| `-BuiltinType 'int'
`-TypeAliasDecl col:7 type3 'int'
  |-BuiltinType 'int'
  `-NoDebugAttr
```

Note that `type1` and `type2` don't have the `NoDebugAttr`, but clang accepts the code above without any diagnostic. https://godbolt.org/z/Mqfe13jo1
[llvm-bugs] [Bug 120696] DSE removes store to an alloca that is passed to a byval parameter
Issue 120696 Summary DSE removes store to an alloca that is passed to a byval parameter Labels new issue Assignees Reporter rofirrim

The following IR, which I think is correct (but I admit I'm not 100% sure due to the constraints of `tail` calls though I think the `byval` makes it correct?)

```llvm
%_QMmooTtoken_t = type { [64 x i32] }

define void @_QMmooPsub(ptr nocapture %0) #0 {
  %2 = alloca %_QMmooTtoken_t, i64 1, align 8
  %3 = load %_QMmooTtoken_t, ptr %0, align 4
  store %_QMmooTtoken_t %3, ptr %2, align 4
  tail call void @foo(ptr %2)
  ret void
}

declare void @foo(ptr byval(%_QMmooTtoken_t) align 4) local_unnamed_addr
```

is simplified by DSE (`opt -passes=dse`) like so:

```llvm
%_QMmooTtoken_t = type { [64 x i32] }

define void @_QMmooPsub(ptr nocapture %0) {
  %2 = alloca %_QMmooTtoken_t, i64 1, align 8
  tail call void @foo(ptr %2)
  ret void
}

declare void @foo(ptr byval(%_QMmooTtoken_t) align 4) local_unnamed_addr
```

which I don't think preserves the original meaning because we're now passing a copy (`byval`) of an uninitialised value into `@foo` rather than the value pointed by `%0`.
[llvm-bugs] [Bug 120700] [PowerPC] Parameters are not stored in stack as per ABI
Issue 120700 Summary [PowerPC] Parameters are not stored in stack as per ABI Labels new issue Assignees Reporter IshwaraK

In the example below, the function parameters are not stored in the stack slot required by the ABI; they are stored in another area instead. The example is available at https://godbolt.org/z/Gfevrzh93

```
#include
int __attribute__((noinline)) foo(long* y) {
  long **x = &y - 6; // needs to be &y+2 in clang, but as per ABI that is not recorded
  printf("return address: %p\n", *(x+2));
  return **x;
}
/*
int __attribute__((noinline)) foo(long* y) {
  long **x = __builtin_frame_address(1);
  printf("return address: %p\n", *(x+2));
  return **x;
}
*/
int __attribute__((noinline)) bar() {
  long y = 200;
  return 1 + foo(&y);
}
int __attribute__((noinline)) xyz() {
  return 2 + bar();
}
int main() {
  return xyz();
}
```

The discrepancies are explained below.

```
// PowerPC gcc compiler at -O0
.L.foo(long*):
        mflr 0
        std 0,16(1)
        std 31,-8(1)
        stdu 1,-144(1)
        mr 31,1
        std 3,192(31)    <--- Right: Stored at parameter area of the caller
        addi 9,31,144    <--- Got the frame pointer
        std 9,112(31)
        ld 9,112(31)
        addi 9,9,16
        ld 9,0(9)        <--- got the return address
        mr 4,9
        addis 3,2,.LC0@toc@ha
        addi 3,3,.LC0@toc@l
        bl printf

// PowerPC clang compiler at -O0
.Lfunc_begin0:
        mflr 0
        std 31, -8(1)
        stdu 1, -144(1)
        std 0, 160(1)
        mr 31, 1
        std 3, 128(31)   <--- Wrong: Stored in SAVE reg area instead of caller parameter area
        addi 3, 31, 128
        addi 3, 3, -48
        std 3, 120(31)
        ld 3, 120(31)
        ld 4, 16(3)
        addis 3, 2, .L.str@toc@ha
        addi 3, 3, .L.str@toc@l
        bl printf
```
[llvm-bugs] [Bug 120709] [libc] Suspected strict aliasing violation in string_length_wide_read
Issue 120709 Summary [libc] Suspected strict aliasing violation in string_length_wide_read Labels libc Assignees Reporter Voultapher The following code in libc: https://github.com/llvm/llvm-project/blob/b41240be6b9e58687011b2bd1b942c6625cbb5ad/libc/src/string/string_utils.h#L65 invokes what I understand to be strict aliasing UB. I think block reads would have to go through `memcpy`/`__builtin_memcpy` which can then be optimized by the compiler as appropriate. Tagging @michaelrj-google.
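For illustration, a block read that goes through `memcpy` might look like the sketch below. This is not the libc implementation: `load_block`, `has_zero_byte` and `wide_strnlen` are hypothetical names, and the function takes an explicit buffer size so that every 8-byte read stays inside the buffer. It therefore only demonstrates the aliasing-safe load and says nothing about whether reading past the terminator, as an unbounded wide read does, is acceptable.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Loads 8 bytes through memcpy instead of dereferencing a casted pointer, so
// no object is accessed through an incompatible type. Compilers typically
// lower this fixed-size copy to a single load.
inline std::uint64_t load_block(const char* p) {
  std::uint64_t block;
  std::memcpy(&block, p, sizeof(block));
  return block;
}

// Classic bit trick: true if and only if some byte of block is zero.
constexpr bool has_zero_byte(std::uint64_t block) {
  constexpr std::uint64_t low = 0x0101010101010101ULL;
  constexpr std::uint64_t high = 0x8080808080808080ULL;
  return ((block - low) & ~block & high) != 0;
}

// Length of the string in src, looking at no more than max_len bytes.
std::size_t wide_strnlen(const char* src, std::size_t max_len) {
  std::size_t i = 0;
  // Block loop: read only while a whole block fits in [src, src + max_len).
  while (max_len - i >= sizeof(std::uint64_t)) {
    if (has_zero_byte(load_block(src + i))) break;
    i += sizeof(std::uint64_t);
  }
  // Byte loop: locate the terminator inside the last block or the tail.
  while (i < max_len && src[i] != '\0') ++i;
  return i;
}
```

For example, `wide_strnlen("hello, wide world", 18)` scans two full blocks and then finishes byte by byte, returning 17.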
[llvm-bugs] [Bug 120708] [flang][openmp] incorrect firstprivate initialization in omp task
Issue 120708 Summary [flang][openmp] incorrect firstprivate initialization in omp task Labels bug, flang:openmp Assignees Reporter jeanPerier

It looks like firstprivate initialization is not happening as expected in the program below: `i` in the `!$omp task` is not given the value of `i` when the task is created (it seems it is given the value of `i` when the task is launched, because it tends to be given `5`, for instance). I expect the "in task: i=..." lines of the program below to print 1, 2, 3 and 4 (in any order), but I see 5,5,5,5 (or sometimes 2,3,4,5).

    subroutine test()
      use omp_lib, only : omp_get_thread_num
      implicit none
      integer :: i
      !$omp parallel private(i)
      !$omp single
      do i = 1,4
        print '("outside: i= ",I0," thread_id= ",I0, " &i= "I0)', i, omp_get_thread_num(), loc(i)
        !$omp task firstprivate(i)
        print '("in task: i= ",I0," thread_id= ",I0, " &i= "I0)', i, omp_get_thread_num(), loc(i)
        !$omp end task
      end do
      !$omp end single
      !$omp end parallel
    end subroutine

      use omp_lib, only : omp_set_dynamic, omp_set_num_threads
      call omp_set_dynamic(.false.)
      call omp_set_num_threads(4)
      call test()
    end

All of gfortran, nvfortran and ifx print 1,2,3,4 here (in random order of course).
[llvm-bugs] [Bug 120812] GH58962 regression: Clang rejects &-qualified member function to be overloaded with const-qualified with mutual exclusive constraints
Issue 120812 Summary GH58962 regression: Clang rejects &-qualified member function to be overloaded with const-qualified with mutual exclusive constraints Labels clang Assignees Reporter zhihaoy

Worked in: Clang 17
Stopped working in: Clang 18

### Symptom
Clang 18.1.0 and trunk reject the following program in `-std=c++20`, but accept it in `-std=c++23`. Clang 17.1.0 accepts it in `-std=c++20`, as do GCC and MSVC.

```cpp
namespace GH58962_alt {
template struct type {
  void func() const requires(R == 0);
  void func() & requires(R == 1);
};
template concept test = requires(T v) { decltype(v)(v).func(); };
static_assert(test &>);
static_assert(test &&>);
static_assert(test &>);
static_assert(not test &&>);
static_assert(test const &>);
static_assert(test const &&>);
static_assert(not test const &>);
static_assert(not test const &&>);
} // namespace GH58962_alt
```

https://godbolt.org/z/vY63TE43K
[llvm-bugs] [Bug 120813] [Clang][OpenMP] Can't emit LLVM from a C file with OpenMP
Issue 120813 Summary [Clang][OpenMP] Can't emit LLVM from a C file with OpenMP Labels clang Assignees Reporter wesuRage

My test:

```bash
$ clang -fopenmp -S -emit-llvm a.c -o a.ll
a.c:1:10: fatal error: 'omp.h' file not found
    1 | #include
      |          ^~~
```

Searching for this header with the `find` command shows that it lives in the GCC include path, which is not compatible with clang. I've tried installing every possible variation of `libomp`, and it still doesn't work.
[llvm-bugs] [Bug 120782] perf2bolt crashes using Amazon Linux 2 on AArch64
Issue 120782 Summary perf2bolt crashes using Amazon Linux 2 on AArch64 Labels new issue Assignees Reporter salvatoredipietro

### Description
When attempting to use BOLT on Amazon Linux 2 (AL2) using an AArch64 instance (AWS m7g.4xlarge), the `perf2bolt` command fails with an assertion error.

### Environment
- Operating System: Amazon Linux 2 (kernel: 5.10.228-219.884.amzn2.aarch64)
- Hardware: AWS m7g.4xlarge instance (ARM-based)
- BOLT version: d33a2c58112bdd74225b0ff4f07acc49bed7e6ea
- Application used: PostgreSQL (built locally)

### Steps to Reproduce
1. Build BOLT on Amazon Linux 2
```bash
# BOLT Installation on AL2
sudo yum install -y gcc10 gcc10-*
export CC=`which gcc10-gcc`
export CXX=`which gcc10-g++`
export LD=`which gcc10-gcc`
export CMAKE_CXX_COMPILER=`which gcc10-g++`
export CMAKE_C_COMPILER=`which gcc10-gcc`
# CMAKE
curl -L "https://github.com/Kitware/CMake/releases/download/v3.31.2/cmake-3.31.2-linux-`uname -p`.tar.gz" | tar xzC /opt && mv /opt/cmake-3.31.2-linux-`uname -p` /opt/cmake3
source <(echo 'export PATH="/opt/cmake3/bin:$PATH"' | tee -a ~/.bashrc) && cmake --version
# NINJA
curl -L https://github.com/ninja-build/ninja/releases/download/v1.12.1/ninja-linux$([ `uname -m` == aarch64 ] && echo -`uname -m`).zip -o /opt/ninja-linux.zip && unzip /opt/ninja-linux.zip -d /opt
source <(echo 'export PATH="/opt/:$PATH"' | tee -a ~/.bashrc) && ninja --version
# BOLT
sudo yum install -y perf
git clone https://github.com/llvm/llvm-project.git
mkdir build && cd build
cmake -G Ninja ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD="X86;AArch64" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="bolt"
ninja bolt
```
2.
Run the following command: ```bash # Get perf information sudo perf record -e cycles:u -u postgres -o perf.data -a -- sleep 300 # Run perf2bolt cmd sudo ~/build/bin/perf2bolt -p perf.data -o perf.boltdata --nl /path/to/postgres ``` ### Error Message ```bash sudo ~/build/bin/perf2bolt -p perf.data -o perf.boltdata --nl /home/ec2-user/usr/bin/postgres PERF2BOLT: Starting data aggregation job for perf.data PERF2BOLT: spawning perf job to read events without LBR PERF2BOLT: spawning perf job to read mem events PERF2BOLT: spawning perf job to read process events PERF2BOLT: spawning perf job to read task events BOLT-INFO: Target architecture: aarch64 BOLT-INFO: BOLT version: d33a2c58112bdd74225b0ff4f07acc49bed7e6ea BOLT-INFO: first alloc address is 0x40 BOLT-INFO: creating new program header table at address 0xe0, offset 0xa0 BOLT-INFO: enabling relocation mode BOLT-INFO: enabling strict relocation mode for aggregation purposes BOLT-WARNING: ignoring symbol __bss_start at 0xc88a5a, which lies outside .bss BOLT-WARNING: ignoring symbol __bss_start__ at 0xc88a5a, which lies outside .bss BOLT-INFO: pre-processing profile using perf data aggregator BOLT-INFO: binary build-id is: cced2a14ed88302024b2d4d7e74d98246ba9a86a PERF2BOLT: spawning perf job to read buildid list PERF2BOLT: matched build-id and file name PERF2BOLT: waiting for perf mmap events collection to finish... PERF2BOLT: parsing perf-script mmap events output PERF2BOLT: waiting for perf task events collection to finish... PERF2BOLT: parsing perf-script task events output PERF2BOLT: input binary is associated with 11987 PID(s) PERF2BOLT: waiting for perf events collection to finish... PERF2BOLT: parsing basic events (without LBR)... PERF2BOLT: waiting for perf mem events collection to finish... 
perf2bolt: /home/ec2-user/llvm-project/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp:785: bool {anonymous}::AArch64MCPlusBuilder::analyzeIndirectBranchFragment(const llvm::MCInst&, llvm::DenseMap >&, const llvm::MCExpr*&, int64_t&, int64_t&, llvm::MCInst*&) const: Assertion `DefJTBaseAdd->getOpcode() == AArch64::ADDXri && "Failed to match jump table base address pattern! (1)"' failed. #0 0x00e40df0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (.localalias) (/home/ec2-user/build/bin/perf2bolt+0xe40df0) #1 0x00e3e89c SignalHandler(int) (/home/ec2-user/build/bin/perf2bolt+0xe3e89c) #2 0x96ce8850 (linux-vdso.so.1+0x850) #3 0x96807834 raise (/lib64/libc.so.6+0x32834) #4 0x96809140 abort (/lib64/libc.so.6+0x34140) #5 0x96800780 __assert_fail_base (/lib64/libc.so.6+0x2b780) #6 0x968007fc (/lib64/libc.so.6+0x2b7fc) #7 0x01584ecc (anonymous namespace)::AArch64MCPlusBuilder::analyzeIndirectBranch(llvm::MCInst&, llvm::bolt::MCPlusBuilder::InstructionIterator, llvm::bolt::MCPlusBuilder::InstructionIterator, unsigned int, llvm::MCInst*&, unsigned int&, unsigned int&, long&, llvm::MCExpr const*&, llvm::MCInst*&, llvm::MCInst*&) const (/home/ec2-user/build/bin/perf2bolt+0x1584ecc) #8 0x015e0
[llvm-bugs] [Bug 120792] [clang-tidy] Clang tidy failed 6.3.0
Issue 120792 Summary [clang-tidy] Clang tidy failed 6.3.0 Labels clang, clang-tidy Assignees Reporter RandUser123sa I'm receive the following error when trying to compile migraphx: 67%] Building CXX object src/onnx/CMakeFiles/migraphx_onnx.dir/parse_instancenorm.cpp.o clang-tidy: /mnt/arch/rocm/release/llvm-project-rocm-6.3.0/clang/lib/AST/ExprConstant.cpp:15679: bool clang::Expr::EvaluateAsInt(EvalResult&, const clang::ASTContext&, SideEffectsKind, bool) const: Assertion `!isValueDependent() && "_expression_ evaluator can't be called on a dependent _expression_."' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: /opt/rocm/llvm/bin/clang-tidy --use-color --config-file=/mnt/arch/rocm/release/AMDMIGraphX-rocm-6.3.0/.clang-tidy -quiet -p /mnt/arch/rocm/rocm-build/build/amdmigraphx -checks=boost-*,bugprone-*,cert-*,clang-analyzer-*,clang-diagnostic-*,cppcoreguidelines-*,google-*,hicpp-multiway-paths-covered,hicpp-signed-bitwise,llvm-namespace-comment,misc-*,-misc-confusable-identifiers,-misc-use-anonymous-namespace,modernize-*,performance-*,readability-*,-bugprone-easily-swappable-parameters,-bugprone-implicit-widening-of-multiplication-result,-bugprone-macro-parentheses,-bugprone-multi-level-implicit-pointer-conversion,-bugprone-signed-char-misuse,-bugprone-unchecked-optional-access,-bugprone-unused-local-non-trivial-variable,-cert-dcl37-c,-cert-dcl51-cpp,-cert-err33-c,-cert-str34-c,-cert-msc32-c,-cert-msc51-cpp,-clang-analyzer-alpha*,clang-analyzer-alpha.core.CallAndMessageUnInitRefArg,clang-analyzer-alpha.core.Conversion,clang-analyzer-alpha.core.IdenticalExpr,clang-analyzer-alpha.core.PointerArithm,clang-analyzer-alpha.core.PointerSub,clang-analyzer-alpha.core.TestAfterDivZero,clang-analyzer-alpha.cplusplus.InvalidIterator,clang-analyzer-alpha.cplusplus.IteratorRange,clang-analyzer-alpha.cplusplus.MismatchedIterator,clang-analyzer-alpha.cplusplus.MisusedMovedObject,-
bugprone-switch-missing-default-case,-bugprone-empty-catch,-clang-analyzer-optin.performance.Padding,-clang-diagnostic-deprecated-declarations,-clang-diagnostic-disabled-macro-expansion,-clang-diagnostic-extern-c-compat,-clang-diagnostic-unused-command-line-argument,-cppcoreguidelines-avoid-capture-default-when-capturing-this,-cppcoreguidelines-avoid-const-or-ref-data-members,-cppcoreguidelines-avoid-do-while,-cppcoreguidelines-explicit-virtual-functions,-cppcoreguidelines-init-variables,-cppcoreguidelines-misleading-capture-default-by-value,-cppcoreguidelines-missing-std-forward,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-type-member-init,-cppcoreguidelines-pro-type-reinterpret-cast,-cppcoreguidelines-pro-type-union-access,-cppcoreguidelines-pro-type-vararg,-cppcoreguidelines-rvalue-reference-param-not-moved,-cppcoreguidelines-special-member-functions,-cppcoreguidelines-use-default-member-init,-cppcoreguidelines-virtual-class-destructor,-google-readability-*,-google-runtime-int,-google-runtime-references,-misc-include-cleaner,-misc-macro-parentheses,-misc-no-recursion,-modernize-concat-nested-namespaces,-modernize-pass-by-value,-modernize-type-traits,-modernize-use-default-member-init,-modernize-use-nodiscard,-modernize-use-override,-modernize-use-trailing-return-type,-modernize-use-transparent-functors,-performance-avoid-endl,-performance-type-promotion-in-math-fn,-performance-enum-size,-readability-braces-around-statements,-readability-avoid-nested-conditional-operator,-readability-convert-member-functions-to-static,-readability-else-after-return,-readability-function-cognitive-complexity,-readability-identifier-length,-readability-named-parameter,-readability-redundant-member-init,-readability-redundant-string-init,-readability-suspicious-call-argument,-readability-uppercase-literal-suffix,-*-avoid-c-arrays,-*-explicit-constructor,-
*-magic-numbers,-*-narrowing-conversions,-*-non-private-member-variables-in-classes,-*-use-auto,-*-use-emplace,-*-use-equals-default -warnings-as-errors= -extra-arg=-UNDEBUG -extra-arg=-DMIGRAPHX_USE_CLANG_TIDY -extra-arg=-Xclang -extra-arg=-analyzer-max-loop -extra-arg=-Xclang -extra-arg=10 -extra-arg=-Xclang -extra-arg=-analyzer-inline-max-stack-depth -extra-arg=-Xclang -extra-arg=10 -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-arg=-Xclang -extra-arg=optin.cplusplus.UninitializedObject:Pedantic=true -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-arg=-Xclang -extra-arg=widen-loops=true -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-arg=-Xclang -extra-arg=unroll-loops=true -extra-arg=-Xclang -extra-arg=-analyzer-config -extra-a
[llvm-bugs] [Bug 120802] Clang crashes when using lambda captures inside a trailing return decltype
Issue 120802 Summary Clang crashes when using lambda captures inside a trailing return decltype Labels clang, clang:frontend Assignees shafik Reporter fahadnayyar

Clang crashes on this example:

```
template consteval auto matches(T t) {
  return [](auto u) -> decltype([u](){}()){}(t);
}
int main() {
  matches(0);
  return 0;
}
```

This issue was introduced between Clang 16.0.0 (released on March 17, 2023) and Clang 17.0.1 (released on September 9, 2023): https://godbolt.org/z/q9v9q31f4

GCC works fine for this example: https://godbolt.org/z/x8z41EKr4
[llvm-bugs] [Bug 120736] [libc] `ninja check-libc`: AttributeError: module 'yaml' has no attribute 'safe_load'
Issue 120736 Summary [libc] `ninja check-libc`: AttributeError: module 'yaml' has no attribute 'safe_load' Labels libc Assignees Reporter vinay-deshmukh I was following instructions on: https://libc.llvm.org/full_host_build.html#id3 (with slightly modified cmake command, i.e. without Sphinx): ```sh cmake \ -B ~/build-folder \ -S runtimes \ -G Ninja \ -DCMAKE_C_COMPILER=clang \ -DCMAKE_CXX_COMPILER=clang++ \ -DLLVM_ENABLE_RUNTIMES="libc;compiler-rt" \ -DLLVM_LIBC_FULL_BUILD=ON \ -DCMAKE_BUILD_TYPE=Debug \ -DLLVM_LIBC_INCLUDE_SCUDO=ON \ -DCOMPILER_RT_BUILD_SCUDO_STANDALONE_WITH_LLVM_LIBC=ON \ -DCOMPILER_RT_BUILD_GWP_ASAN=OFF \ -DCOMPILER_RT_SCUDO_STANDALONE_BUILD_SHARED=OFF\ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \ -DLIBC_CMAKE_VERBOSE_LOGGING=ON ``` And then: ```sh ninja -C ~/build-folder check-libc ``` Which fails with: ``` Traceback (most recent call last): File "~/llvm-project/libc/hdrgen/yaml_to_classes.py", line 284, in main() File "~/llvm-project/libc/hdrgen/yaml_to_classes.py", line 261, in main header = load_yaml_file(args.yaml_file, header_class, args.entry_points) File "~/llvm-project/libc/hdrgen/yaml_to_classes.py", line 123, in load_yaml_file yaml_data = yaml.safe_load(f) AttributeError: module 'yaml' has no attribute 'safe_load' ninja: build stopped: subcommand failed. 
```

Near: https://github.com/llvm/llvm-project/blob/5f0db7c11264fa235d73730b2b93a31407dfbef3/libc/hdrgen/yaml_to_classes.py#L120-L121

When I add the following code just before the `yaml.safe_load` call,

```py
print(dir(yaml))
print((yaml).__path__)
```

I get the output:

```
['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
_NamespacePath(['~/llvm-project/libc/hdrgen/yaml'])
```

This probably means that `import yaml` is importing the https://github.com/llvm/llvm-project/tree/main/libc/hdrgen/yaml directory instead of `PyYaml`.

Note: I do have `PyYaml` installed within a conda environment which is activated, so I don't think this is an issue with my Python setup. For instance, the following works as expected:

```sh
❯ /usr/bin/env python3
Python 3.13.1 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 10:35:08) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import yaml
>>> 'safe_load' in dir(yaml)
True
```

Given that the CI has been passing for months, it seems that I might be missing some sort of extra setup so that `/usr/bin/env python3` looks at the right path to import `yaml` (which might need to be added to https://libc.llvm.org/full_host_build.html#configure-the-build-for-development).

I also looked at https://libc.llvm.org/dev/header_generation.html#common-errors, but this page didn't seem to mention my issue.
[llvm-bugs] [Bug 120733] [mlir] tiling: Invalid slice when #map(0) != 0
Issue 120733 Summary [mlir] tiling: Invalid slice when #map(0) != 0 Labels mlir:linalg, mlir Assignees Reporter mgehre-amd In the [reproducer](https://godbolt.org/z/fhen1svYd), ``` func.func @test(%arg0 : tensor<9xf32>) -> tensor<6xf32> { %empty = tensor.empty() : tensor<6xf32> %generic = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0 + 3)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) { ^bb0(%in : f32, %out: f32): linalg.yield %in : f32 } -> tensor<6xf32> return %generic : tensor<6xf32> } module attributes {transform.with_named_sequence} { transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) { %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op %1, %loop = transform.structured.tile_using_for %0 tile_sizes [3] : (!transform.any_op) -> (!transform.any_op, !transform.any_op) transform.yield } } ``` becomes ``` #map = affine_map<(d0) -> (d0 + 3)> #map1 = affine_map<(d0) -> (d0)> module { func.func @test(%arg0: tensor<9xf32>) -> tensor<6xf32> { %0 = tensor.empty() : tensor<6xf32> %c0 = arith.constant 0 : index %c6 = arith.constant 6 : index %c3 = arith.constant 3 : index %1 = scf.for %arg1 = %c0 to %c6 step %c3 iter_args(%arg2 = %0) -> (tensor<6xf32>) { %2 = affine.apply #map(%arg1) %extracted_slice = tensor.extract_slice %arg0[%2] [6] [1] : tensor<9xf32> to tensor<6xf32> %extracted_slice_0 = tensor.extract_slice %arg2[%arg1] [3] [1] : tensor<6xf32> to tensor<3xf32> %3 = linalg.generic {indexing_maps = [#map, #map1], iterator_types = ["parallel"]} ins(%extracted_slice : tensor<6xf32>) outs(%extracted_slice_0 : tensor<3xf32>) { ^bb0(%in: f32, %out: f32): linalg.yield %in : f32 } -> tensor<3xf32> %inserted_slice = tensor.insert_slice %3 into %arg2[%arg1] [3] [1] : tensor<3xf32> into tensor<6xf32> scf.yield %inserted_slice : tensor<6xf32> } return %1 : tensor<6xf32> } ``` This accesses out-of-bounds. 
`%2 = affine.apply #map(%arg1)` is `%arg1 + 3` (which takes the values `3` and `6` for the two loop iterations), and `%extracted_slice = tensor.extract_slice %arg0[%2] [6] [1] : tensor<9xf32> to tensor<6xf32>` extracts `6` elements from that offset. In the second iteration this tries to extract elements `6` through `11`, but the tensor only has 9 elements. It seems that the implementation computing the slices is only correct when `#map(0) = 0`. The correct offset for the extract slice would be `#map(%arg1) - #map(0)`, which here is `%arg1`. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
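The offset arithmetic in this report can be checked with a small C++ sketch. It only models the numbers from the reproducer (map `d0 + 3`, slice size 6, input extent 9); the helper names `index_map`, `in_bounds`, `current_offset` and `proposed_offset` are hypothetical and are not part of MLIR.

```cpp
#include <cstdint>

// Indexing map from the reproducer: (d0) -> (d0 + 3).
constexpr std::int64_t index_map(std::int64_t d0) { return d0 + 3; }

// True if the slice [offset, offset + size) fits in a tensor of `extent` elements.
constexpr bool in_bounds(std::int64_t offset, std::int64_t size, std::int64_t extent) {
    return offset >= 0 && offset + size <= extent;
}

// Offset used in the tiled output shown in the report: #map(iv).
constexpr std::int64_t current_offset(std::int64_t iv) { return index_map(iv); }

// Offset suggested in the report: #map(iv) - #map(0).
constexpr std::int64_t proposed_offset(std::int64_t iv) {
    return index_map(iv) - index_map(0);
}
```

With loop values 0 and 3, `current_offset` yields 3 and 6, so the second slice of 6 elements runs past the 9-element input, while `proposed_offset` yields 0 and 3 and both slices fit.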
[llvm-bugs] [Bug 120745] [libc++] Investigate the performance of the `deque_vector_copy_backward` benchmark
Issue 120745 Summary [libc++] Investigate the performance of the `deque_vector_copy_backward` benchmark Labels libc++, performance Assignees Reporter ldionne ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 120748] [libc++] Investigate the ranges_fill_n benchmark
Issue 120748 Summary [libc++] Investigate the ranges_fill_n benchmark Labels libc++, performance Assignees Reporter ldionne Make sure the benchmark is properly written, since its results look a bit too good to be true. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 120793] [clang-format] Crash in LeftRightQualifierAlignmentFixer for trailing return type
Issue 120793 Summary [clang-format] Crash in LeftRightQualifierAlignmentFixer for trailing return type Labels clang-format Assignees Reporter mellery451 source (main.cpp): ``` template inline auto clamp(bool& saturated, T const v, T const lo, T const hi) -> const T { if (v < lo) { saturated = true; return lo; } else if (v > hi) { saturated = true; return hi; } else { saturated = false; } return v; } ``` config file (format-config):\ ``` Language: Cpp Standard: c++14 BraceWrapping: AfterClass: true AfterCaseLabel: true AfterControlStatement: true AfterEnum: true AfterFunction: true AfterNamespace: true AfterStruct: true AfterUnion: true BeforeCatch: true BeforeElse: true IndentBraces: false BreakBeforeBraces: Custom QualifierAlignment: Custom QualifierOrder: ['static', 'inline', 'friend', 'constexpr', 'type', 'const', 'volatile', 'restrict'] ``` cmd: `clang-format --style="file:format-config" main.cpp` crash: ``` Stack dump: 0. Program arguments: clang-format --style=file:format-config2 main.cpp #0 0x7fd3ae89d370 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libLLVM.so.19.1+0xee0370) #1 0x7fd3ae89a2ce SignalHandler(int) Signals.cpp:0:0 #2 0x7fd3ad4cb520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #3 0x7fd3b7a6bd33 clang::format::LeftRightQualifierAlignmentFixer::analyzeRight(clang::SourceManager const&, clang::format::AdditionalKeywords const&, clang::tooling::Replacements&, clang::format::FormatToken const*, std::__cxx11::basic_string, std::allocator> const&, clang::tok::TokenKind) (.cold) QualifierAlignmentFixer.cpp:0:0 #4 0x7fd3bac52e12 clang::format::LeftRightQualifierAlignmentFixer::fixQualifierAlignment(llvm::SmallVectorImpl&, clang::format::FormatTokenLexer&, clang::tooling::Replacements&) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libclang-cpp.so.19.1+0x3d8ae12) #5 0x7fd3bac52e9a clang::format::LeftRightQualifierAlignmentFixer::analyze(clang::format::TokenAnnotator&, 
llvm::SmallVectorImpl&, clang::format::FormatTokenLexer&) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libclang-cpp.so.19.1+0x3d8ae9a) #6 0x7fd3bac65305 clang::format::TokenAnalyzer::process(bool) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libclang-cpp.so.19.1+0x3d9d305) #7 0x7fd3bac55156 clang::format::addQualifierAlignmentFixerPasses(clang::format::FormatStyle const&, llvm::SmallVectorImpl (clang::format::Environment const&)>>&)::'lambda0'(clang::format::Environment const&)::operator()(clang::format::Environment const&) const QualifierAlignmentFixer.cpp:0:0 #8 0x7fd3bac55b05 std::_Function_handler (clang::format::Environment const&), clang::format::addQualifierAlignmentFixerPasses(clang::format::FormatStyle const&, llvm::SmallVectorImpl (clang::format::Environment const&)>>&)::'lambda0'(clang::format::Environment const&)>::_M_invoke(std::_Any_data const&, clang::format::Environment const&) QualifierAlignmentFixer.cpp:0:0 #9 0x7fd3bac1f0e6 clang::format::internal::reformat(clang::format::FormatStyle const&, llvm::StringRef, llvm::ArrayRef, unsigned int, unsigned int, unsigned int, llvm::StringRef, clang::format::FormattingAttemptStatus*) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libclang-cpp.so.19.1+0x3d570e6) #10 0x7fd3bac20fa1 clang::format::reformat(clang::format::FormatStyle const&, llvm::StringRef, llvm::ArrayRef, llvm::StringRef, clang::format::FormattingAttemptStatus*) (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/../lib/libclang-cpp.so.19.1+0x3d58fa1) #11 0x55956f4ca402 clang::format::format(llvm::StringRef, bool) ClangFormat.cpp:0:0 #12 0x55956f4c15a0 main (/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/clang-format+0xc5a0) #13 0x7fd3ad4b2d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #14 0x7fd3ad4b2e40 call_init ./csu/../csu/libc-start.c:128:20 #15 0x7fd3ad4b2e40 __libc_start_main ./csu/../csu/libc-start.c:379:5 #16 0x55956f4c2715 _start 
(/home/linuxbrew/.linuxbrew/Cellar/llvm/19.1.6/bin/clang-format+0xd715) Segmentation fault (core dumped) ``` version: ``` $ clang-format --version Homebrew clang-format version 19.1.6 ``` ...but I have also reproduced it with a non-Homebrew version (still v19). The crash seems to be associated with the trailing return type, but I didn't experiment much. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
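For reference, a hand-written sketch of what the reproducer might look like if the qualifier fixer ran without crashing. It assumes the archived source lost a `template <typename T>` header (that header is an assumption), and that a `QualifierOrder` listing `type` before `const` moves `const` to the right of the type. The name `clamp_saturating` is hypothetical, chosen to avoid confusion with `std::clamp`. This is not actual `clang-format` output.

```cpp
// Assumed template header; the archived report shows only "template".
template <typename T>
inline auto clamp_saturating(bool& saturated, T const v, T const lo, T const hi) -> T const
{
    if (v < lo)
    {
        // Value fell below the range: clamp to the lower bound.
        saturated = true;
        return lo;
    }
    else if (v > hi)
    {
        // Value exceeded the range: clamp to the upper bound.
        saturated = true;
        return hi;
    }
    else
    {
        saturated = false;
    }
    return v;
}
```

The only qualifier that would need to move is the `const` in the trailing return type (`-> const T` in the report), which fits the reporter's suspicion about where the crash originates.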
[llvm-bugs] [Bug 120788] [libc++] Don't use std::stable_sort in flat_map
Issue 120788 Summary [libc++] Don't use std::stable_sort in flat_map Labels libc++ Assignees huixie90 Reporter ldionne In `std::flat_map::insert(first, last)`, we currently use `std::stable_sort`. While discussing this with @huixie90 and reading https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2767r2.html#stable-sorting, we agreed that:
1. The tree-based associative containers specify that `insert(first, last)` inserts in an unspecified order.
2. Not using `std::stable_sort` inside `flat_map` is consistent with that.
3. MSVC doesn't use `std::stable_sort` in its implementation.
4. Using `std::sort` is usually more efficient than `std::stable_sort`.
Since the Standard doesn't require us to sort stably and doing so is more expensive, we should switch to just `std::sort`. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
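The observable difference between the two sorts can be illustrated with a small sketch. It is not libc++'s `flat_map` code: `sort_and_unique` is a hypothetical helper that sorts key/value pairs by key and keeps one entry per key, roughly modelling what a range insert has to do.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;

// Hypothetical helper: sort entries by key, then keep one entry per key.
// With stable == true the surviving entry for a duplicated key is the first
// one in input order; with stable == false it is unspecified which survives.
std::vector<Entry> sort_and_unique(std::vector<Entry> v, bool stable) {
    auto key_less = [](const Entry& a, const Entry& b) { return a.first < b.first; };
    auto key_equal = [](const Entry& a, const Entry& b) { return a.first == b.first; };
    if (stable) {
        std::stable_sort(v.begin(), v.end(), key_less);
    } else {
        std::sort(v.begin(), v.end(), key_less);
    }
    // std::unique keeps the first element of each run of equal keys.
    v.erase(std::unique(v.begin(), v.end(), key_equal), v.end());
    return v;
}
```

Both variants produce the same set of keys. Only the choice of value for duplicated keys can differ, which is the property the Standard leaves unspecified according to the discussion above.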
[llvm-bugs] [Bug 120790] [libc++] std::pmr::polymorphic_allocator is declared in all language modes
Issue 120790 Summary [libc++] std::pmr::polymorphic_allocator is declared in all language modes Labels libc++ Assignees Reporter philnik777 `__fwd/memory_resource.h` doesn't hide the declaration properly. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 120820] Mac clang-format binary has a homebrew dependency
Issue 120820 Summary Mac clang-format binary has a homebrew dependency Labels clang-format Assignees Reporter bhelyer Hi there. I don't use [Homebrew](https://brew.sh/) on my systems. I tried to acquire clang-format from the LLVM 19.1 release on GitHub. Sadly, `clang-format` (along with every other executable) has a dependency on Homebrew that isn't mentioned anywhere: ``` % otool -L ~/Downloads/LLVM-19.1.0-macOS-ARM64/bin/clang-format /Users/bernard/Downloads/LLVM-19.1.0-macOS-ARM64/bin/clang-format: /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.120.2) /usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.12) /opt/homebrew/opt/zstd/lib/libzstd.1.dylib (compatibility version 1.0.0, current version 1.5.6) /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1700.255.5) ``` Is this intentional? ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 120822] Off-by-one error in expression mangling of parameter reference in lambda
Issue 120822 Summary Off-by-one error in expression mangling of parameter reference in lambda Labels clang, c++14, miscompilation Assignees Reporter hubert-reinterpretcast According to the Itanium C++ ABI, the expression mangling for a function parameter involves a number `L` where `L` is `1` when referencing a parameter of the current function declarator within its parameter declaration clause. In the case where `L` is one, the parameter reference is represented with a prefix of `fL0p` (that is, with the value of `L - 1` after the "L"). Consider the source below. The value of `L` should be `1`; however, the mangling used for `x` is `_ZZZ1fvENKUlT_DtfL1p_EE_clIiEEDaS_S0_E1x` (with `fL1p` instead of the expected `fL0p`). Oddly enough, the typeinfo string for the closure type (`Z1fvEUlT_DtfL0p_EE_`) has `fL0p` as expected. GCC has an off-by-one error in the opposite direction: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118144 Online compiler link: https://godbolt.org/z/51E9Yjvxf ### SOURCE (`<stdin>`) ```cpp inline auto f() { return [](auto p, decltype(p)) { static int x = 0; return &x; }; } void g() { throw f(); } auto h() { return f()(0, 0); } ``` ### COMPILER INVOCATION ``` clang++ -std=c++17 -xc++ - -O -S -emit-llvm -o - ``` ### ACTUAL COMPILER OUTPUT (partial) ``` ret ptr @_ZZZ1fvENKUlT_DtfL1p_EE_clIiEEDaS_S0_E1x ``` ### EXPECTED COMPILER OUTPUT (partial) ``` ret ptr @_ZZZ1fvENKUlT_DtfL0p_EE_clIiEEDaS_S0_E1x ``` ### COMPILER VERSION INFO (`clang++ -v`) ``` clang version 20.0.0git (https://github.com/llvm/llvm-project.git 44514316bd5ef656076b6baaf6bccb298d98f0ea) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /opt/wandbox/clang-head/bin Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/13 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14 Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14 Candidate multilib: .;@m64 Selected multilib: .;@m64 ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org 
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
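The encoding rule the report relies on can be sketched as a small function. This follows the Itanium C++ ABI `<function-param>` production as described above, ignoring top-level cv-qualifiers. The helper name `mangle_function_param` is hypothetical and is not an API of Clang or GCC.

```cpp
#include <string>

// Encodes a reference to a function parameter per the Itanium C++ ABI
// <function-param> production, ignoring top-level cv-qualifiers.
// `level` is the ABI's L; `index` is the zero-based parameter position.
std::string mangle_function_param(unsigned level, unsigned index) {
    std::string out = "f";
    if (level > 0) {
        // For L > 0 the value L - 1 follows the "L".
        out += "L" + std::to_string(level - 1);
    }
    out += "p";
    if (index > 0) {
        // The first parameter carries no number; later ones use index - 1.
        out += std::to_string(index - 1);
    }
    out += "_";
    return out;
}
```

For the lambda in the report, `L` is `1` and `p` is the first parameter, so this sketch yields `fL0p_`, matching the expected `DtfL0p_E` fragment rather than the `DtfL1p_E` that Clang emits.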
[llvm-bugs] [Bug 120823] [SLPVectorizer] `samesign` flags should be dropped after narrowing down the width of operands
Issue 120823 Summary [SLPVectorizer] `samesign` flags should be dropped after narrowing down the width of operands Labels miscompilation, llvm:SLPVectorizer Assignees Reporter dtcxzyw Reproducer: https://alive2.llvm.org/ce/z/AvBhy9 ``` ; bin/opt -passes=slp-vectorizer test.ll -S target triple = "x86_64-unknown-linux-gnu" define i1 @test() { entry: %and.i1698.1.i = zext i16 0 to i32 %and19.i1699.2.i = and i32 %and.i1698.1.i, 0 %and.i1698.2.i = zext i16 0 to i32 %cmp25.i1700.2.i2 = icmp samesign uge i32 %and19.i1699.2.i, %and.i1698.1.i %and19.i1699.11841.i = and i32 %and.i1698.2.i, 0 %cmp25.i1700.11842.i3 = icmp samesign uge i32 %and19.i1699.11841.i, %and.i1698.2.i %and.i1698.1.1.i = zext i16 0 to i32 %and19.i1699.2.1.i = and i32 %and.i1698.1.1.i, 0 %0 = add i16 1, 0 %and.i1698.2.1.i = zext i16 %0 to i32 %cmp25.i1700.2.1.i4 = icmp samesign uge i32 %and19.i1699.2.1.i, %and.i1698.1.1.i %and19.i1699.21846.i = and i32 %and.i1698.2.1.i, 0 %cmp25.i1700.21847.i = icmp samesign uge i32 %and19.i1699.21846.i, %and.i1698.2.1.i ret i1 %cmp25.i1700.21847.i } ``` ``` define i1 @test() { entry: %0 = add i16 1, 0 %1 = insertelement <4 x i16> , i16 %0, i32 0 %2 = trunc <4 x i16> %1 to <4 x i1> %3 = and <4 x i1> %2, zeroinitializer %4 = icmp samesign uge <4 x i1> %3, %2 %5 = extractelement <4 x i1> %4, i32 0 ret i1 %5 } ``` ``` define i1 @src() { entry: %#0 = add i16 1, 0 %and.i1698.2.1.i = zext i16 %#0 to i32 %and19.i1699.21846.i = and i32 %and.i1698.2.1.i, 0 %cmp25.i1700.21847.i = icmp samesign uge i32 %and19.i1699.21846.i, %and.i1698.2.1.i ret i1 %cmp25.i1700.21847.i } => define i1 @src() { entry: %#0 = add i16 1, 0 %#1 = insertelement <4 x i16> { poison, 0, 0, 0 }, i16 %#0, i32 0 %#2 = trunc <4 x i16> %#1 to <4 x i1> %#3 = and <4 x i1> %#2, { 0, 0, 0, 0 } %#4 = icmp samesign uge <4 x i1> %#3, %#2 %#5 = extractelement <4 x i1> %#4, i32 0 ret i1 %#5 } Transformation doesn't verify! 
ERROR: Target is more poisonous than source Example: Source: i16 %#0 = #x0001 (1) i32 %and.i1698.2.1.i = #x0001 (1) i32 %and19.i1699.21846.i = #x (0) i1 %cmp25.i1700.21847.i = #x0 (0) Target: i16 %#0 = #x0001 (1) <4 x i16> %#1 = < #x0001 (1), #x (0), #x (0), #x (0) > <4 x i1> %#2 = < #x1 (1), #x0 (0), #x0 (0), #x0 (0) > <4 x i1> %#3 = < #x0 (0), #x0 (0), #x0 (0), #x0 (0) > <4 x i1> %#4 = < poison, #x1 (1), #x1 (1), #x1 (1) > i1 %#5 = poison Source value: #x0 (0) Target value: poison Summary: 0 correct transformations 1 incorrect transformations 0 failed-to-prove transformations 0 Alive2 errors ``` ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
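Why narrowing makes the flag invalid can be seen in a small model of `icmp samesign uge`. This is a sketch, not LLVM code: `icmp_samesign_uge` is a hypothetical helper, `std::nullopt` stands in for poison, and `bits` is assumed to be between 1 and 64.

```cpp
#include <cstdint>
#include <optional>

// Models `icmp samesign uge` on `bits`-wide integers (1 <= bits <= 64).
// Returns std::nullopt (standing in for poison) when the sign bits differ.
std::optional<bool> icmp_samesign_uge(std::uint64_t a, std::uint64_t b, unsigned bits) {
    const std::uint64_t mask = bits >= 64 ? ~0ULL : ((1ULL << bits) - 1);
    const std::uint64_t sign = 1ULL << (bits - 1);
    a &= mask;
    b &= mask;
    if ((a & sign) != (b & sign)) {
        return std::nullopt;  // operands disagree on sign: result is poison
    }
    return a >= b;  // unsigned comparison on the masked values
}
```

At 32 bits, comparing 0 with 1 is well defined and yields false, as in the source function. After truncation to `i1`, the value 1 has its sign bit set while 0 does not, so the same comparison becomes poison, which matches lane 0 of `%#4` in the Alive2 counterexample.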
[llvm-bugs] [Bug 120825] [JumpThreadingPass] Local variable not init lead to miscompilation
Issue 120825 Summary [JumpThreadingPass] Local variable not init lead to miscompilation Labels miscompilation Assignees Reporter hstk30-hw https://godbolt.org/z/b7W6Mbnqr ``` #include "stdio.h" #include "stdint.h" typedef struct { uint8_t key_word[48]; uint32_t value; } ini_st; uint32_t ext_call(char *section, ini_st *section_cfg, uint32_t section_cfg_num); uint32_t callee(uint32_t *cfg) { ini_st arr[] = { {"foo", 0x}, }; uint32_t cfg_num = sizeof(arr) / sizeof(ini_st); uint32_t ret = ext_call("bar", arr, cfg_num); if (ret != (0)) { return ret; } if (arr[0].value == 0x) { return (1); } *cfg = arr[0].value; return (0); } unsigned int caller(void) { uint32_t cfg; if (callee(&cfg) == (0)) { printf("%d, %d\n", cfg, __LINE__); } printf("%d, %d\n", cfg, __LINE__); return 1; } ``` I don't think this code is erroneous, even though the variable `cfg` is not initialized. But JumpThreadingPass seems to miscompile it. ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs