[llvm-bugs] [Bug 80258] Wasm: Functions forcefully marked export by linker
Issue 80258 Summary Wasm: Functions forcefully marked export by linker Labels new issue Assignees Reporter ryandurkoske

If I generate a relocatable wasm object file using wat2wasm and link that object with the rest of my C program using wasm-ld, then all the functions defined in the object file are forcefully marked as exports. I'm excited that _it just works_, except for this particular annoyance. Marking the extern declaration with hidden visibility does not make a difference. I know it's not an issue on wabt's part.

Here is the disassembly of the relocatable object file generated by wat2wasm:

```
(module
  (type (;0;) (func (param i32)))
  (import "env" "memory" (memory (;0;) 2 65536 shared))
  (func $mtx_lock (type 0) (param i32) ...instructions...)
  (func $mtx_unlock (type 0) (param i32) ...instructions...)
  (@custom "linking" "\02\08\1b\02\00\00\00\08mtx_lock\00\00\01\0amtx_unlock"))
```

Here is my extern declaration of the functions in a header file:

```
__attribute__((visibility("default"))) extern void mtx_lock(mtx_t* mut_ptr);
__attribute__((visibility("default"))) extern void mtx_unlock(mtx_t* mutex_ptr);
```

Clang C source compiler flags (clang version 17.0.6):

```
-std=c17 -target wasm32 -flto -Wall -g -O3 -nostdlib -fvisibility=hidden -mbulk-memory -msimd128 -matomics -mmutable-globals
```

wasm-ld flags (LLD version 17.0.6):

```
--export-dynamic --unresolved-symbols=report-all --import-memory --error-limit=0 --stack-first --gc-sections -O3 --shared-memory --max-memory=4294967296 -zstack-size=65536
```

For further clarity, here's the undesirable result from the disassembly of the wasm linker output:

```
...function definitions...
...more exports...
(export "mtx_lock" (func $mtx_lock))
(export "mtx_unlock" (func $mtx_unlock))
... data ...
... @custom data ...
```

This is the reverse behavior of a similar issue with llvm-ar generated archive files. The linker or the archiver is stripping the visibility off of C declared/defined functions.
___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 80266] [BOLT] [llvm-bolt] llvm-bolt aborts due to .debug_line_str offset mismatch when --update-debug-sections is used.
Issue 80266 Summary [BOLT] [llvm-bolt] llvm-bolt aborts due to .debug_line_str offset mismatch when --update-debug-sections is used. Labels BOLT Assignees Reporter ARG-NK I am optimizing an ELF64/AArch64 binary compiled with -gdwarf4 and get this assert failure while updating the debug sections:
[llvm-bugs] [Bug 80268] [Bug] Clang-tidy: Suggests removing and duplicating Hungarian Notation prefixes at the same time
Issue 80268 Summary [Bug] Clang-tidy: Suggests removing and duplicating Hungarian Notation prefixes at the same time Labels clang-tidy Assignees Reporter sdimovv

`Clang-tidy` suggests removing and duplicating the Hungarian Notation prefixes at the same time. Here is a minimal reproducible example using `clang-tidy-18`:

* `Clang-tidy` version:

```sh
$ clang-tidy-18 --version
Ubuntu LLVM version 18.1.0
  Optimized build.
```

* `.clang-tidy` config file:

```yaml
---
FormatStyle: file
WarningsAsErrors: '*'
Checks: >
  -*,
  readability-identifier-naming
CheckOptions:
  readability-identifier-naming.GlobalVariablePrefix: g
  readability-identifier-naming.GlobalPointerPrefix: g
  readability-identifier-naming.TypedefSuffix: _t
  readability-identifier-naming.UnionSuffix: _t
  readability-identifier-naming.TypeAliasSuffix: _t
  readability-identifier-naming.ClassCase: CamelCase
  readability-identifier-naming.VariableCase: CamelCase
  readability-identifier-naming.ParameterCase: CamelCase
  readability-identifier-naming.PointerParameterCase: CamelCase
  readability-identifier-naming.ConstantParameterCase: CamelCase
  readability-identifier-naming.ConstantPointerParameterCase: CamelCase
  readability-identifier-naming.HungarianNotation.PrimitiveType.void: v
  readability-identifier-naming.HungarianNotation.PrimitiveType.int8_t: s8
  readability-identifier-naming.HungarianNotation.PrimitiveType.int16_t: s16
  readability-identifier-naming.HungarianNotation.PrimitiveType.int32_t: s32
  readability-identifier-naming.HungarianNotation.PrimitiveType.int64_t: s64
  readability-identifier-naming.HungarianNotation.PrimitiveType.float: f32
  readability-identifier-naming.HungarianNotation.PrimitiveType.double: f64
  readability-identifier-naming.HungarianNotation.UserDefinedType.PVOID: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.INT8: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.INT16: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.INT32: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.INT64: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.uint8_t: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.UINT16: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.uint32_t: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.UINT64: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.float: t
  readability-identifier-naming.HungarianNotation.UserDefinedType.double: t
  readability-identifier-naming.HungarianNotation.General.TreatStructAsClass: true
  readability-identifier-naming.HungarianNotation.DerivedType.Array: r
  readability-identifier-naming.HungarianNotation.DerivedType.Pointer: p
  readability-identifier-naming.HungarianNotation.DerivedType.FunctionPointer: p
  readability-identifier-naming.VariableHungarianPrefix: LowerCase
  readability-identifier-naming.ClassHungarianPrefix: LowerCase
  readability-identifier-naming.LocalPointerHungarianPrefix: LowerCase
  readability-identifier-naming.GlobalPointerHungarianPrefix: LowerCase
  readability-identifier-naming.ParameterHungarianPrefix: LowerCase
  readability-identifier-naming.PointerParameterHungarianPrefix: LowerCase
  readability-identifier-naming.ConstantParameterHungarianPrefix: LowerCase
  readability-identifier-naming.ConstantPointerParameterHungarianPrefix: LowerCase
  readability-identifier-naming.TypedefHungarianPrefix: LowerCase
  readability-identifier-naming.TypeAliasHungarianPrefix: LowerCase
  readability-identifier-naming.UnionHungarianPrefix: LowerCase
  readability-identifier-naming.GlobalVariableHungarianPrefix: LowerCase
```

* `test.c`:

```c
#include <stdint.h>

typedef uint32_t MyType32_t;
typedef uint8_t MyType8_t;

void my_func(
    MyType32_t t_MyType32,
    MyType32_t *pt_MyType32,
    MyType8_t t_MyType8,
    MyType8_t *pt_MyType8,
    const MyType32_t t_ConstMyType32,
    const MyType32_t *pt_ConstMyType32,
    const MyType8_t t_ConstMyType8,
    const MyType8_t *pt_ConstMyType8,
    uint32_t u32_Uint32,
    uint32_t *pu32_Uint32,
    uint8_t u8_Uint8,
    uint8_t *pu8_Uint8,
    const uint32_t u32_ConstUint32,
    const uint32_t *pu32_ConstUint32,
    const uint8_t u8_ConstUint8,
    const uint8_t *pu8_ConstUint8
) {}
```

* Run command:

```bash
$ clang-tidy-18 test.c
```

* Expected result: `clang-tidy` finds no errors
* Actual result:

```
$ clang-tidy-18 test.c
Error while trying to load a compilation database:
Could not auto-detect compilation database for file "test.c"
No compilation database found in /path/to/my/file or any parent directory
fixed-compilation-database: Error while opening fixed database: No such file or directory
json-compilation-database: Error while opening JSON
```
[llvm-bugs] [Bug 80270] [flang] out of memory optimizing character with negative length
Issue 80270 Summary [flang] out of memory optimizing character with negative length Labels flang Assignees Reporter tblah

Two tests in the gfortran test suite attempt to create a character array with negative length:

- https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/char_length_20.f90
- https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/char_length_21.f90

Both of these tests can be built successfully without optimisation, but at `-O1` my compilation crashed after memory consumption went over 90GB while compiling one translation unit.

These tests are currently disabled by the testing infrastructure in llvm-test-suite, but it is sometimes useful to run even disabled tests to gather statistics about remaining issues. These tests block us from doing this with optimization enabled (which enables several more passes in flang and so hopefully shakes out more bugs).

My flang-new compiler is a release-with-assertions build from 16c4843d32.
[llvm-bugs] [Bug 80275] [PRE] reuse the value to avoid double loading
Issue 80275 Summary [PRE] reuse the value to avoid double loading Labels new issue Assignees Reporter vfdff

* test: https://gcc.godbolt.org/z/7eKGbndfK

```
void foo(dComplex* __restrict__ bb, bool flag) {
  *aa = *bb; //memcpy(aa, bb, sizeof(dComplex));
  if (flag) {
    *aa = aa->conj(); // bb->neg();
  }
}

void foo1(dComplex* __restrict__ bb, bool flag) {
  aa->real = bb->real;
  aa->imag = bb->imag;
  if (flag) {
    *aa = aa->conj();
  }
}
```

* LLVM output for **foo**: `ldr` is often costly because of its long latency, so reusing part of the value already held in q0 would be more efficient, similar to what happens for **foo1**.

```
foo(dComplex*, bool):                   // @foo(dComplex*, bool)
        adrp    x8, aa
        ldr     q0, [x0]
        ldr     x8, [x8, :lo12:aa]
        str     q0, [x8]
        tbz     w1, #0, .LBB0_2
        ldr     d0, [x8, #8]            --- expected to reuse the result of q0 (d0 is low part of q0)
        fneg    d0, d0
        str     d0, [x8, #8]
.LBB0_2:
        ret
```
[llvm-bugs] [Bug 80276] LLD does not report forward reference errors in linker script
Issue 80276 Summary LLD does not report forward reference errors in linker script Labels Assignees Reporter partaror

Unlike GNU ld, LLD does not report forward-reference errors. This can result in an unexpected output image layout.

Reproducible example:

```bash
#!/usr/bin/env bash

cat >1.c <<\EOF
int foo() { return 1; }
int bar() { return 3; }
EOF

cat >script.t <<\EOF
SECTIONS {
  FOO ALIGN(ADDR(BAR), 0x1000) : { *(*foo*) }
  BAR : { *(*bar*) }
}
EOF

clang-15 -o 1.o 1.c -c -ffunction-sections
ld.bfd -o 1.bfd.elf 1.o -T script.t # reports the error, "ld.bfd:script.t:2: non constant or forward reference address expression for section FOO"
ld.lld -o 1.lld.elf 1.o -T script.t # Links fine; no error or warning reported related to forward reference
```

If not reporting forward references in linker scripts is an intentional decision, could someone explain why, given that it can lead to an unexpected final layout?
[llvm-bugs] Issue 65993 in oss-fuzz: llvm: Fuzzing build failure
Comment #2 on issue 65993 by ClusterFuzz-External: llvm: Fuzzing build failure https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65993#c2 Friendly reminder that the build is still failing. Please try to fix this failure to ensure that fuzzing remains productive. Latest build log: https://oss-fuzz-build-logs.storage.googleapis.com/log-25f690c6-ed3d-4a81-b9e8-7277c045ea4c.txt
[llvm-bugs] [Bug 80284] Crash when __datasizeof is used on a type with incomplete fields
Issue 80284 Summary Crash when __datasizeof is used on a type with incomplete fields Labels new issue Assignees ilya-biryukov Reporter ilya-biryukov

```cpp
struct Bar;
struct Foo { Bar x; };

int test() {
  constexpr int a = __datasizeof(Foo);
}
```

See https://gcc.godbolt.org/z/q1eKveT68. The stack trace is the following:

```
#2 0x03527fc8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
#3 0x7fe55b442520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
#4 0x070002bc clang::ASTContext::getASTRecordLayout(clang::RecordDecl const*) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x70002bc)
#5 0x070033b1 (anonymous namespace)::EmptySubobjectMap::ComputeEmptySubobjectSizes() RecordLayoutBuilder.cpp:0:0
#6 0x0700034b clang::ASTContext::getASTRecordLayout(clang::RecordDecl const*) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x700034b)
#7 0x06b9c3ea clang::ASTContext::getTypeInfoDataSizeInChars(clang::QualType) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x6b9c3ea)
#8 0x06dadff1 HandleSizeof((anonymous namespace)::EvalInfo&, clang::SourceLocation, clang::QualType, clang::CharUnits&, SizeOfType) ExprConstant.cpp:0:0
#9 0x06df63ec clang::StmtVisitorBase::Visit(clang::Stmt const*) ExprConstant.cpp:0:0
#10 0x06ddb66e Evaluate(clang::APValue&, (anonymous namespace)::EvalInfo&, clang::Expr const*) ExprConstant.cpp:0:0
#11 0x06de4b11 EvaluateAsRValue((anonymous namespace)::EvalInfo&, clang::Expr const*, clang::APValue&) ExprConstant.cpp:0:0
#12 0x06de6791 clang::Expr::EvaluateAsRValue(clang::Expr::EvalResult&, clang::ASTContext const&, bool) const (/opt/compiler-explorer/clang-trunk/bin/clang+++0x6de6791)
#13 0x05e60457 GetExprRange(clang::ASTContext&, clang::Expr const*, unsigned int, bool, bool) SemaChecking.cpp:0:0
#14 0x05ebe59f CheckImplicitConversion(clang::Sema&, clang::Expr*, clang::QualType, clang::SourceLocation, bool*, bool) SemaChecking.cpp:0:0
#15 0x05ec13b7 AnalyzeImplicitConversions(clang::Sema&, clang::Expr*, clang::SourceLocation, bool) (.constprop.0) SemaChecking.cpp:0:0
```
[llvm-bugs] [Bug 80285] [coverage] crash with CXXUnresolvedConstructExpr in `if constexpr`
Issue 80285 Summary [coverage] crash with CXXUnresolvedConstructExpr in `if constexpr` Labels new issue Assignees Reporter hanickadot

Clang `main` and `18` crash with this code (`-fprofile-instr-generate -fcoverage-mapping -std=c++17`):

```c++
struct false_value {
  constexpr operator bool() { return false; }
};

void foo() {
  if constexpr (false_value{}) {
  };
}
```

https://godbolt.org/z/833ej19rP

It's a bug in my recent change https://github.com/llvm/llvm-project/pull/78033 ... I am writing a patch now; this issue is filed so that the problem is visible.
[llvm-bugs] [Bug 80286] std::use_facet doesn't check dynamic type
Issue 80286 Summary std::use_facet doesn't check dynamic type Labels new issue Assignees Reporter jwakely

```c++
#include <locale>

int main()
{
  struct custom_ctype : std::ctype<char> { };
  (void) std::use_facet<custom_ctype>(std::locale());
}
```

With libc++, this program produces a UBSan error:

```
/usr/bin/../include/c++/v1/__locale:254:12: runtime error: downcast of address 0x7fe687842e30 which does not point to an object of type 'const custom_ctype'
0x7fe687842e30: note: object is of type 'std::__1::ctype'
 00 00 00 00 e8 77 83 87 e6 7f 00 00 01 00 00 00 00 00 00 00 c0 93 59 87 e6 7f 00 00 00 00 00 00
 ^~~
 vptr for 'std::__1::ctype'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../include/c++/v1/__locale:254:12 in
```

The implementation seems to assume that any facet with the right `id` is a match and just uses `static_cast`. This only works for standard facets, not user-defined ones.
[llvm-bugs] [Bug 80287] [Bug][Clang][ARM] Wrong assembly code generation, "LR" corrupted.
Issue 80287 Summary [Bug][Clang][ARM] Wrong assembly code generation, "LR" corrupted. Labels clang Assignees Reporter P1119r1m

The Clang compiler generates code for a function that overwrites the `lr` register without first saving its value on the stack. This bug was detected in a large software component, and the problem was minimized to a PoC (see `example.c`).

Clang versions info:

- llvmorg-17-note - bug wasn't detected
- llvmorg-17.0.2 - bug detected
- llvmorg-17.0.6 - bug detected

Host OS: Ubuntu 20.04.3 LTS

In the disassembler the bug looks like this (generated `example.c.o.objdump`):

```
0010 :
; if (!ctx) {
      10: cmp r0, #0
      14: beq 0x48 @ imm = #0x2c   <-- Goto 0x48 if "(!ctx)" is True.
; ctx->cmd_c = func_2;
      18: ldr r12, [pc, #0x40] @ 0x60   <-- Continue if "(!ctx)" is False.
      1c: mov r1, r0
      20: mov r0, #0
; ctx->cmd_a = func_0;
      24: add lr, r1, #8   <-- ERROR. THIS LOOKS CRAZY!!! "lr" WAS NOT STORED ON THE STACK!
; ctx->cmd_c = func_2;
      28: ldr r12, [pc, r12]
... < SOME OTHER CODE > ...
      44: bx lr   <-- ERROR. BRANCH TO "lr" ADDRESS THAT WAS PREVIOUSLY CORRUPTED!!!
      48: push {r11, lr}
... < SOME OTHER CODE > ...
```

Steps to reproduce.

- example.c

```
void __attribute__((optnone)) printer(int line) {}
#define INFO() printer(__LINE__);

extern int func_0(void);
extern int func_1(void);
extern int func_2(void);

typedef int (*f_ptr_t)(void);

typedef struct operation {
  double double_0;
  f_ptr_t cmd_a;
  f_ptr_t cmd_b;
  f_ptr_t cmd_c;
  f_ptr_t cmd_d;
} operation_t;

int BUGGY_FUNCTION(operation_t *ctx) {
  // INFO(); // UNCOMMENT TO FIX ASSEMBLY CODE GENERATION!!!
  if (!ctx) {
    INFO()
    return -1;
  }
  ctx->cmd_a = func_0;
  ctx->cmd_b = func_1;
  ctx->cmd_c = func_2;
  ctx->cmd_d = (void *)0;
  return 0;
}
```

- reproduce.sh

```
#!/bin/bash
SDK_BIN_PATH=
CC="${SDK_BIN_PATH}/arm-secureos-gnueabi-clang"
OBJDUMP="${SDK_BIN_PATH}/arm-secureos-gnueabi-objdump"
INPUT_FILE=example.c
OBJ_FILE=example.c.o
OBJDUMP_FILE=example.c.o.objdump
echo "Compile and objdump..." \
  && "${CC}" -ggdb -Os -fPIE -c "${INPUT_FILE}" -o "${OBJ_FILE}" \
  && "${OBJDUMP}" -dS "${OBJ_FILE}" | tee "${OBJDUMP_FILE}"
```
[llvm-bugs] [Bug 80289] WRONG code: LoopUnroll / SCEVExpander with i128 induction variable.
Issue 80289 Summary WRONG code: LoopUnroll / SCEVExpander with i128 induction variable. Labels loopoptim Assignees Reporter JonPsson1

This reduced (csmith) test case seems well-defined, and should print '6':

```
char C = 0;
__int128 IW = 0;
int *IPtr1, *IPtr2;
struct S2 { int f3; };
volatile struct S2 g_1100;

int main() {
  for (; C <= 5; C += 1)
    for (; IW <= 5; IW += 1) {
      IPtr1 = IPtr2;
      g_1100;
    }
  int crc = IW;
  printf("checksum = %d\n", crc);
}
```

```
clang -target s390x-linux-gnu -march=z16 -O3 -mllvm -enable-load-pre=false -o ./a.out -mllvm -unroll-max-count=3; ./a.out
checksum = 7

clang -target s390x-linux-gnu -march=z16 -O3 -mllvm -enable-load-pre=false -o ./a.out -mllvm -unroll-max-count=2; ./a.out
checksum = 6
```

However, when unrolled 3 times (not 2 or 4), the LoopUnroller creates a prologue loop, which is supposed to run the extra iterations, as computed in the preheader (LoopUnrollRuntime.cpp:766):

```
for.body5.preheader:                              ; preds = %for.cond2thread-pre-split
  %2 = sub i128 6, %.pr121517
  %3 = freeze i128 %2
  %4 = add i128 %3, 18446744073709551615
  %5 = urem i128 %4, 3
  %6 = add i128 %5, 1
  %xtraiter = urem i128 %6, 3
  %lcmp.mod = icmp ne i128 %xtraiter, 0
  br i1 %lcmp.mod, label %for.body5.prol.preheader, label %for.body5.prol.loopexit
```

The constant used for %4 is supposed to be i128 '-1', so UINT64_MAX (i64 -1) doesn't make sense.

i128 <> i64, after LoopUnroller:

```
for.body5.preheader:                            for.body5.preheader:
  %2 = sub i128 6, %.pr121517                 |   %2 = sub i64 6, %.pr121517
  %3 = freeze i128 %2                         |   %3 = freeze i64 %2
  %4 = add i128 %3, 18446744073709551615      |   %4 = add i64 %3, -1
  %5 = urem i128 %4, 3                        |   %5 = urem i64 %4, 3
  %6 = add i128 %5, 1                         |   %6 = add i64 %5, 1
  %xtraiter = urem i128 %6, 3                 |   %xtraiter = urem i64 %6, 3
  %lcmp.mod = icmp ne i128 %xtraiter, 0       |   %lcmp.mod = icmp ne i64 %xtraiter, 0
  br i1 %lcmp.mod, label %for.body5.prol.preh     br i1 %lcmp.mod, label %for.body5.prol.preh
```

%4 is later optimized to a sub i128 with a folded constant of 18446744073709551621, which really should be '5'.

@nikic @boxu-zhang @xiangzh1 @preames @uweigand
[llvm-bugs] [Bug 80294] Assertion `I != VRBaseMap.end() && "Node emitted out of order - late"' failed.
Issue 80294 Summary Assertion `I != VRBaseMap.end() && "Node emitted out of order - late"' failed. Labels new issue Assignees Reporter TatyanaDoubts

https://godbolt.org/z/1Pxe4dqzd

Run llc with Test.ll.

Test.ll:

```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128-ni:1-p2:32:8:8:32-ni:2"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: uwtable
define void @foo() gc "statepoint-example" {
bb:
  %icmp = icmp eq i32 0, 0
  br i1 %icmp, label %bb3, label %bb1

bb1:                                              ; preds = %bb
  %call = call token (i64, i32, ptr, i32, i32, ...) @llvm.experimental.gc.statepoint.p0(i64 0, i32 0, ptr nonnull elementtype(void ()) null, i32 0, i32 0, i32 0, i32 0) [ "deopt"(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, ptr addrspace(1) null, i32 0, ptr null, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, double 0.00e+00, i32 0, ptr null, i32 0, ptr addrspace(1) null, i32 0, ptr addrspace(1) null, i32 0, ptr addrspace(1) null, i32 0, ptr addrspace(1) null), "gc-live"(ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) null, ptr addrspace(1) undef, ptr addrspace(1) null) ]
  %call2 = call coldcc ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token %call, i32 0, i32 5) ; (null, undef)
  br label %bb3

bb3:                                              ; preds = %bb1, %bb
  %phi = phi ptr addrspace(1) [ null, %bb ], [ %call2, %bb1 ]
  store atomic i32 0, ptr addrspace(1) %phi unordered, align 4
  ret void
}

declare token @llvm.experimental.gc.statepoint.p0(i64 immarg, i32 immarg, ptr, i32 immarg, i32 immarg, ...)

declare ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token, i32 immarg, i32 immarg)
```
[llvm-bugs] [Bug 80298] Inefficient codegen for loop with known small trip count
Issue 80298 Summary Inefficient codegen for loop with known small trip count Labels Assignees Reporter dzaima

The code

```c
#include <stdint.h>
void f(char* a, uint8_t l) {
  for (int i = 0; i < l; i++) a[i]++;
}
```

compiled with `-O3 -march=haswell` (or `-O3 -mavx2` or similar; [compiler explorer](https://godbolt.org/z/bnfe5j3oE)), generates unrolled `xmm` operations, even though `ymm` ones can trivially replace them.

The same thing happens with an assumption on the length:

```c
void f(char* a, int l) {
  __builtin_assume(l < 256);
  for (int i = 0; i < l; i++) a[i]++;
}
```
[llvm-bugs] [Bug 80301] Loop Fusion crashes because control flow equivalences are broken after fusion
Issue 80301 Summary Loop Fusion crashes because control flow equivalences are broken after fusion Labels new issue Assignees Reporter RouzbehPaktinat

We are trying to fuse loops in the 481.wrf benchmark from SPEC, but we get errors after fusing some loops. The CFG before doing any fusion is provided [here](https://github.com/llvm/llvm-project/assets/148104143/7db2824e-f10b-48ea-8ea7-49f5bd4a8946).

Initially, loops 704, 710, 751 and a few others are identified as fusion candidates. The branch conditions of blocks 702 and 749 are the same, so loop 704 dominates both 710 and 751, while loop 751 post-dominates 710 and 704. This makes them control flow equivalent and thus candidates to be fused together.

However, right after loops 704 and 710 are fused, we get [this](https://github.com/llvm/llvm-project/assets/148104143/f6f12c32-bbbd-44e9-8e48-95c7cb08015e). The problem here is that the resulting fused loop (704) is guarded by block 702 (before fusion, loops 704 and 710 are not guarded). This means that loop 751 would now have to post-dominate block 702 instead of 704 in order to be control flow equivalent to the fused loop, which it clearly does not. This makes the fusion candidate set invalid and results in an error.

We would appreciate your thoughts and any possible solution!
[llvm-bugs] [Bug 80314] [clang-format] Indentation not reset with lvalue ref-qualifier and `requires` clause
Issue 80314 Summary [clang-format] Indentation not reset with lvalue ref-qualifier and `requires` clause Labels clang-format Assignees Reporter mikezackles Using the defaults: ```c++ template struct Foo { int bar() & requires(N == I) { return 0; } int baz() { return 0; } }; ``` vs. ```c++ template struct Foo { int bar() requires(N == I) { return 0; } int baz() { return 0; } }; ``` (ref-qualifier removed) Things look worse with less trivial code. Arch Linux, version 16.0.6 (apologies if this is already fixed -- I don't have access to 17 at the moment) ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 80334] Teach __is_convertible extension about fixed point types
Issue 80334 Summary Teach __is_convertible extension about fixed point types Labels new issue Assignees PiJoules Reporter PiJoules `std::is_convertible` dispatches to this extension. Fixed point types are convertible to the other integral types and floating types according to [N1169](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf), so we should teach the extension about fixed point types.
[llvm-bugs] [Bug 80336] [lld][ELF] vma gets bumped with MEMORY command / TBSS sections
Issue 80336 Summary [lld][ELF] vma gets bumped with MEMORY command / TBSS sections Labels lld Assignees Reporter shankarke

The test below, when linked with lld, bumps the VMA by the size of the tbss section.

```
cat > 1.c << \!
__thread int a = 0;
int b = 10;
int data = ""
int foo() { return a + b; }
!
cat > script.t << \!
MEMORY {
  MYMEM (rwx) : ORIGIN = 0x1000, LENGTH = 0x8
}
SECTIONS {
  .foo : { *(.text.foo) } > MYMEM
  .tbss.a : { *(.tbss.a) } > MYMEM
  .bss.b : { *(.data.b) } > MYMEM
  .data : { *(.data*) } > MYMEM
}
!
CC=clang
CCOPTS="-target riscv32"
LD=ld.lld
$CC $CCOPTS -c 1.c -ffunction-sections -fdata-sections -G0 -fno-asynchronous-unwind-tables
$LD -m elf32lriscv 1.o -T script.t
```

Section dump:

```
Section Headers:
  [Nr] Name              Type             Addr Off    Size ES Flg Lk Inf Al
  [ 0]                   NULL                  00     00   00      0   0  0
  [ 1] .foo              PROGBITS         1000 001000 1e   00  AX  0   0  2
  [ 2] .text             PROGBITS         101e 00101e 00   00  AX  0   0  2
  [ 3] .tbss.a           NOBITS           1020 001020 04   00 WAT  0   0  4
  [ 4] .bss.b            PROGBITS         1024 001024 04   00  WA  0   0  4
  [ 5] .data             PROGBITS         1028 001028 04   00  WA  0   0  4
  [ 6] .comment          PROGBITS              00102c 38   01  MS  0   0  1
  [ 7] .riscv.attributes RISCV_ATTRIBUTES      001064 2b   00      0   0  1
  [ 8] .symtab           SYMTAB                001090 c0   10     10   8  4
  [ 9] .shstrtab         STRTAB                001150 56   00      0   0  1
  [10] .strtab           STRTAB                0011a6 30   00      0   0  1
```

Note that the VMA of .bss.b is bumped by the size of the .tbss.a section. This happens only when the MEMORY command is used.

GNU linker layout:

```
Section Headers:
  [Nr] Name              Type             Addr Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL                  00     00     00      0   0  0
  [ 1] .foo              PROGBITS         1000 001000 1c     00  AX  0   0  2
  [ 2] .tbss.a           NOBITS           101c 00101c 04     00 WAT  0   0  4
  [ 3] .bss.b            PROGBITS         101c 00101c 04     00  WA  0   0  4
  [ 4] .data             PROGBITS         1020 001020 04     00  WA  0   0  4
  [ 5] .comment          PROGBITS              001024 24     01  MS  0   0  1
  [ 6] .riscv.attributes RISCV_ATTRIBUTES      001048 2b     00      0   0  1
  [ 7] .symtab           SYMTAB                001074 000120 10      8  14  4
  [ 8] .strtab           STRTAB                001194 2e     00      0   0  1
  [ 9] .shstrtab         STRTAB                0011c2 50     00      0   0  1
```
[llvm-bugs] [Bug 80357] Error from inline asm for missing target feature despite enabling target feature when using LTO
Issue 80357 Summary Error from inline asm for missing target feature despite enabling target feature when using LTO Labels bug, backend:RISC-V, LTO Assignees Reporter ilovepi

When compiling a file for LTO/ThinLTO, we ran into an issue where an inline assembly directive for a compressed instruction gave an error, despite us setting the `march` string correctly. Note that this occurs when generating the bitcode output in the compile step, not during the link. We also don't see any such error in the non-LTO cases.

This still occurs at ToT (1d1432356e6) and back at least as far as c58bc24fcf678c55b0b. I haven't checked farther back in the commit history than Fuchsia's previous toolchain, but I think this is probably not a new behavior/bug.

I confirmed that `-target-feature +c` appears in the `cc1` command line.

I was able to reduce this down to just an `asm(c.ebreak)` statement. I've included the original file along with the reduced case, and a reproducer script with a `-cc1` invocation.

This code comes from Fuchsia and can be found here: https://cs.opensource.google/fuchsia/fuchsia/+/main:zircon/system/utest/inspector/print_debug_info.cc;l=131?q=print_debug_info.cc&ss=fuchsia%2Ffuchsia

Failing bot: https://ci.chromium.org/ui/p/turquoise/builders/global.ci/minimal.riscv64-lto/b8757351392559542993/overview

[reproducer.zip](https://github.com/llvm/llvm-project/files/14131777/reproducer.zip)

This seems related to https://github.com/llvm/llvm-project/issues/67698, but I'm not totally sure. There were also some recent changes that tried to plumb these through to the MC layer and, I thought, AsmParser, but I'm also unsure whether these are related.
[llvm-bugs] [Bug 80365] [Clang][C++17] static inline members are not initialized if they are part of a template even when instanced.
Issue 80365 Summary [Clang][C++17] static inline members are not initialized if they are part of a template even when instanced. Labels clang Assignees Reporter Dandielo In the example below, the `static inline` foo member behaves in unexpected ways on Clang, GCC and ICX. Unlike a non-templated type with a `static inline` member, the template-based type does not initialize its members, even when instantiated and even when side effects should be observed. Given how unexpected this behavior is, I would guess it is not intended. This behavior is not present in MSVC: all static inline members are initialized as expected once the template type is resolved to a specific type. ``` #include struct Foo { Foo() { printf("Foo()\n"); } }; template struct Bar { //! Uncomment to make MaybeInitializing work //! Forces 'foo' to be initialized // Bar() { (void)foo; } //! Empty ctor does not help with foo to be initialized. // Bar() { } static inline Foo const foo{ }; }; struct NotInitializing : Bar { // Does not initialize Bar::foo unless instanced and Bar() ctor with 'foo' is available. }; struct MaybeInitializing : Bar { static inline Bar does_work_with_ctor_in_bar; }; struct Works { //! Uncomment to see properly initialized member // static inline Foo foo; }; int main() { //! Instancing both types does not initialize the static inline members //! coming from the template base type. Unless 'Bar' constructor is defined and accesses foo. // NotInitializing not_initializing; // MaybeInitializing maybe_initializing; //! Directly accessing 'foo' initializes it. 
// (void)NotInitializing::foo; // (void)MaybeInitializing::foo; return 0; } ``` [GodBolt Version](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEIJJcpAAOqAqETgwe3r7%2ByanpAiFhkSwxcQm2mPaOAkIETMQEWT5%2BAXaYDhl1DQRFEdGx8YkK9Y3NOW2jvaH9pYMJAJS2qF7EyOwc5gDMocjeWADUJltuyCP4qMfYJhoAgje3I8ReDgcAYqiXdyYA7FZ3B0B70%2BEAWDyBRz%2B4IhQKSxFCBCoEHMZg%2BqFBJgArG4GCiwVt/rcIb8ACIPUnHQkPAiYFhJAw045uAgATySjFYmAOABUrg8ni8CAcLA1yVCAUCAPSSsBgA5yBhoFhsQQHIgHFhMADWXIAskwWTEAJIMdJiPAAL1CwAOAHcSFroYDpbKgR81pgFAcwBx%2BKgfWrUAcYgdQmbaJbMOgnQdpcKGqDIRYDhAAG6oPDoBZ%2BymQsnfCXOmVy7B01kHBwkA7oVCeg4MVBChBVJJ2wgIA5%2BwPBrlhxzmi1RgB0MbjIuIid%2ByYpBaJQJGTEcyFDDAjYWBQbQDBGnc%2BU7zlLFZIJ5LuAte4UbJvDlutBxA8eITMvBGv/YjVqMfO%2B4rnRYOJK1l6DZCn2eADly47PleprvreX5bNgIAgF2Xirp6XqhAuipRgchjoI%2BiaVsQbYEB2Pp%2BgGeBekwqZMHQTBRPQI4/seVJngQzyvPqhqYG%2B4EfneD5QScPHGrBAnwcA373L%2BEILkuK5rpBoqiQa4k3p%2B0mIdWQEAPr2sQWoGe2enEXpoR6VEqlUj8bGno8nGCgcADqDoKGKhIQi6coKkqKpCuqShcnCqDssQtAsiumm4WwcxPoWsaSgcCl4MuoTKRuu6XCerGHrOCIavRDAYj%2BXlSsWBwmthyB3lEjYdqy7JejWdYgdFcGDmqzYpfUikZTMGq0mUHmJT5QL%2BXeVDEKgLDdVyNJ0gyXLWUoapspgQ7yuhChej644Blu55ECR1HVpgNBhAR%2BF4cg6y7XWfosX%2BSUHC%2B/EDneIEWRJn0IeV/5iXxv2CUYRW8T9mnWvl9xjZVJJ4MQHQEJFt33WkYMUZ8VEg5GmEEM93nJWmGZZu9uNachOa5S9cYk5mCxAx9oOgChe40xCSMEKsDAHBoMMzrcHBLLQnCYrwfgcFopCoJwbjWNYKUrB6RxmFsPCkAQmjC0sWogJiWxDj8GhcFsABskgABySBoGiSDbFv6Jwki8CwEi26QkvS7LHC8AoIAaJr2tLHAsAwIgKCzUkdCxOQlBKtH9BxKmyBJEkemplwACcenLSMekNCwjs0LQNLEP7EBRNrpBRKEDQspwGu18wxAsgA8lE2jI43vD%2BYwBBt6uDdS7wWBRF4wBuGItD%2B9wo%2B0oYwDiCPpD4EjnSpp61eYKoHReDSPfkIIVTVxGUTEPXHhYNXnF4G7c9LFQBjAAoABqeCYLabfspLGv8IIIgxDsCkDIQQigVDqBXroAIBgjAoAVpYfQeAoj%2B0gEsMKNRtycElOceBlhrBcB%2BLGNuZheCoE3sQeEWBUGgkqNUDILgGDuE8C0PQwQZglDKHoFIaRMHjD8IkHhBQGB9E4YMYYVRkZdCmPwvQ7ROi1CmKIgYcRhgyJYTkNRPRlHxQkEsBQyt1h6KdhwcWntq4%2BzwsQIukgNQKBTgcTOWchx5yFBAXAhAqzbC4AsXgWsR4LF1vESQQ57Y/CzpiMwPwthZyzlsGJZhMQmJdqQN2WwNBDi
2JIKJCQ4mYitlsS2PwAhezIZwP2Acg4BNIKHCOCcY5kAoBAepScUCwOAHpJyDAtR8DoGXCuVcV7N3rofYZrcO5dwcIfPughB6RWrmPCeU9aAz0PlgTURhl7SzXlIzes9pY7z3gfOeR8aSixXmfC%2Brcr4bGlrfe%2BPBH7Pzfh/L%2BP9D7/2EKIcQIDPngLUNXXQiR2l4KsIgs%2BND0FJEwbPAAtOcY4JJTD4MsIQg4sK25bDIRQqhW80F0Kkc4CArhZGJHYcUFR3D8h8I0QIvIvCMg6K4RI%2BhiieikoJQohg3RGhMvEbYdR2Q6ULl5RwylPjlirCMRK85ZjSky04AcVQlszawothWdpKYulagWCmDxJ1Vbq18VUrQgTSDNiYFgOItDzkpLSVwI2%2BSNBbDMIky2GgLaYi4GbcxK8fYVMDv401NTw4QCQC02OTSI1xGAKbAIJd%2BmUEGdLMZw8m513GZ3buJyZkDyHgszA49J7T1nhrdZi8tmj0RrsreK9DnIH3hsDWCIT4XOQVclkNyb7wgecLPgzz36f2/owD5shAE/OkH8pQAKoEgC2PoReoLrBIJQfAKFMLODwoINGLYSKEEWDRRirFMscWZjxbQ%2BRmDGHMKFWwphfLVH0uERyoRmCH1yMkVynlTRaUftZdypRYrdFaLGL%2BkD0wKXAf0YY4BIsxYSwsYq5Vqr1XADugcU2Q4zB6vwAa7xxqg063NZgS1gwbXO1dnOn4Q4s7OtdZid1nrvXet9d7cpthKmEbNXrSQZssNZxVVsL1mJYmYhE0k85x75X%2BpNUR85pDWNlN9rJs1FC0jOEkEAA) ___ llvm-bugs mailing list llvm-bugs@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
[llvm-bugs] [Bug 80373] LLVM 15.0.7: If operand to CreateLogicalAnd was created by CreateAtomicCmpXchg, the evaluation order is wrong.
Issue: 80373
Summary: LLVM 15.0.7: If operand to CreateLogicalAnd was created by CreateAtomicCmpXchg, the evaluation order is wrong.
Labels: new issue
Assignees:
Reporter: Bob64375

Both of these statements generate code that evaluates CreateAtomicCmpXchg first (we use the second i1 result of CreateAtomicCmpXchg). This makes it a bit awkward to short-circuit the evaluation of CreateAtomicCmpXchg, because the evaluation is never left-to-right.

    if ( x!=0 && CreateAtomicCmpXchg )

or, swapping the operands to CreateLogicalAnd,

    if ( CreateAtomicCmpXchg && x!=0 )

    09710063 mov edi,dword ptr [ebp+8]
    09710066 mov edx,dword ptr [edi+28h]
    09710069 mov ecx,dword ptr [edi+2Ch]
    0971006C mov bl,1
    0971006E xor eax,eax
    09710070 lock cmpxchg byte ptr [edx],bl <--- CreateAtomicCmpXchg
    09710074 fldz
    09710076 fldz
    09710078 jne 097100CE
    0971007A fstp st(1)
    0971007C fstp st(0)
    0971007E test ecx,ecx <-- x!=0
    09710080 fldz

It would be nice if the evaluation were in the order of the operands given to CreateLogicalAnd. Sorry if it's already been fixed, but thanks!
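For context, and as far as I understand the IRBuilder API, `CreateLogicalAnd(A, B)` only emits `select i1 A, i1 B, i1 false` over two values that already exist. An instruction with side effects such as `cmpxchg` therefore runs wherever it was inserted, regardless of operand order. Skipping it when `x == 0` needs an explicit conditional branch, which is what a C front end emits for `&&`. The sketch below shows the short-circuit behaviour the report seems to expect, using C11 atomics; `try_lock_if_nonzero` and `short_circuit_demo` are hypothetical names chosen for illustration, not anything from the reporter's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: try to take a byte-sized lock, but only when x != 0.
 * The && operator guarantees the compare-exchange is skipped when x == 0. */
bool try_lock_if_nonzero(int x, atomic_uchar *lock)
{
    unsigned char expected = 0;
    return x != 0 && atomic_compare_exchange_strong(lock, &expected, 1);
}

/* Returns 1 after checking the three interesting cases. */
int short_circuit_demo(void)
{
    atomic_uchar lock = 0;

    /* x == 0: the exchange must not run, so the lock stays free. */
    bool skipped = try_lock_if_nonzero(0, &lock);
    assert(!skipped && atomic_load(&lock) == 0);

    /* x != 0 and the lock is free: the exchange runs and succeeds. */
    bool taken = try_lock_if_nonzero(1, &lock);
    assert(taken && atomic_load(&lock) == 1);

    /* x != 0 but the lock is already held: the exchange runs and fails. */
    bool again = try_lock_if_nonzero(1, &lock);
    assert(!again && atomic_load(&lock) == 1);

    return 1;
}
```

Calling `short_circuit_demo()` from a `main` runs the checks. When building IR by hand, the equivalent is to emit the `x != 0` test, branch on it, and create the `cmpxchg` only in the taken block.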
[llvm-bugs] [Bug 80374] Include Regrouping May Break If Macros Inserted Between Groups
Issue: 80374
Summary: Include Regrouping May Break If Macros Inserted Between Groups
Labels: new issue
Assignees:
Reporter: GreenYun

I have configured clang-format:

```yaml
IncludeBlocks: Regroup
IncludeCategories:
  - Regex: '^["<]config.h[">]$'
    Priority: -1
    CaseSensitive: true
  - Regex: '^$'
    Priority: 2
    CaseSensitive: false
  - Regex: '^<.*'
    Priority: 2
    CaseSensitive: false
  - Regex: '.*'
    Priority: 10
    CaseSensitive: false
```

And my C code `a.c`:

```c
#include "config.h"
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include "a.h"
#include
#include
#include "b.h"
```

However, clang-format (version 17.0.6) may not treat `a.h` as the main header and instead matches `a.h` against the regex `'.*'`:

```c
#include "config.h"
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include
#include
#include "a.h"
#include "b.h"
```

Meanwhile, if I remove the `#ifndef` block, everything works fine.
[llvm-bugs] [Bug 80385] [tool][objdump] llvm-objdump in aarch32 disassembly without __stack_chk_fail while applying -fstack-protector
Issue: 80385
Summary: [tool][objdump] llvm-objdump in aarch32 disassembly without __stack_chk_fail while applying -fstack-protector
Labels: new issue
Assignees:
Reporter: Zhenhang1213

    #include
    #include
    #include
    int func()
    {
        char c = 'b';
        char str1[8];
        //set
        memset(str1, c, 8);
        str1[7] = '\0';
        //get
        printf("str1=%s\n", str1);
        return 0;
    }

    int main()
    {
        func();
        return 0;
    }

On a 32-bit architecture, when I compile this code into an object file with -fstack-protector and disassemble it with llvm-objdump, the generated assembly does not contain the call to __stack_chk_fail. This does not happen on aarch64.
[llvm-bugs] [Bug 80388] [AArch64][ISel] Better instruction select when load/store?
Issue: 80388
Summary: [AArch64][ISel] Better instruction select when load/store?
Labels: backend:AArch64, llvm:codegen, llvm:globalisel, llvm:SelectionDAG
Assignees:
Reporter: hstk30-hw

https://godbolt.org/z/4fGa3xd7o

```
#include
#include
#include

typedef uint32_t u32;

void *copy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    uint32_t w, x;

    if ((uintptr_t)d % 4 == 0) {
        for (; n>=16; s+=16, d+=16, n-=16) {
            *(u32 *)(d+0) = *(u32 *)(s+0);
            *(u32 *)(d+4) = *(u32 *)(s+4);
            *(u32 *)(d+8) = *(u32 *)(s+8);
            *(u32 *)(d+12) = *(u32 *)(s+12);
        }
        return dest;
    }
    return dest;
}
```

In this case, GCC loads 8 + 8 bytes per loop iteration, but Clang loads 4 + 8 + 4.
[llvm-bugs] [Bug 80390] Merge "[ELF] Fix compareSections assertion failure when OutputDescs in sectionCommands are non-contiguous" into release/18.x
Issue: 80390
Summary: Merge "[ELF] Fix compareSections assertion failure when OutputDescs in sectionCommands are non-contiguous" into release/18.x
Labels: release:backport
Assignees:
Reporter: MaskRay

/cherry-pick dee8786f70a3d62b639113343fa36ef55bdbad63
[llvm-bugs] [Bug 80392] [RISCV] Missing opportunities to optimize RVV instructions
Issue: 80392
Summary: [RISCV] Missing opportunities to optimize RVV instructions
Labels:
Assignees:
Reporter: wangpc-pp

At the SelectionDAG level, we have several code paths to generate RVV pseudos:

1. RVV intrinsics -> RVV pseudos.
2. ISD nodes -> RVV pseudos.
3. RISCVISD nodes -> RVV pseudos.
4. RVV intrinsics -> RISCVISD nodes -> RVV pseudos.
5. ISD nodes -> RISCVISD nodes -> RVV pseudos.
6. etc.

Most of the optimizations for RVV are based on RISCVISD nodes, so we may miss opportunities to optimize code that takes one of the other paths. For example (https://godbolt.org/z/f1jWEfhG7):

```c
vuint8m1_t dup(uint8_t* data) {
  return __riscv_vmv_v_x_u8m1(*data, __riscv_vsetvlmax_e8m1());
}

vuint8m1_t dup2(uint8_t* data) {
  return __riscv_vlse8_v_u8m1(data, 0, __riscv_vsetvlmax_e8m1());
}
```

```asm
dup:
        vsetvli a1, zero, e8, m1, ta, ma
        vlse8.v v8, (a0), zero
        ret
dup2:
        vsetvli a1, zero, e8, m1, ta, ma
        vlse8.v v8, (a0), zero
        ret
```

These two functions compile to the same assembly because we lower the `vmv.v.x` intrinsic to `RISCVISD::VMV_V_X` first, and can then optimize it to a zero-stride load if profitable. But this is not the case in general:

```c
vuint16m2_t vadd(vuint16m2_t a, vuint8m1_t b) {
  int vl = __riscv_vsetvlmax_e8m1();
  vuint16m2_t c = __riscv_vzext_vf2_u16m2(b, vl);
  return __riscv_vadd_vv_u16m2(a, c, vl);
}

vuint16m2_t vwaddu(vuint16m2_t a, vuint8m1_t b) {
  return __riscv_vwaddu_wv_u16m2(a, b, __riscv_vsetvlmax_e16m2());
}
```

```asm
vadd:
        vsetvli a0, zero, e16, m2, ta, ma
        vzext.vf2 v12, v10
        vadd.vv v8, v8, v12
        ret
vwaddu:
        vsetvli a0, zero, e8, m1, ta, ma
        vwaddu.wv v8, v8, v10
        ret
```

We can't optimize `vzext.vf2+vadd.vv` to `vwaddu.wv`, because we lower these intrinsics to RVV pseudos directly.
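To make the missed fold concrete, here is a scalar model in plain C of what the two instruction sequences compute per element, assuming the usual RVV semantics where `vwaddu.wv` adds a zero-extended narrow operand to a wide operand and wraps modulo 2^16. The function names `zext_then_add`, `widening_add_wv` and `models_agree` are made up for this sketch; it is not RVV intrinsic code and says nothing about how the backend would implement the combine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of vzext.vf2 followed by vadd.vv on 16-bit elements. */
void zext_then_add(uint16_t *dst, const uint16_t *a, const uint8_t *b, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        uint16_t widened = (uint16_t)b[i];     /* vzext.vf2: u8 -> u16 */
        dst[i] = (uint16_t)(a[i] + widened);   /* vadd.vv: wraps mod 2^16 */
    }
}

/* Scalar model of vwaddu.wv: wide operand plus zero-extended narrow operand. */
void widening_add_wv(uint16_t *dst, const uint16_t *a, const uint8_t *b, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = (uint16_t)(a[i] + (uint16_t)b[i]);
}

/* Returns 1 when both models agree on a few inputs, including wrap-around. */
int models_agree(void)
{
    const uint16_t a[4] = {0, 1, 0xFFFF, 0x1234};
    const uint8_t b[4] = {0, 0xFF, 1, 0x80};
    uint16_t r1[4], r2[4];

    zext_then_add(r1, a, b, 4);
    widening_add_wv(r2, a, b, 4);
    for (size_t i = 0; i < 4; i++)
        assert(r1[i] == r2[i]);
    return 1;
}
```

Both models produce identical results element by element, which is why the two-instruction form is a candidate for being rewritten into the single widening add.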
Of course, the `ISD->RVV pseudos` path has the same problem:

```c
typedef vuint8m1_t v16xi8 __attribute__((riscv_rvv_vector_bits(__riscv_v_fixed_vlen)));
typedef vuint16m2_t v16xi32 __attribute__((riscv_rvv_vector_bits(__riscv_v_fixed_vlen * 2)));

v16xi32 add(v16xi32 a, v16xi8 b) {
  v16xi32 c = __riscv_vzext_vf2_u16m2(b, 16);
  return a + c;
}
```

```asm
add:
        vsetivli zero, 16, e16, m2, ta, ma
        vzext.vf2 v12, v10
        vadd.vv v8, v12, v8
        ret
```

I think we need a universal representation (RISCVISD?) on which to do optimizations. But once GISel is supported, we may need to do all the optimizations on GIR again? Or should we move all optimizations to later MIR passes?