https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118076
Bug ID: 118076
Summary: Missed Optimization: Inefficient Stack Usage in Creating and Passing Large Struct Argument
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jonathan.gruber.jg at gmail dot com
Target Milestone: ---

Created attachment 59885
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59885&action=edit
Minimal test case.

When creating a large struct and passing it as an argument, gcc makes inefficient use of the stack. A simple test case is in the attached file test.c. I observed this bug with native (non-cross-compiling) gcc as well as with the aarch64 and riscv64 cross-compilers (I did not check any other cross-compilers, though). Any one of the options -O2, -O3, -Os, or -Oz is enough to reproduce the bug.

Below is the test case from the attached test.c, for your convenience:

    struct S {
        void *x, *y, *z, *w;
    };

    extern int extern_func(struct S s);

    int fwd_func(void *x, void *y, void *z, void *w) {
        struct S s = { x, y, z, w };
        return extern_func(s);
    }

Below is the generated assembly for gcc with -O3:

    sub rsp,0x48
    mov QWORD PTR [rsp+0x20],rdi
    mov QWORD PTR [rsp+0x28],rsi
    movdqa xmm0,XMMWORD PTR [rsp+0x20]
    mov QWORD PTR [rsp+0x30],rdx
    mov QWORD PTR [rsp+0x38],rcx
    movups XMMWORD PTR [rsp],xmm0
    movdqa xmm0,XMMWORD PTR [rsp+0x30]
    movups XMMWORD PTR [rsp+0x10],xmm0
    call extern_func
    add rsp,0x48
    ret

Below is the generated assembly for aarch64-linux-gnu-gcc with -O3:

    stp x29, x30, [sp, #-80]!
    mov x29, sp
    stp x0, x1, [sp, #48]
    add x0, sp, #0x10
    stp x2, x3, [sp, #64]
    ldp q30, q31, [sp, #48]
    str q30, [sp, #16]
    str q31, [x0, #16]
    bl extern_func
    ldp x29, x30, [sp], #80
    ret

And below is the generated assembly for riscv64-linux-gnu-gcc with -O3:

    addi sp,sp,-80
    mv a5,a0
    mv a0,sp
    sd ra,72(sp)
    sd a5,0(sp)
    sd a1,8(sp)
    sd a2,16(sp)
    sd a3,24(sp)
    auipc ra,0x0
    jalr ra
    ld ra,72(sp)
    addi sp,sp,80
    ret

On x86_64 and aarch64, gcc needlessly stores the parameters of fwd_func on the stack and then copies them to a second stack location to build the struct argument for extern_func, so the parameters are stored on the stack twice rather than once. riscv64 apparently started down the same path, since it decreases sp by far more than necessary, but the redundant store instructions themselves seem to have been optimized away.

Host system type: Arch Linux, x86_64

gcc information:
Version: 14.2.1 20240910 (GCC)
Configured with: /build/gcc/src/gcc/configure --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++,rust --enable-bootstrap --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://gitlab.archlinux.org/archlinux/packaging/packages/gcc/-/issues --with-build-config=bootstrap-lto --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-werror

aarch64-linux-gnu-gcc information:
Version: 14.2.0
Configured with: /build/aarch64-linux-gnu-gcc/src/gcc-14.2.0/configure --prefix=/usr --program-prefix=aarch64-linux-gnu- --with-local-prefix=/usr/aarch64-linux-gnu --with-sysroot=/usr/aarch64-linux-gnu
--with-build-sysroot=/usr/aarch64-linux-gnu --with-native-system-header-dir=/include --libdir=/usr/lib --libexecdir=/usr/lib --target=aarch64-linux-gnu --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-nls --enable-default-pie --enable-languages=c,c++,fortran --enable-shared --enable-threads=posix --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --disable-multilib --disable-werror --enable-checking=release

riscv64-linux-gnu-gcc information:
Version: 14.2.0
Configured with: /build/riscv64-linux-gnu-gcc/src/gcc-14.2.0/configure --prefix=/usr --program-prefix=riscv64-linux-gnu- --with-local-prefix=/usr/riscv64-linux-gnu --with-sysroot=/usr/riscv64-linux-gnu --with-build-sysroot=/usr/riscv64-linux-gnu --libdir=/usr/lib --libexecdir=/usr/lib --target=riscv64-linux-gnu --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --with-system-zlib --with-isl --with-linker-hash-style=gnu --disable-nls --disable-libunwind-exceptions --disable-libstdcxx-pch --disable-libssp --disable-multilib --disable-werror --enable-languages=c,c++ --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --enable-gnu-indirect-function --enable-default-pie --enable-checking=release
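
For anyone who wants to check that the forwarding itself behaves correctly while looking at the codegen, here is a self-contained variant of the test case. It is only a sketch and not part of the attachment: the body of extern_func is a hypothetical stand-in (the real test case only declares it), and the noinline attribute is my assumption for keeping the call in place when both functions live in one file. Code generation may differ slightly from the separately compiled original.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    void *x, *y, *z, *w;
};

/* Hypothetical stand-in for the callee; the original test case only
 * declares extern_func. noinline keeps the by-value struct call alive
 * so that fwd_func still has to build the argument. */
__attribute__((noinline))
int extern_func(struct S s)
{
    /* Count the non-null members so the result depends on every field. */
    return (s.x != NULL) + (s.y != NULL) + (s.z != NULL) + (s.w != NULL);
}

/* Unchanged from test.c: build the struct and pass it by value. */
int fwd_func(void *x, void *y, void *z, void *w)
{
    struct S s = { x, y, z, w };
    return extern_func(s);
}
```

Calling fwd_func with four non-null pointers should return 4 with this stand-in. On x86_64, sizeof(struct S) is 32, so the outgoing argument needs 32 bytes of stack; the 0x48-byte frame in the listing above has room for two such copies plus 8 bytes of alignment padding, which matches the double store described in this report. To look at the output, something like gcc -O3 -c on the file followed by objdump -d -M intel on the object should give a listing in the same style as the x86_64 one above.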