> On Sep 30, 2024, at 10:34, Yangyu Chen <chenyan...@isrc.iscas.ac.cn> wrote: >> >> >> On Sep 30, 2024, at 02:49, Jeff Law <jeffreya...@gmail.com> wrote: >> On 9/9/24 6:11 AM, Yangyu Chen wrote: >>> Currently, we lack support for TARGET_CAN_INLINE_P on the RISC-V >>> ISA. As a result, certain functions cannot be optimized with inlining >>> when specific options, such as __attribute__((target("arch=+v"))) . >>> This can lead to potential performance issues when building >>> retargetable binaries for RISC-V. >>> To address this, I have implemented the riscv_can_inline_p function. >>> This addition enables inlining when the callee either has no special >>> options or when the some options match, and also ensuring that the >>> callee's ISA is a subset of the caller's. I also check some other >>> options when there is no always_inline set. >>> gcc/ChangeLog: >>> * config/riscv/riscv.cc (riscv_can_inline_p): New function. >>> (TARGET_CAN_INLINE_P): Implement TARGET_CAN_INLINE_P. >> I'd kind of hoped not to mess with this as I suspect keeping this up-to-date >> is going to end up being somewhat painful and I doubt we're going to see >> that many cases where folks are using target attributes and where inlining >> is critical for performance. >> >> What's really driving your desire to change this? Is this showing up in >> real code as a performance issue? >> > > Yeah. I'm working on function multi-versioning support and found > that allowing the callee to be inlined in the caller will result > in performance gains. Without this patch, we will need to CALL the > function since we can't inline them when two functions have different > attributes. A simple evaluation of coremarks shows a 13% performance > degradation. >
Specially, we can reproduce the result on BananaPi-F3 Hardware: Use this GCC branch with my patch: https://github.com/cyyself/gcc/tree/rv_can_inline And compile the coremark on this branch: https://github.com/cyyself/coremark/tree/rva22_v_hotspot With command `make CC=riscv64-unknown-linux-gnu-gcc compile` With my patch, we will get the coremark scored `Iterations/Sec : 5992.917461`. But without this patch after `git reset HEAD^` and recompile the GCC and then coremark, we will get `Iterations/Sec : 5235.602094`, which is 12.6% slower. Thanks, Yangyu Chen >>> + >>> + if (caller_opts->x_riscv_tune_string >>> + && callee_opts->x_riscv_tune_string >>> + && strcmp (caller_opts->x_riscv_tune_string, >>> + callee_opts->x_riscv_tune_string) != 0) >>> + return false; >> Tuning shouldn't affect correctness of inlining. I'd just drop this clause >> if this patch goes forward. > > OK. I will fix it in the next revision. > >> Jeff >