[Bug target/117438] x86's pass_align_tight_loops may cause performance regression in nested loops

pinskia at gcc dot gnu.org via Gcc-bugs Mon, 04 Nov 2024 11:41:43 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438


--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>this may cause significant performance regression of some nested loops.

I suspect this depends on the x86 microarchitecture.

What are you running the test on?

        .p2align 6
.L3:

I notice that GCC aligns only the inner loop, to a 64-byte boundary, while
clang/LLVM aligns each loop (both inner and outer) to a 16-byte boundary.
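
To put rough numbers on the difference between the two alignments, here is a
minimal sketch of the padding arithmetic behind a `.p2align N` directive
(alignment to 2^N bytes, with no maximum-skip argument). The helper name
`p2align_padding` and the sample addresses are made up for illustration; they
are not taken from GCC or from the test case in this bug.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of padding bytes needed so that `addr`
   lands on a 2^log2_align boundary, i.e. what `.p2align log2_align`
   asks the assembler for when no maximum skip is given. */
static size_t p2align_padding(uintptr_t addr, unsigned log2_align)
{
    uintptr_t align = (uintptr_t)1 << log2_align;   /* 6 -> 64, 4 -> 16 */
    uintptr_t misalignment = addr & (align - 1);    /* offset past the boundary */
    /* The final mask turns "align" into 0 for already-aligned addresses. */
    return (size_t)((align - misalignment) & (align - 1));
}
```

With these assumptions, a loop header that would otherwise start at 0x1001
needs 63 bytes of padding for `.p2align 6` but only 15 bytes for `.p2align 4`.
If the inner loop is entered by falling through from the outer loop body, that
padding sits on the outer loop's path, which may be part of why the effect
shows up in nested loops.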
