https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
Hu Lin <lin1.hu at intel dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lin1.hu at intel dot com

--- Comment #3 from Hu Lin <lin1.hu at intel dot com> ---
I looked up the hotspot for this test. It is at int2a.F:570 (we output its .f
file, int2a.fppized.f). The source code is

566       DO 200 K = 1,MAX
567       MX = NX+KLX(K)
568       MY = NY+KLY(K)
569       MZ = NZ+KLZ(K)
570       N = N1+KLGT(K)
571   200 GHONDO(N) = ( XIN(MX )*YIN(MY )*ZIN(MZ ) +XIN(MX+625)*YIN(MY+625)*
572      +          ZIN(MZ+625) +XIN(MX+1250)*YIN(MY+1250)*ZIN(MZ+1250) )*D1*
573      +          DKL(K)+GHONDO(N)

At the beginning of this loop, the original ASM code is

    mov    0x271e3c98(,%rdx,4),%edi
    mov    0x271e401c(,%rdx,4),%esi
    mov    0x271e43a0(,%rdx,4),%ecx
    mov    0x271e3914(,%rdx,4),%r8d

But after r15-882-g1d6199e5f8c1c0, the array addresses are first loaded into
registers instead of being used directly as displacements, and the ASM code is

    mov    $0x27bf6c98, %r10d
    mov    $0x27bf701c, %r9d
    mov    $0x27bf73a0, %esi
    movl   (%rbx,%rdx,4), %ecx
    movl   (%r10,%rdx,4), %edi
    movl   (%r9,%rdx,4), %r8d
    movl   (%rsi,%rdx,4), %esi

Besides this loop, other places also have similar extra instructions. These
instructions increase the number of instructions retired by about the same
percentage as the regression.
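
For anyone who wants to experiment with the access pattern outside the benchmark, here is a rough C analogue of the loop above. It is only a sketch under assumptions: the arrays are assumed to have static storage (suggested by the absolute addresses in the original ASM), the sizes NTAB, NIN and STRIDE are made-up stand-ins for the real dimensions and the 625 offset, and fill_tables() is a hypothetical helper that just gives the arrays known values. It is not a reduced test case from the benchmark.

```c
/* Hypothetical sizes; STRIDE stands in for the 625 offset in the Fortran. */
enum { NTAB = 8, NIN = 16, STRIDE = 4 };

/* File-scope arrays, assumed to mirror statically allocated Fortran arrays. */
static int klx[NTAB], kly[NTAB], klz[NTAB], klgt[NTAB];
static double xin[NIN], yin[NIN], zin[NIN], dkl[NTAB], ghondo[NTAB];

/* Hypothetical helper: fill the tables with simple, in-range values. */
static void fill_tables(void)
{
    for (int k = 0; k < NTAB; k++) {
        klx[k] = kly[k] = klz[k] = k % STRIDE;  /* offsets 0..STRIDE-1 */
        klgt[k] = k;                            /* each k hits its own slot */
        dkl[k] = 1.0;
        ghondo[k] = 0.0;
    }
    for (int i = 0; i < NIN; i++)
        xin[i] = yin[i] = zin[i] = 1.0;
}

/* Same shape as DO 200: four index loads per iteration, then a gather. */
static void accumulate(int nx, int ny, int nz, int n1, int max, double d1)
{
    for (int k = 0; k < max; k++) {
        int mx = nx + klx[k];
        int my = ny + kly[k];
        int mz = nz + klz[k];
        int n  = n1 + klgt[k];
        ghondo[n] = (xin[mx] * yin[my] * zin[mz]
                     + xin[mx + STRIDE] * yin[my + STRIDE] * zin[mz + STRIDE]
                     + xin[mx + 2 * STRIDE] * yin[my + 2 * STRIDE]
                       * zin[mz + 2 * STRIDE]) * d1 * dkl[k]
                    + ghondo[n];
    }
}
```

The four integer loads from klx, kly, klz and klgt correspond to the four mov instructions shown above. Compiling such a file with -O2 -S before and after the revision and comparing how those loads are addressed might show the same difference, but I have not verified that this sketch reproduces it.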