https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081
--- Comment #13 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #10)
> (In reply to Hongtao Liu from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to H.J. Lu from comment #7)
> > > > Created attachment 60350 [details]
> > > > ira: Don't increase callee-saved register cost by 1000x
> > >
> > > NOTE, r15-1619-g3b9b8d6cfdf593 improved 500.perlbench_r on many different
> > > platforms, let me help verify the patch with SPEC2017.
> >
> > There're 5% regression on alderlake for 511.povray_r.
> > With the patch, there're more PUSH/POPs for callee saved registers.(Those
> > PUSH/POPs have been eliminated by r15-1619-g3b9b8d6cfdf593)
>
> We need testcases to show that.  Without them, we can't be sure that the
> improvement won't go away.

I think the testcase in PR111673 demonstrates it:

int f(int);

int advance(int dz)
{
    if (dz > 0)
        return (dz + dz) * dz;
    else
        return dz * f(dz);
}

Before r15-1619-g3b9b8d6cfdf593:

advance(int):
        push    rbx
        mov     ebx, edi
        test    edi, edi
        jle     .L2
        imul    ebx, edi
        lea     eax, [rbx+rbx]
        pop     rbx
        ret
.L2:
        call    f(int)
        imul    eax, ebx
        pop     rbx
        ret

After:

advance(int):
        test    edi, edi
        jle     .L2
        imul    edi, edi
        lea     eax, [rdi+rdi]
        ret
.L2:
        sub     rsp, 24
        mov     DWORD PTR [rsp+12], edi
        call    f(int)
        imul    eax, DWORD PTR [rsp+12]
        add     rsp, 24
        ret

Unlike the testcase in #c6 (which has a call in both the if and the else
branch), there is no call in the if branch here, so it is not optimal to push
rbx at function entry. The save can instead be sunk into the else branch (as
sub + mov). When jle .L2 is not taken, this saves one push instruction (and
the matching pop). That is why 511.povray_r is improved.
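
For anyone who wants to run the testcase rather than only read the assembly, here is a minimal self-contained sketch. The body of f() is purely an assumption made so the file links; the original testcase only declares it. The OPAQUE macro is also hypothetical: it uses GCC's noipa attribute (falling back to noinline elsewhere) so the compiler treats f() as an opaque call, which should keep advance() close to the declaration-only case.

```c
/* Keep f() opaque to interprocedural analysis so that advance() is
   compiled roughly as if f() were only declared.  The macro name is
   made up for this sketch. */
#if defined(__GNUC__) && !defined(__clang__)
#define OPAQUE __attribute__((noipa))
#else
#define OPAQUE __attribute__((noinline))
#endif

/* Hypothetical stand-in for the external f(int); any body would do. */
OPAQUE int f(int x)
{
    return x - 1;
}

int advance(int dz)
{
    if (dz > 0)
        return (dz + dz) * dz;  /* no call here: dz need not survive anything */
    else
        return dz * f(dz);      /* dz must be live across the call to f() */
}
```

Compiling this with something like gcc -O2 -S -masm=intel and comparing the output across the two revisions should show whether the save of dz sits at function entry or only on the path that calls f(); the exact instructions may differ from the listings above.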