https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69986
Bug ID: 69986
Summary: smaller code possible with -Os by using push/pop to spill/reload
Product: gcc
Version: 5.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: minor
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: peter at cordes dot ca
Target Milestone: ---
Target: x86-64-*-*

#include <unistd.h>
int f(int a) { close(a); return a; }

With gcc 5.3 -Os, this compiles to:

    push rbx
    mov  ebx,edi
    call 400490 <close@plt>
    mov  eax,ebx
    pop  rbx
    ret

It could be smaller:

    push rdi
    call 400490 <close@plt>
    pop  rax
    ret

saving 4 bytes (mov reg,reg is two bytes).

More generally, push/pop are 1 byte each, much smaller than a store like
mov [rsp-8], edi.

This might not be a desirable optimization, though, because a round-trip
through memory increases latency. It's one of those code-size optimizations
that might often have a negative impact on performance in the case where the
function is already hot in L1 I-cache.

It would be nice if there was a way to optimize a bit for code size without
making bad performance sacrifices, and also another option to optimize for
code size without much regard for performance. -Oss vs. -Os? Or -OS? I assume
tuning these options is a lot of work.
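
To make the byte counting explicit, here is a rough sketch in C that tallies the encoded length of each instruction in the two sequences. The struct and function names (insn, total_bytes, current_size, proposed_size) are made up for this illustration. The lengths assume the usual x86-64 encodings: 1 byte for push/pop of a register that needs no REX prefix, 2 bytes for a 32-bit mov reg,reg, 5 bytes for call rel32, and 1 byte for ret.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction and its encoded length in bytes (x86-64).
 * Illustrative type, not part of gcc or any library. */
struct insn {
    const char *text;
    size_t bytes;
};

/* What gcc 5.3 -Os emits for f(), per the report. */
static const struct insn current_code[] = {
    { "push rbx",     1 },  /* 53 */
    { "mov ebx, edi", 2 },  /* 89 fb */
    { "call close",   5 },  /* e8 + rel32 */
    { "mov eax, ebx", 2 },  /* 89 d8 */
    { "pop rbx",      1 },  /* 5b */
    { "ret",          1 },  /* c3 */
};

/* The push/pop spill-reload sequence suggested in the report. */
static const struct insn proposed_code[] = {
    { "push rdi",     1 },  /* 57 */
    { "call close",   5 },  /* e8 + rel32 */
    { "pop rax",      1 },  /* 58 */
    { "ret",          1 },  /* c3 */
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Sum the encoded lengths of a sequence of n instructions. */
static size_t total_bytes(const struct insn *seq, size_t n)
{
    size_t total = 0;
    assert(seq != NULL);
    for (size_t i = 0; i < n; i++)
        total += seq[i].bytes;
    return total;
}

size_t current_size(void)
{
    return total_bytes(current_code, COUNT(current_code));
}

size_t proposed_size(void)
{
    return total_bytes(proposed_code, COUNT(proposed_code));
}
```

Calling current_size() and proposed_size() from a small main gives 12 and 8 bytes, which matches the 4-byte saving claimed above. The same table approach shows why a store such as mov [rsp-8], edi is costlier: it encodes as 89 7c 24 f8, which is 4 bytes against 1 for a push.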