http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45835

Summary: Consider push simm8;pop reg for -Os
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ja...@gcc.gnu.org
CC: u...@gcc.gnu.org, h...@gcc.gnu.org
Target: x86_64-linux

The snippets at http://embed.cs.utah.edu/embarrassing/jan_10/ suggest that for -Os (not sure whether only for -m32 or also for -m64) icc generates shorter sequences for loading signed 8-bit immediates into registers. movl $1, %eax is 5 bytes long, while pushl $1; popl %eax is 3 bytes long for -m32 (and similarly pushq $1; popq %rax for -m64). For r8..r15 the push/pop pair is 4 bytes, while movl is 6 bytes.

I'm not sure about the performance implications, or whether this should be controllable by some -m* switch. Users such as the Linux kernel want -Os primarily to improve performance, and if push/pop were significantly slower they might not appreciate it.
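
The byte counts quoted above can be checked by hand-assembling the two sequences. The following is only a rough sketch, not anything taken from GCC or icc: the helper names mov_imm32 and push_pop_imm8 are made up for illustration, and registers are identified by their hardware numbers (0 = eax/rax, 8 = r8). It assumes the standard encodings B8+rd imm32 for mov, 6A ib for push imm8, and 58+rd for pop, with a REX.B prefix (0x41) for r8..r15.

```python
def mov_imm32(reg: int, imm: int) -> bytes:
    """Encode `movl $imm, %r32` as B8+rd followed by a 32-bit immediate."""
    rex = b"\x41" if reg >= 8 else b""  # REX.B selects r8d..r15d
    opcode = bytes([0xB8 + (reg & 7)])
    return rex + opcode + (imm & 0xFFFFFFFF).to_bytes(4, "little")


def push_pop_imm8(reg: int, imm: int) -> bytes:
    """Encode `push $imm8; pop %reg` as 6A ib followed by 58+rd."""
    if not -128 <= imm <= 127:
        raise ValueError("immediate does not fit in a signed 8-bit field")
    rex = b"\x41" if reg >= 8 else b""  # REX.B selects r8..r15 for the pop
    return bytes([0x6A, imm & 0xFF]) + rex + bytes([0x58 + (reg & 7)])


if __name__ == "__main__":
    # Hypothetical register table: hardware number -> name, for display only.
    for reg, name in ((0, "eax/rax"), (8, "r8d/r8")):
        mov = mov_imm32(reg, 1)
        pp = push_pop_imm8(reg, 1)
        print(f"{name}: mov = {len(mov)} bytes ({mov.hex()}), "
              f"push/pop = {len(pp)} bytes ({pp.hex()})")
```

One caveat for -m64: pushq sign-extends the immediate to 64 bits and popq writes the whole register, whereas movl zero-extends into the upper half, so the two sequences leave the same register value only for non-negative immediates.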