http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51838
Bug #: 51838
Summary: Inefficient add of 128 bit quantity represented as
         64 bit tuple to 128 bit integer.
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: svfue...@gmail.com

void foo(__uint128_t *x, unsigned long long y, unsigned long long z)
{
	*x += y + ((__uint128_t) z << 64);
}

Compiles into:

	mov    %rdx,%r8
	mov    %rsi,%rax
	xor    %edx,%edx
	add    (%rdi),%rax
	mov    %rdi,%rcx
	adc    0x8(%rdi),%rdx
	xor    %esi,%esi
	add    %rsi,%rax
	adc    %r8,%rdx
	mov    %rax,(%rcx)
	mov    %rdx,0x8(%rcx)
	retq

The above can be optimized into:

	add    %rsi,(%rdi)
	adc    %rdx,0x8(%rdi)
	retq
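
For reference, the desired add/adc pair computes an ordinary two-limb addition with carry: the low 64 bits of *x gain y, and the carry out of that add propagates into the high 64 bits together with z. A minimal C sketch of that equivalence follows; the function name and the limb-array representation are illustrative, not part of the report:

	#include <stdint.h>

	/* Same update as foo(), written as an explicit two-limb add.
	   x[0] is the low 64 bits of the 128-bit value, x[1] the high 64 bits. */
	void foo_limbs(uint64_t x[2], uint64_t y, uint64_t z)
	{
		uint64_t lo = x[0] + y;
		uint64_t carry = lo < y;     /* unsigned wraparound means a carry out */
		x[0] = lo;                   /* what "add %rsi,(%rdi)" stores */
		x[1] += z + carry;           /* what "adc %rdx,0x8(%rdi)" stores */
	}

Since the carry flag set by the low-limb add is exactly what adc consumes, the whole update fits in the two memory-destination instructions shown above.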