------- Comment #3 from navin dot kumar at gmail dot com 2010-08-07 02:27 -------

The poor optimization does seem to stem from multiple-inheritance (and gcc trying to preserve nulls across casts). But it's not just upcast; even with downcasts slow assembly is generated. Take this example:

    Base2* fooA(Derived* x) {
        Base2& y = *x;
        return &y;
    }

    Base2* fooB(Derived* x) {
        Derived& x2 = *x;
        Base2& y = x2;
        return &y;
    }

Both fooA and fooB are functionally identical. Yet the assembly generated for fooA is:

    leaq    4(%rdi), %rdx
    xorl    %eax, %eax
    testq   %rdi, %rdi
    cmovne  %rdx, %rax
    ret

and the assembly generated for fooB is:

    leaq    4(%rdi), %rax
    ret

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45221
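
For anyone trying to reproduce this, here is a minimal self-contained sketch of the two conversion paths. The class definitions are an assumption: the comment above does not show them, and Base1, Base2 and Derived with one int each are only a guess consistent with the 4-byte offset in the assembly. The helper names via_pointer, via_reference and base2_offset are hypothetical. The point it illustrates is that a Derived* to Base2* pointer conversion has to map null to null, which is where the test/cmovne pair would come from, while a reference conversion can add the offset unconditionally.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class layout; the original comment does not show these definitions.
struct Base1 { int a; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { int c; };

// Pointer conversion: a null Derived* must become a null Base2*,
// so the pointer has to be tested before the offset is added.
Base2* via_pointer(Derived* x) { return x; }

// Reference conversion: a reference is assumed to refer to an object,
// so the offset can be added without a null test.
Base2* via_reference(Derived& x) {
    Base2& y = x;
    return &y;
}

// Byte distance from the start of a Derived object to its Base2 subobject.
std::ptrdiff_t base2_offset(Derived& d) {
    return reinterpret_cast<char*>(static_cast<Base2*>(&d)) -
           reinterpret_cast<char*>(&d);
}
```

For a non-null argument both helpers return the same address; they differ only in what the compiler must do for a null input. Compiling with -O2 -S and comparing the two functions should show whether a given gcc version still emits the extra test.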