On 23 November 2015 at 10:06, Ben Pfaff <b...@ovn.org> wrote:
> When Joe added these types I assumed that he used the unconventional
> prototypes for hton128() and ntoh128() because the return value
> convention was inefficient.  If GCC and Clang actually optimize the use
> of a return value in some kind of sensible way then I agree that the
> usual convention is nicer.
>
> Joe, did you have another reason?

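For readers following along, the two prototype styles under discussion can be sketched roughly as below. This is only an illustration: u128_sketch, be128_sketch, put_be64(), hton128_out() and hton128_ret() are hypothetical stand-ins, not the OVS definitions of ovs_u128, ovs_be128 or hton128(). Under the x86-64 System V ABI a 16-byte all-integer struct like these is normally passed in two registers and returned in RAX:RDX, which is why the by-value form changes the moves around the call site.

```c
#include <stdint.h>

/* Hypothetical stand-ins for ovs_u128 / ovs_be128; the real OVS types differ. */
typedef struct { uint64_t hi, lo; } u128_sketch;
typedef struct { uint8_t bytes[16]; } be128_sketch;

/* Store a 64-bit value most-significant byte first (network order). */
void put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t) (v >> (56 - 8 * i));
    }
}

/* Out-parameter style: the caller passes pointers to source and result. */
void hton128_out(const u128_sketch *src, be128_sketch *dst)
{
    put_be64(dst->bytes, src->hi);
    put_be64(dst->bytes + 8, src->lo);
}

/* Return-value style: argument and result are both passed by value. */
be128_sketch hton128_ret(u128_sketch src)
{
    be128_sketch dst;

    put_be64(dst.bytes, src.hi);
    put_be64(dst.bytes + 8, src.lo);
    return dst;
}
```

A caller would write either "hton128_out(&key, &value);" or "value = hton128_ret(key);". Both produce the same bytes, so any difference is in the generated code at the call site.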
This was mostly done based on an assumption that this was more
optimal, rather than actually digging into the compiled code and
seeing that it was generated differently.

Looking now, before this patch vs. after on my 64-bit system:

GCC-4.9: hton128/ntoh128 require one less MOV with this patch, but
calling conventions (in format_u128) require 3 extra MOV (+4 MOV, -1
LEA) for format_u128().

Clang-3.7: hton128/ntoh128 are roughly equivalent, although with this
patch they use some MOVUPS/MOVAPS instructions for 128-bit moves.
Calling conventions seem to require as much as 6-12 (!) extra MOVs
however, details below.

Clang-3.7 in format_u128(), before:

    if (verbose || (mask && !ovs_u128_is_zero(mask))) {
    e99b:       f6 45 e7 01             testb  $0x1,-0x19(%rbp)
    e99f:       0f 85 1c 00 00 00       jne    e9c1 <format_u128+0x41>
    e9a5:       48 83 7d e8 00          cmpq   $0x0,-0x18(%rbp)
    e9aa:       0f 84 8d 00 00 00       je     ea3d <format_u128+0xbd>
    e9b0:       48 8b 7d e8             mov    -0x18(%rbp),%rdi
    e9b4:       e8 c7 c8 ff ff          callq  b280 <ovs_u128_is_zero>
    e9b9:       a8 01                   test   $0x1,%al
    e9bb:       0f 85 7c 00 00 00       jne    ea3d <format_u128+0xbd>
    e9c1:       48 8d 75 d0             lea    -0x30(%rbp),%rsi
        ovs_be128 value;

        hton128(key, &value);
    e9c5:       48 8b 7d f0             mov    -0x10(%rbp),%rdi
    e9c9:       e8 82 00 00 00          callq  ea50 <hton128>
    e9ce:       b8 10 00 00 00          mov    $0x10,%eax
    e9d3:       89 c2                   mov    %eax,%edx
    e9d5:       48 8d 75 d0             lea    -0x30(%rbp),%rsi
        ds_put_hex(ds, &value, sizeof value);
    e9d9:       48 8b 7d f8             mov    -0x8(%rbp),%rdi
    e9dd:       e8 00 00 00 00          callq  e9e2 <format_u128+0x62>

Clang-3.7, after:

    if (verbose || (mask && !ovs_u128_is_zero(mask))) {
    e99b:       f6 45 e7 01             testb  $0x1,-0x19(%rbp)
    e99f:       0f 85 1c 00 00 00       jne    e9c1 <format_u128+0x41>
    e9a5:       48 83 7d e8 00          cmpq   $0x0,-0x18(%rbp)
    e9aa:       0f 84 d1 00 00 00       je     ea81 <format_u128+0x101>
    e9b0:       48 8b 7d e8             mov    -0x18(%rbp),%rdi
    e9b4:       e8 c7 c8 ff ff          callq  b280 <ovs_u128_is_zero>
    e9b9:       a8 01                   test   $0x1,%al
    e9bb:       0f 85 c0 00 00 00       jne    ea81 <format_u128+0x101>
        ovs_be128 value;

        value = hton128(*key);
    e9c1:       48 8b 45 f0             mov    -0x10(%rbp),%rax
    e9c5:       48 8b 38                mov    (%rax),%rdi
    e9c8:       48 8b 70 08             mov    0x8(%rax),%rsi
    e9cc:       e8 bf 00 00 00          callq  ea90 <hton128>
    e9d1:       b9 10 00 00 00          mov    $0x10,%ecx
    e9d6:       89 ce                   mov    %ecx,%esi
    e9d8:       48 8d 7d d0             lea    -0x30(%rbp),%rdi
    e9dc:       48 89 45 c0             mov    %rax,-0x40(%rbp)
    e9e0:       48 89 55 c8             mov    %rdx,-0x38(%rbp)
    e9e4:       48 8b 45 c0             mov    -0x40(%rbp),%rax
    e9e8:       48 89 45 d0             mov    %rax,-0x30(%rbp)
    e9ec:       48 8b 45 c8             mov    -0x38(%rbp),%rax
    e9f0:       48 89 45 d8             mov    %rax,-0x28(%rbp)
        ds_put_hex(ds, &value, sizeof value);
    e9f4:       48 8b 45 f8             mov    -0x8(%rbp),%rax
    e9f8:       48 89 7d a8             mov    %rdi,-0x58(%rbp)
    e9fc:       48 89 c7                mov    %rax,%rdi
    e9ff:       48 8b 45 a8             mov    -0x58(%rbp),%rax
    ea03:       48 89 75 a0             mov    %rsi,-0x60(%rbp)
    ea07:       48 89 c6                mov    %rax,%rsi
    ea0a:       48 8b 55 a0             mov    -0x60(%rbp),%rdx
    ea0e:       e8 00 00 00 00          callq  ea13 <format_u128+0x93>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev