On Fri, Aug 22, 2014 at 01:58:21PM -0700, Jarno Rajahalme wrote: > The atomics here do not synchronize the state of any other variables, > so we can use relaxed atomics. > > cfm_should_process_flow() is rearranged to set the megaflow mask bits > only if necessary, and to avoid the atomic operation on non-BFD > packets. > > Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com> > + /* Most packets are not CFM. */ > + if (OVS_LIKELY(ntohs(flow->dl_type) != ETH_TYPE_CFM)) { > + return false; > + }
The common recommendation is to htons the constant instead of ntohsing the variable, to generate better code. Out of curiosity, I decided to find out whether this is still needed with modern compilers, so I built the following with GCC and Clang: #include <netinet/in.h> #include <stdbool.h> bool is_ip(unsigned short int x) { return ntohs(x) == 0x0800; } bool is_ip2(unsigned short int x) { return x == htons(0x0800); } With GCC 4.7.2: 00000000 <is_ip>: 0: 0f b7 44 24 04 movzwl 0x4(%esp),%eax 5: 66 c1 c8 08 ror $0x8,%ax 9: 66 3d 00 08 cmp $0x800,%ax d: 0f 94 c0 sete %al 10: c3 ret 00000020 <is_ip2>: 20: 66 83 7c 24 04 08 cmpw $0x8,0x4(%esp) 26: 0f 94 c0 sete %al 29: c3 ret With Clang 3.5.0: 00000000 <is_ip>: 0: 66 8b 44 24 04 mov 0x4(%esp),%ax 5: 66 c1 c0 08 rol $0x8,%ax 9: 0f b7 c0 movzwl %ax,%eax c: 3d 00 08 00 00 cmp $0x800,%eax 11: 0f 94 c0 sete %al 14: 0f b6 c0 movzbl %al,%eax 17: c3 ret 00000020 <is_ip2>: 20: 0f b7 44 24 04 movzwl 0x4(%esp),%eax 25: 83 f8 08 cmp $0x8,%eax 28: 0f 94 c0 sete %al 2b: 0f b6 c0 movzbl %al,%eax 2e: c3 ret so it's still worthwhile. (I was surprised to see Clang generate worse code overall. It seems to regard a bool as a 32-bit value, at least for the purpose of returning from a function.) > + atomic_read_relaxed(&cfm->check_tnl_key, &check_tnl_key); > + > + /* Atomic read is always a memory access, do some work > + * before we need the result. */ I believe an atomic_read_relaxed() is just a compiler barrier. That means that unless cfm_should_process_flow() is getting inlined somewhere (I doubt it--it has no callers within cfm.c), there isn't any way that the memory access could be optimized out anyway. So, this optimization and comment seems a little silly to me. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev