Ben,

I forgot about that mail over the weekend, sorry. I just acked your patches.

  Jarno

On Nov 18, 2013, at 1:20 PM, Ben Pfaff <b...@nicira.com> wrote:

> I guess you missed this.  No matter, I sent out a couple of patches:
>         http://openvswitch.org/pipermail/dev/2013-November/034035.html
>         http://openvswitch.org/pipermail/dev/2013-November/034036.html
>
>
> On Fri, Nov 15, 2013 at 4:06 PM, Ben Pfaff <b...@nicira.com> wrote:
> On Fri, Nov 15, 2013 at 03:48:58PM -0800, Ben Pfaff wrote:
> > On Wed, Nov 13, 2013 at 01:32:42PM -0800, Jarno Rajahalme wrote:
> > > Add raw_ctz64(), ctz64(), and popcount64() using builtins when
> > > available.
> > >
> > > I'm not sure if the "UINT64_MAX == ~0UL" and "UINT64_MAX == ~0ULL"
> > > work in all cases as I imagine they would.
> >
> > I think you could use ULONG_MAX and ULLONG_MAX, any reason not to?
> >
> > > Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com>
>
> I had another thought.
>
> It is really convenient how rightmost_1bit() and zero_rightmost_1bit()
> work with any width integer.  It would be nice if we could just make
> ctz() and raw_ctz() do the same.
>
> I was able to make rightmost_1bit() and zero_rightmost_1bit() work
> that way because when I did some tests with GCC, I found that GCC was
> smart enough to compile the code with the cheapest instructions.  That
> is, if you pass in a uint32_t it didn't bother to do 64-bit
> arithmetic.
>
> I see that GCC isn't smart enough to do that with __builtin_ctzll(),
> but you can fake it with a small amount of manual effort.  That is,
> the following C file:
>
>     #include <stdint.h>
>
>     static inline int ctz(unsigned long long int x)
>     {
>         if (__builtin_constant_p(x <= UINT32_MAX) && x <= UINT32_MAX) {
>             return __builtin_ctz(x);
>         } else {
>             return __builtin_ctzll(x);
>         }
>     }
>
>     int ctzl(unsigned long int x)
>     {
>         return ctz(x);
>     }
>
>     int ctzll(unsigned long long int x)
>     {
>         return ctz(x);
>     }
>
> compiles to this on 32-bit:
>
>     00000000 <ctzl>:
>        0:   0f bc 44 24 04          bsf    0x4(%esp),%eax
>        5:   c3                      ret
>        6:   8d 76 00                lea    0x0(%esi),%esi
>        9:   8d bc 27 00 00 00 00    lea    0x0(%edi,%eiz,1),%edi
>
>     00000010 <ctzll>:
>       10:   83 ec 1c                sub    $0x1c,%esp
>       13:   8b 44 24 20             mov    0x20(%esp),%eax
>       17:   8b 54 24 24             mov    0x24(%esp),%edx
>       1b:   89 04 24                mov    %eax,(%esp)
>       1e:   89 54 24 04             mov    %edx,0x4(%esp)
>       22:   e8 fc ff ff ff          call   23 <ctzll+0x13>
>                             23: R_386_PC32  __ctzdi2
>       27:   83 c4 1c                add    $0x1c,%esp
>       2a:   c3                      ret
>
> and to this on 64-bit:
>
>     0000000000000000 <ctzl>:
>        0:   48 0f bc c7             bsf    %rdi,%rax
>        4:   c3                      retq
>        5:   66 66 2e 0f 1f 84 00    data32 nopw %cs:0x0(%rax,%rax,1)
>        c:   00 00 00 00
>
>     0000000000000010 <ctzll>:
>       10:   48 0f bc c7             bsf    %rdi,%rax
>       14:   c3                      retq
>
> which is just about perfect.
>
> What do you think?
>
> Thanks,
>
> Ben.
>
> --
> "I don't normally do acked-by's.  I think it's my way of avoiding
> getting blamed when it all blows up."               Andrew Morton
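
[Editorial sketch, not part of the original thread.] The width-dispatch idea quoted above can be tried in isolation with something like the following. It is only a minimal sketch, assuming GCC or Clang (for __builtin_constant_p(), __builtin_ctz() and __builtin_ctzll()). The names sketch_raw_ctz() and sketch_ctz_self_check() are hypothetical and do not come from the patches. As with the builtins themselves, the result is undefined for a zero argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, modeled on the ctz() in the quoted mail.
 * Counts trailing zero bits of 'x'.  Undefined for x == 0, like the
 * underlying builtins.
 *
 * After inlining, if the compiler can prove at compile time that 'x'
 * fits in 32 bits (for example because the caller passed a uint32_t),
 * the cheaper 32-bit builtin is used.  Otherwise the 64-bit builtin is
 * used, which is always correct. */
static inline int
sketch_raw_ctz(unsigned long long int x)
{
    if (__builtin_constant_p(x <= UINT32_MAX) && x <= UINT32_MAX) {
        return __builtin_ctz((unsigned int) x);
    } else {
        return __builtin_ctzll(x);
    }
}

/* Hypothetical self-check: both paths must agree on the bit index. */
void
sketch_ctz_self_check(void)
{
    uint32_t narrow = UINT32_C(0x80000000);      /* only bit 31 set */
    uint64_t wide = UINT64_C(1) << 40;           /* only bit 40 set */

    assert(sketch_raw_ctz(narrow) == 31);
    assert(sketch_raw_ctz(wide) == 40);
}
```

Which branch is taken depends on the optimization level and on what the compiler knows about the argument, so the two paths have to return the same answer for any value that fits in 32 bits. That holds here because both builtins count from bit 0.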
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev