Pushed as-is, leaving Ben’s preference as an opportunity for future improvement.

  Jarno

On Dec 6, 2013, at 9:24 AM, Ben Pfaff <b...@nicira.com> wrote:

> On Thu, Dec 05, 2013 at 04:36:26PM -0800, Jarno Rajahalme wrote:
>> Inline, use another well-known algorithm for 64-bit builds, and use
>> builtins when they are known to be fast at compile time.  A 32-bit
>> version of the alternate algorithm is slower than the existing
>> implementation, so the old one is used for 32-bit builds.  Inline
>> assembler would be a bit faster on a 32-bit i7 build, but we use the GCC
>> builtin for portability.
>> 
>> It should be stressed that builds for specific CPUs do not work on other
>> CPUs, and that the OVS build system or runtime does not currently support
>> CPU detection.
>> 
>> Speed improvement vs. existing implementation / GCC 4.7
>> __builtin_popcountll():
>> 
>> i386:         64%  (inlining)                         / 380%
>> i386 on i7:   240% (inlining + builtin)               / 820%
>> x86_64:       59%  (inlining + different algorithm)   / 190%
>> x86_64 on i7: 370% (inlining + builtin)               / 0%
>> 
>> Signed-off-by: Jarno Rajahalme <jrajaha...@nicira.com>
> 
> Instead of defined(__corei7), I would write __POPCNT__, a GCC macro
> specific to popcnt instruction support.  I don't think that __corei7 is
> a good test because it is too specific: successors to Core i7 will
> almost certainly also have POPCNT.
> 

You are absolutely right about this. However, I’m running out of time for today 
and decided to push this now rather than wait.

> Acked-by: Ben Pfaff <b...@nicira.com>

Thanks!
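
For reference, Ben's suggestion above (test __POPCNT__ rather than
__corei7) could be sketched roughly as follows. This is only an
illustration, not the code that was pushed: the function name
count_1bits_sketch is hypothetical, and the fallback is the common
parallel bit-count, which is assumed here and may not be the "well-known
algorithm" the patch actually uses.

```c
#include <stdint.h>

/* Hypothetical helper, not the function in the OVS tree. */
static inline unsigned int
count_1bits_sketch(uint64_t x)
{
#if defined(__GNUC__) && defined(__POPCNT__)
    /* GCC defines __POPCNT__ when the build target has the POPCNT
     * instruction enabled (e.g. -mpopcnt or a -march= that implies it),
     * so the builtin is expected to be fast here. */
    return __builtin_popcountll(x);
#else
    /* Assumed portable fallback: parallel bit-count.  Sum bits in
     * 2-bit, then 4-bit, then 8-bit groups, then add up the bytes
     * with a multiply. */
    x -= (x >> 1) & UINT64_C(0x5555555555555555);
    x = (x & UINT64_C(0x3333333333333333))
        + ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
    return (unsigned int) ((x * UINT64_C(0x0101010101010101)) >> 56);
#endif
}
```

As the commit message notes, a binary built with POPCNT enabled will not
run on CPUs that lack the instruction, since the choice is made at
compile time and there is no runtime CPU detection.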
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
