On Tue, Jul 31, 2012 at 10:38:21AM -0700, Ethan Jackson wrote:
> How performance critical is this popcount implementation going to be?
> I assume you've put all this work into testing it because the
> classifier will be relying on it heavily?

Yes, I think popcount is going to be at least fairly common in the
classifier.  I haven't measured that yet, because I think there are
opportunities to avoid some of the calls.

> Why do you think the gcc builtin is slow? That's surprising to me.  Is
> it possible that in newer versions of gcc (i.e. 4.7 and later) would
> simply generate the assembly instruction?

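The GCC builtin is portable, so by default it can't assume that the
CPU has a popcount instruction and has to do the work in software.  I
guess it's the same code as popcount4, since they run at the same
speed.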

The assembly instruction isn't portable.  It isn't an architectural
instruction; that is, you can't rely on, say, anything newer than
Core 2 to have it.  There is a separate CPU feature bit for it that
you need to check before using it.  So my guess is that GCC will never
generate it, even in the future, without some kind of specific
compiler option that says "CPU has popcnt instruction".
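
To make the feature-bit check concrete, here is a minimal sketch,
assuming GCC or Clang on x86 with their <cpuid.h> header.  The function
name cpu_has_popcnt() is made up for illustration; the bit position
(CPUID leaf 1, ECX bit 23) is the documented POPCNT feature flag.

```c
#include <stdbool.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Returns true if the CPU we are running on advertises POPCNT.
 * Hypothetical helper, not part of any existing tree. */
static bool
cpu_has_popcnt(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 if the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    return (ecx & (1u << 23)) != 0;     /* CPUID.01H:ECX.POPCNT */
}
#else
/* Not x86, or no <cpuid.h>: assume the instruction is unavailable. */
static bool
cpu_has_popcnt(void)
{
    return false;
}
#endif
```

The result depends on the machine the binary runs on, which is the
whole point: the same build can report true on one host and false on
another.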

> If it's so performance critical, could we simply check for the
> assembly instruction in the configure script, and if it exists use it.
>  Of course, if it doesn't exist we would fall back to what you
> currently have.

Configure time wouldn't be good enough, because we need to know about
the machine we're going to run on, not the one that we're building
on.  We'd have to check at runtime instead.
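
A runtime check could look roughly like the sketch below.  It is only
an illustration under some assumptions: it relies on
__builtin_cpu_supports() and the "target" function attribute, which
only sufficiently recent GCC and Clang provide on x86, and the names
popcount_generic(), popcount_hw(), and popcount() are hypothetical.
The generic version is the usual bit-parallel count, not necessarily
the same code as popcount4.

```c
#include <stdint.h>

/* Portable fallback: bit-parallel population count. */
static unsigned int
popcount_generic(uint32_t x)
{
    x -= (x >> 1) & 0x55555555;
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0f0f0f0f;
    return (x * 0x01010101) >> 24;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Only this function is compiled with popcnt enabled, so the rest of
 * the program still runs on CPUs that lack the instruction. */
static unsigned int __attribute__((target("popcnt")))
popcount_hw(uint32_t x)
{
    return (unsigned int) __builtin_popcount(x);
}

static int
have_popcnt(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt") != 0;
}
#else
/* Other platforms: always use the generic code. */
static unsigned int
popcount_hw(uint32_t x)
{
    return popcount_generic(x);
}

static int
have_popcnt(void)
{
    return 0;
}
#endif

/* Chosen on first use, on the machine we are actually running on. */
static unsigned int (*popcount_impl)(uint32_t);

static unsigned int
popcount(uint32_t x)
{
    if (!popcount_impl) {
        popcount_impl = have_popcnt() ? popcount_hw : popcount_generic;
    }
    return popcount_impl(x);
}
```

The lazy initialization is not synchronized, but every thread would
store the same pointer, so the worst case is a redundant check.  The
indirect call has its own cost, so this would still need measuring
against the plain generic version.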
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
