On Tue, Mar 07, 2006 at 01:04:36PM +1100, Nick Piggin wrote:
> I'd say it will turn out to be more trouble than its worth, for the 
> miserly cost
> avoiding one atomic_inc, and one atomic_dec_and_test on page-local data 
> that will
> be in L1 cache. I'd never turn my nose up at anyone just having a go 
> though :)

The cost is anything but miserly.  Consider that every lock instruction is 
a memory barrier which takes your OoO CPU with lots of instructions in flight 
to ramp down to just 1 for the time it takes that instruction to execute.  
That synchronization is what makes the atomic expensive.

In the case of netperf, I ended up with a 2.5Gbit/s (~30%) performance 
improvement through nothing but microoptimizations.  There is method to 
my madness. ;-)

                -ben
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to