On Tue, Mar 07, 2006 at 01:04:36PM +1100, Nick Piggin wrote:
> I'd say it will turn out to be more trouble than its worth, for the
> miserly cost avoiding one atomic_inc, and one atomic_dec_and_test on
> page-local data that will be in L1 cache. I'd never turn my nose up
> at anyone just having a go though :)

The cost is anything but miserly. Every lock instruction is a memory
barrier: an OoO CPU with lots of instructions in flight has to ramp down
to just 1 for the time it takes that instruction to execute. That
synchronization is what makes the atomic expensive, even when the data
is already sitting in L1. In the case of netperf, I ended up with a
2.5Gbit/s (~30%) performance improvement through nothing but
microoptimizations. There is method to my madness. ;-)

		-ben
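
For anyone who wants to poke at the cost themselves, here is a rough userspace sketch of the comparison. It assumes C11 <stdatomic.h> as a stand-in for the kernel's atomic_inc() and atomic_dec_and_test(); struct fake_page, get_put_atomic(), get_put_plain() and cpu_seconds() are made-up names for illustration, not anything from the kernel or from the netperf work. On x86 the default (sequentially consistent) atomic read-modify-write operations compile to lock-prefixed instructions, which is the case being argued about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for a refcounted page; not the kernel's struct page. */
struct fake_page {
    atomic_int refs;          /* updated with atomic read-modify-write ops */
    volatile int plain_refs;  /* volatile only so the loop is not optimized away */
};

/* One get/put pair per iteration using atomics (roughly atomic_inc
 * followed by atomic_dec_and_test, with the test result ignored). */
void get_put_atomic(struct fake_page *p, long iters)
{
    for (long i = 0; i < iters; i++) {
        atomic_fetch_add(&p->refs, 1);
        (void)atomic_fetch_sub(&p->refs, 1);
    }
}

/* The same get/put pair with ordinary, non-atomic increments. */
void get_put_plain(struct fake_page *p, long iters)
{
    for (long i = 0; i < iters; i++) {
        p->plain_refs++;
        p->plain_refs--;
    }
}

/* CPU time, in seconds, spent running fn(p, iters). */
double cpu_seconds(void (*fn)(struct fake_page *, long),
                   struct fake_page *p, long iters)
{
    assert(iters > 0);
    clock_t start = clock();
    fn(p, iters);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

To use it, set up a struct fake_page with atomic_init(&p.refs, 1) and p.plain_refs = 1, then compare cpu_seconds(get_put_atomic, &p, n) against cpu_seconds(get_put_plain, &p, n) for a large n. The numbers depend heavily on the CPU and compiler flags, and a single-threaded loop like this says nothing about cache line bouncing between CPUs.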