Alan Cox wrote:
On Sul, 2005-09-04 at 11:01 +1000, Nick Piggin wrote:
I would be surprised if it was a big loss... but I'm assuming
a locked cmpxchg isn't outlandishly expensive. Basically:
read_lock_irqsave(cacheline1);
atomic_inc_return(cacheline2);
read_unlock_irqrestore(cacheline1);
Turns into
atomic_cmpxchg();
I'll do some microbenchmarks and get back to you. I'm quite
interested now ;) What sort of AMDs did you have in mind,
Athlon or higher give very different atomic numbers to P4. If you are
losing the read_lock/unlock then the atomic_cmpxchg should be faster on
all I agree.
Phew! I'll test them anyway, however.
One question however - atomic_foo operations are not store barriers so
you might need mb() and friends for PPC ?
Dave's documented that nicely in Documentation/atomic_ops.txt
In general, atomic ops that do not return a value are not barriers,
while operations that do return a value are.
So I think we can define the atomic_cmpxchg as providing a barrier.
Thanks,
Nick
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/