On Friday 05 May 2006 10:49, Eric Dumazet wrote:
> On a dual Opteron box, I noticed high oprofile numbers in net/core/dst.c,
> function dst_destroy(struct dst_entry *dst).
>
> It appears the smp_rmb() done at the beginning of dst_destroy() is the
> killer (this is an lfence machine instruction, which apparently is doing
> a *lot* of things... maybe IO related...) and is responsible for 80%
> of the CPU time used by the whole function.
>
> I don't understand the whole variety of available barriers very well, or
> why this smp_rmb() is used in dst_destroy().
> I could not find the corresponding wmb that should be done somewhere in
> the dst code.
>
> Do we have an alternative to smp_rmb() in the dst_destroy()/kfree_skb()
> context?

Eliminating it probably wouldn't help very much: it just flushes the loads already in flight. If it didn't do that, the next smp_rmb() would.

I'm surprised there are that many, though. Normally kernel code is spaghetti enough that the CPU cannot speculate too many loads ahead.

But are you 100% sure the cost is not in the lock decl? That would make more sense. Perhaps profile for cache misses too and double check?

-Andi
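
For readers unfamiliar with the pairing the question refers to (a read barrier on one side is only meaningful if the writer has a matching write barrier), here is a minimal userspace sketch. It is not the dst code. The struct mailbox type and the publish() and consume() helpers are hypothetical names made up for illustration, and C11 release/acquire fences are used as the closest portable analogue of smp_wmb()/smp_rmb(); they are somewhat stronger than the kernel primitives. In real use, publish() and consume() would run on different CPUs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared record; not a kernel structure. */
struct mailbox {
    int payload;        /* plain data, written before the flag */
    atomic_int ready;   /* flag that publishes the payload */
};

/* Writer side: stands in for the code that would need smp_wmb(). */
static void publish(struct mailbox *m, int value)
{
    m->payload = value;
    /* Release fence: the payload store is ordered before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Reader side: stands in for the code that uses smp_rmb(). */
static bool consume(struct mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
        return false;   /* nothing published yet; *out is left untouched */
    /* Acquire fence: the payload load is ordered after the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *out = m->payload;
    return true;
}
```

Without the fence in consume(), the CPU would be free to satisfy the payload load before the flag load, so the reader could see the flag set and still read a stale payload. Without the fence in publish(), the reader's fence alone guarantees nothing, which is why a read barrier with no matching write barrier looks suspicious.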