On Wed, 2006-08-09 at 15:48 +0200, Indan Zupancic wrote: > On Wed, August 9, 2006 14:54, Peter Zijlstra said: > > On Wed, 2006-08-09 at 14:02 +0200, Indan Zupancic wrote: > >> That avoids lots of checks and should guarantee that the > >> accounting is correct, except in the case when the IFF_MEMALLOC flag is > >> cleared and the counter is set to zero manually. Can't that be avoided and > >> just let it decrease to zero naturally? > > > > That would put the atomic op on the free path unconditionally, I think > > davem gets nightmares from that. > > I confused SOCK_MEMALLOC with sk_buff::memalloc, sorry. What I meant was > to unconditionally decrement the reserved usage only when memalloc is true > on the free path. That way all skbs that increased the reserve also decrease > it, and the counter should never go below zero.
OK, so far so good, except we loose the notion of getting memory back from regular skbs. > Also as far as I can see it should be possible to replace all atomic > "if (unlikely(dev_reserve_used(skb->dev)))" checks witha check if > memalloc is set. That should make davem happy, as there aren't any > atomic instructions left in hot paths. dev_reserve_used() uses atomic_read() which isn't actually a LOCK'ed instruction, so that should not matter. > If IFF_MEMALLOC is set new skbs set memalloc and increase the reserve. Not quite, if IFF_MEMALLOC is set new skbs _could_ get memalloc set. We only fall back to alloc_pages() if the regular path fails to alloc. If the skb is backed by a page (as opposed to kmem_cache fluff) sk_buff::memalloc is set. > When the skb is being destroyed it doesn't matter if IFF_MEMALLOC is set > or not, only if that skb used reserves and thus only the memalloc flag > needs to be checked. This means that changing the IFF_MEMALLOC doesn't > affect in-flight skbs but only newly created ones, and there's no need to > update in-flight skbs whenever the flag is changed as all should go well. Your reasoning is sound, except for these two point above... <snip code> > It seems that here SOCK_MEMALLOC is only set on the first socket. > Shouldn't it be set on all sockets instead? Ouch, where was my brain... Thanks! Also, I've been thinking (more pain), should I not up the reserve for each SOCK_MEMALLOC socket. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html