Hi Eric,

On Fri, 2016-11-25 at 14:30 -0800, Eric Dumazet wrote:
> On Fri, 2016-11-25 at 16:39 +0100, Paolo Abeni wrote:
> > _skb_try_recv_datagram_batch dequeues multiple skb's from the
> > socket's receive queue, and runs the bulk_destructor callback under
> > the receive queue lock.
>
> ...
>
> > +	last = (struct sk_buff *)queue;
> > +	first = (struct sk_buff *)queue->next;
> > +	skb_queue_walk(queue, skb) {
> > +		last = skb;
> > +		totalsize += skb->truesize;
> > +		if (++datagrams == batch)
> > +			break;
> > +	}
>
> This is absolutely not good.
>
> Walking through a list, bringing 2 cache lines per skb, is not the
> proper way to deal with bulking.
>
> And I do not see where 'batch' value coming from user space is capped ?
>
> Is it really vlen argument coming from recvmmsg() system call ???
>
> This code runs with BH masked, so you do not want to give user a way to
> make you loop there 1000 times
>
> Bulking is nice, only if you do not compromise with system stability and
> latency requirements from other users/applications.
Thank you for reviewing this. You are right: the cache line miss while
accessing skb->truesize has a measurable performance impact, and the
maximum burst length does come straight from the recvmmsg() vlen
argument.

We can easily cap the burst to some maximum value (e.g. 64 or less),
and we can pass the skb list and the burst length to the bulk
destructor without touching skb->truesize beforehand. If the burst is
short, say 8 skbs or less, the bulk destructor walks the list again and
releases the memory while still holding the lock; otherwise it defers
the release until after __skb_try_recv_datagram_batch() completes: we
walk the list without the lock held and re-acquire it later only to
release all the memory.

Thank you again,

Paolo