On 16-11-14 03:06 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 10, 2016 at 08:44:32PM -0800, John Fastabend wrote:
>> Signed-off-by: John Fastabend <john.r.fastab...@intel.com>
>
> This will naturally reduce the cache line bounce
> costs, but so will a _many API for ptr-ring,
> doing lock-add many-unlock.
>
> The number of atomics also scales better with the lock:
> one per push instead of one per queue.
>
> Also, when can qdisc use a _many operation?
>
On dequeue we can pull off many skbs instead of one at a time and
then either (a) pass them down as an array to the driver (I started
to write this on top of ixgbe and it seems like a win) or (b) pass
them one by one down to the driver and set the xmit_more bit
correctly. Passing them one by one also looks like a win, because we
avoid taking the lock per skb.

On the qdisc enqueue side it is a bit more invasive to start doing
this.

[...]

>> +++ b/net/sched/sch_generic.c
>> @@ -571,7 +571,7 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
>>  	struct skb_array_ll *q = band2list(priv, band);
>>  	int err;
>>
>> -	err = skb_array_ll_produce(q, skb);
>> +	err = skb_array_ll_produce(q, &skb);
>>
>>  	if (unlikely(err)) {
>>  		net_warn_ratelimited("drop a packet from fast enqueue\n");
>
> I don't see a pop many operation here.
>

Right, the patches still need a bit of cleanup; it looks like that
hunk was part of another patch.

.John