2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangc...@gmail.com>:
> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0...@gmail.com> wrote:
>> Hi -
>>
>> We've implemented a "glue" qdisc similar to mqprio which can associate
>> one qdisc to multiple txqs as the root qdisc. Reference count of the
>> child qdiscs have been adjusted properly in this case so that it
>> represents the number of txqs it has been attached to. However when
>> sending packets we saw the skb from dequeue_skb() corrupted with the
>> following call stack:
>>
>>     [exception RIP: netif_skb_features+51]
>>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>>
>>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>
>> It looks like the skb has already been released since its dev pointer
>> field is invalid.
>>
>> Any clue on how this can be investigated further? My current thought
>> is to add some instrumentation to the place where skb is released and
>> analyze whether there is any race condition happening there. However
>
> Either dropwatch or perf could do the work to instrument kfree_skb().

Thanks - will try it out.

>
>> by looking through the existing code I think the case where one root
>> qdisc is associated with multiple txqs already exists (when mqprio is
>> not used) so not sure why it won't work when we group txqs and assign
>> each group a root qdisc. Any insight on this issue would be much
>> appreciated!
>
> How do you implement ->attach()? How does it work with netdev_pick_tx()?

attach() essentially grafts the default qdisc (pfifo) onto each "txq
group", where each group is represented by a TC class. For
netdev_pick_tx() we use the classid of the socket to select a class,
based on a "class id base" and the class-to-txq mapping defined together
with this glue qdisc. It's pretty much the same as mqprio, except that
one class maps to multiple txqs and the txq within the group is selected
through a hash.
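
To make the selection described above concrete, here is a minimal
user-space sketch of the idea. It is not the actual qdisc code. The
struct and function names (txq_group, glue_map, glue_pick_txq) are
hypothetical, the offset/count layout is only assumed to resemble
mqprio's per-class txq ranges, and a plain modulo stands in for whatever
hash-to-queue scaling the real code uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-class txq range (assumed to resemble mqprio's
 * offset/count pairs). */
struct txq_group {
    uint16_t offset; /* first txq index owned by this class */
    uint16_t count;  /* number of txqs in this group */
};

/* Hypothetical glue-qdisc mapping configuration. */
struct glue_map {
    uint32_t classid_base;          /* classid that maps to class 0 */
    size_t num_classes;             /* number of entries in groups[] */
    const struct txq_group *groups; /* class index -> txq range */
};

/*
 * Pick a txq for a packet: classid -> class index -> txq group, then
 * spread flows across the group with the flow hash.
 * Returns the txq index, or -1 if the classid has no usable group.
 */
int glue_pick_txq(const struct glue_map *map, uint32_t classid,
                  uint32_t flow_hash)
{
    uint32_t cls;
    const struct txq_group *g;

    if (classid < map->classid_base)
        return -1;

    cls = classid - map->classid_base;
    if (cls >= map->num_classes)
        return -1;

    g = &map->groups[cls];
    if (g->count == 0)
        return -1;

    /* Assumption: simple modulo spreading within the group. */
    return (int)g->offset + (int)(flow_hash % g->count);
}
```

With two groups, txqs 0-3 for the first class and txqs 4-5 for the
second, every packet of a given class lands in that class's txq range,
and so on the single child qdisc shared by those txqs.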