On Tue, 2016-09-13 at 22:13 -0700, Michael Ma wrote:
> I don't intend to install multiple qdisc - the only reason that I'm
> doing this now is to leverage MQ to workaround the lock contention,
> and based on the profile this all worked. However to simplify the way
> to setup HTB I wanted to use TXQ to partition HTB classes so that a
> HTB class only belongs to one TXQ, which also requires mapping skb to
> TXQ using some rules (here I'm using priority but I assume it's
> straightforward to use other information such as classid). And the
> problem I found here is that when using priority to infer the TXQ so
> that queue_mapping is changed, bandwidth is affected significantly -
> the only thing I can guess is that due to queue switch, there are more
> cache misses assuming processor cores have a static mapping to all the
> queues. Any suggestion on what to do next for the investigation?
>
> I would also guess that this should be a common problem if anyone
> wants to use MQ+IFB to workaround the qdisc lock contention on the
> receiver side and classful qdisc is used on IFB, but haven't really
> found a similar thread here...
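
For context, the setup described above is roughly the following sketch:
redirect ingress to an IFB device whose root is mq, hang one HTB tree off
each mq class, and force queue_mapping with skbedit before the redirect.
Device names, rates, queue count and the port match are placeholders, not
the actual commands, and how the queue_mapping value maps onto a TX queue
depends on the kernel's tx-queue selection path:

ip link add ifb0 numtxqueues 2 type ifb
ip link set ifb0 up

# redirect eth0 ingress to ifb0, pinning each traffic class to one
# ifb TX queue via skbedit (the dport match is just an example rule)
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 \
    match ip dport 5001 0xffff \
    action skbedit queue_mapping 1 \
    action mirred egress redirect dev ifb0
tc filter add dev eth0 parent ffff: protocol ip prio 2 u32 \
    match u32 0 0 \
    action skbedit queue_mapping 2 \
    action mirred egress redirect dev ifb0

# mq root on ifb0, one independent HTB tree per TX queue
tc qdisc add dev ifb0 root handle 1: mq
tc qdisc add dev ifb0 parent 1:1 handle 10: htb default 1
tc class add dev ifb0 parent 10: classid 10:1 htb rate 2gbit
tc qdisc add dev ifb0 parent 1:2 handle 20: htb default 1
tc class add dev ifb0 parent 20: classid 20:1 htb rate 8gbit
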
But why are you changing the queue? The NIC already does the proper RSS
thing, meaning all packets of one flow should land on the same RX queue.
There is no need to classify yourself and risk lock contention.

I use IFB + MQ + netem every day, and it scales to 10 Mpps with no problem.

Do you really need to rate-limit flows? It's not clear what your goals are,
or why, for example, you use HTB to begin with.
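
For what it's worth, that kind of IFB + MQ + netem setup needs no queue
rewriting at all. A minimal sketch, with device names, queue count and the
netem delay chosen only as examples:

ip link add ifb0 numtxqueues 8 type ifb
ip link set ifb0 up

# redirect ingress to ifb0; the RX queue recorded by the NIC (RSS) is
# reused when picking the ifb TX queue, so each flow should stay on
# one queue without any explicit classification
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 1 u32 match u32 0 0 \
    action mirred egress redirect dev ifb0

# mq root on ifb0, one netem instance per TX queue, so only per-queue
# qdisc locks are taken
tc qdisc add dev ifb0 root handle 1: mq
for i in $(seq 1 8); do
    tc qdisc add dev ifb0 parent 1:$i handle $((i * 10)): netem delay 10ms
done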