On 12/06/17 21:00, Toke Høiland-Jørgensen wrote:
p.wa...@gmx.at writes:

My SQM configuration was basically just using cake + piece_of_cake.qos,
but that's clearly off topic for now. (I'm also CC'ing this mail to Toke,
the maintainer of sqm-scripts).

If you're crashing the box my guess would be there's a bug in the cake
qdisc somewhere. What happens if you run SQM with fq_codel instead?

-Toke

This isn't the first time I've heard cake implicated in CPU stalls, but discerning a signal in the noise is difficult.

Using 'fq_codel' would be a good first elimination round.

For a second elimination round: to my knowledge, cake is the only qdisc that pulls apart large GSO (Generic Segmentation Offload) packets before handing them on for transmission, a process cake calls 'peeling'. It does this to retain control over how a 'super packet' of up to 64K is scheduled, breaking it into a series of 1500-byte packets instead. Some have reported that 'messing with ethtool' to disable GSO helps. I know not how 'ethtool' works.
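
To make the peeling idea concrete, here is a toy model in Python of splitting one super packet into smaller pieces. It is only a sketch and not cake's kernel code: the function name `peel`, the `segment_size` parameter and the 64000-byte example are my own assumptions, and real segments are sized by the MSS with headers added to each one.

```python
# Toy model of GSO "peeling": split one super packet into smaller segments.
# Not cake's actual code; names and sizes are illustrative assumptions.

def peel(payload_len, segment_size=1500):
    """Return the sizes of the segments a super packet would be split into."""
    if payload_len < 0 or segment_size <= 0:
        raise ValueError("payload_len must be >= 0 and segment_size > 0")
    full, rest = divmod(payload_len, segment_size)
    # 'full' whole segments, plus one short trailing segment if bytes remain.
    return [segment_size] * full + ([rest] if rest else [])


if __name__ == "__main__":
    segments = peel(64000)
    print(len(segments), "segments, last one", segments[-1], "bytes")
```

For the ethtool experiment, the usual form is `ethtool -K eth0 gso off`, where `eth0` is a placeholder for whichever interface SQM is attached to.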

Whether this is a bug in the cake peeling code or in the network interface driver is unclear, and anecdotal evidence again suggests it is only seen on multi-CPU systems. I dread to think what happens if one CPU starts 'grabbing one of those large skbs' for sending whilst another (in cake) is busy breaking it apart, or indeed whether that scenario is even possible.

Some ideas/thoughts/things to try :-)  Apologies for the continuing hijack.

KDB

_______________________________________________
Lede-dev mailing list
Lede-dev@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/lede-dev
