After far too long, it looks like I’ll have the opportunity to work on sch_cake 
a bit more.  So here’s a little bit of a “state of the union” speech about what 
we’ve got and what I’m planning to add to it.

So far we’ve got a deficit-mode, non-bursting shaper that works pretty well, 
and an integrated implementation of fq_codel that tunes its target delay to 
the bandwidth set on the shaper.  The configuration is “as 
easy as cake”; the intention is that you can just specify one parameter (the 
bandwidth to shape at) and leave everything else at the defaults; there simply 
aren’t very many visible knobs, because they aren’t needed.
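
To illustrate the flavour of that auto-tuning - this is just a sketch, where 
the 5 ms floor and the 1.5x factor are assumptions for illustration rather 
than constants lifted from the actual code - the target can be derived from 
the time it takes to serialise one MTU-sized packet at the shaped rate:

    #include <stdint.h>

    #define NSEC_PER_SEC 1000000000ULL

    /* Sketch of target auto-tuning: scale the codel target with the
     * serialisation time of one MTU-sized packet at the shaped rate,
     * so slow links get a proportionally relaxed target.  The 5 ms
     * floor and the 1.5x factor are illustrative assumptions.
     */
    static uint64_t tune_codel_target(uint64_t rate_bps, uint32_t mtu_bytes)
    {
        uint64_t floor_ns = 5 * 1000 * 1000;  /* 5 ms default target */
        uint64_t mtu_ns = (uint64_t)mtu_bytes * 8 * NSEC_PER_SEC / rate_bps;
        uint64_t target = mtu_ns * 3 / 2;     /* 1.5 packet times */

        return target > floor_ns ? target : floor_ns;
    }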

We’ve also got Diffserv classification, and that part hasn’t been so 
successful.  Each class grabs all traffic bearing some subset of the 
codepoints and stuffs it into a separate shaper+fq_codel instance, and the 
higher-priority shapers steal bandwidth from the lower ones to enforce 
priority.  High-priority classes can only use a limited amount of bandwidth, 
exactly as specified in generic Diffserv PHBs.
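
For concreteness, the classification step amounts to mapping the DSCP field 
onto a class index, along these lines - the specific codepoint groupings 
below are illustrative examples, not cake’s actual table:

    /* Illustrative mapping from DSCP codepoints onto classes, each of
     * which fed its own shaper+fq_codel instance in this design.
     */
    enum cake_class { CLASS_BULK, CLASS_BEST_EFFORT, CLASS_VIDEO, CLASS_VOICE };

    static enum cake_class classify_dscp(unsigned int dscp)
    {
        switch (dscp) {
        case 8:                              /* CS1: background/bulk */
            return CLASS_BULK;
        case 32: case 34: case 36: case 38:  /* CS4, AF4x: video */
            return CLASS_VIDEO;
        case 40: case 46:                    /* CS5, EF: voice */
            return CLASS_VOICE;
        default:
            return CLASS_BEST_EFFORT;
        }
    }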

It works exactly as designed, but the resulting behaviour isn’t particularly 
desirable from an end-user perspective.  In particular, people run tests using 
best-effort traffic to see how much bandwidth they’re getting; because each 
class is hard-shaped, such a test can never reach the full configured rate, 
resulting in complaints that cake had to be given a bigger number to get the 
correct throughput - which of course also stops it from functioning correctly 
when background traffic is added to the mix.  So that needed a rethink.

Incidentally, the existing Diffserv implementation can be disabled by 
specifying the “besteffort” keyword.  This lumps all traffic into a single 
class, handled by a single shaper at the configured rate.  Cake already works 
pretty well in that mode; sometimes I turn the shaper down to analogue-modem 
speeds and note, with some satisfaction, that everything *still* works.  Except 
YouTube, but that’s only because streaming video really does need more than 
analogue-modem bandwidth.

As for performance, I’m able to make my ancient Pentium-MMX shape at over 50 
Mbps, summing traffic in both directions between two bridged Fast Ethernet 
cards.  This limitation is probably a combination of timer latency and 
context-switch overhead.  I don’t expect it to improve much, unless we find a 
way to seriously reduce those overheads (which are already quite low for a 
modern desktop OS).  A faster machine with better timers gets better 
performance, of course.

So there are two big things I want to change in the next version:

The easy part (at least in terms of how many unknowns there are) is adjusting 
the flow-queueing logic so that it uses set-associative hashing instead of 
straight hashing when selecting a queue.  This should reduce the incidence of 
hash collisions considerably for a given number of flow queues, or conversely 
provide equivalent collision performance with a smaller number of queues.
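
To sketch what I mean by that - the 4-way associativity and the tag scheme 
here are placeholders, not settled design decisions - hash the flow once, 
then probe the handful of queues in its set, preferring a queue already 
carrying that flow, then an empty one, before accepting a collision:

    #include <stdint.h>

    #define QUEUE_COUNT 1024
    #define SET_WAYS    4      /* assumed associativity, not a settled value */

    struct flow_queue {
        uint32_t tag;          /* flow hash currently occupying this queue */
        int      backlog;      /* queued packets; zero means reclaimable */
    };

    static struct flow_queue queues[QUEUE_COUNT];

    /* Straight hashing would simply return hash % QUEUE_COUNT.  Instead,
     * probe the SET_WAYS slots of one set: prefer a queue already tagged
     * with this flow, then an empty queue we can re-tag, and only then
     * accept a collision.
     */
    static unsigned int select_queue(uint32_t hash)
    {
        unsigned int set = (hash % (QUEUE_COUNT / SET_WAYS)) * SET_WAYS;
        unsigned int victim = set;     /* fallback slot if the set is full */
        int have_empty = 0;
        unsigned int i;

        for (i = 0; i < SET_WAYS; i++) {
            if (queues[set + i].tag == hash)
                return set + i;        /* flow already has a queue */
            if (!have_empty && queues[set + i].backlog == 0) {
                victim = set + i;
                have_empty = 1;
            }
        }
        queues[victim].tag = hash;     /* claim an empty slot, or collide */
        return victim;
    }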

The more interesting part is to rework the Diffserv prioritiser so that it 
behaves more usefully.  I think I’ve hit upon the right idea to make this 
work in practice - instead of individually hard-shaping each class, use the 
shaper logic as a threshold function between high and low priority, and 
implement a single shaper to handle all traffic.  The 
priority function can then be handled by a weighted DRR system - which is 
already in place, but doesn’t do much - with just that small modification for 
changing the weights based on the shaper state.
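
Here’s a rough sketch of that mechanism in C - the names, the per-mille 
scaling and the 100 ms burst allowance are all mine for illustration, not 
settled cake3 code.  Each class keeps a token bucket running at its threshold 
rate, and the bucket’s sign merely selects which DRR weight the class gets; 
a single shaper (not shown) covers the aggregate:

    #include <stdint.h>

    struct cake_class {
        /* configured per class, both as ratios rather than rates */
        uint32_t bw_share;       /* bandwidth share, in per-mille of total */
        uint32_t prio_factor;    /* weight multiplier while under threshold */

        /* token bucket tracking the class threshold, in bytes */
        int64_t  tokens;
        uint64_t last_update_ns;

        uint32_t weight;         /* relative DRR weight, scaled to a byte
                                    quantum elsewhere */
    };

    /* Refill the threshold bucket and choose the class's DRR weight.
     * The class is "high priority" only while it stays under its
     * threshold rate; beyond that it competes at its plain share, though
     * the single overall shaper still lets it use the whole link when
     * nothing else contends.
     */
    static void update_class_weight(struct cake_class *c,
                                    uint64_t shaped_rate_Bps,
                                    uint64_t now_ns)
    {
        uint64_t thresh_rate = shaped_rate_Bps * c->bw_share / 1000;
        uint64_t elapsed_ns = now_ns - c->last_update_ns;
        int64_t  burst = thresh_rate / 10;   /* ~100 ms allowance, assumed */

        c->tokens += (int64_t)(thresh_rate * elapsed_ns / 1000000000ULL);
        if (c->tokens > burst)
            c->tokens = burst;
        c->last_update_ns = now_ns;

        /* the shaper state acts purely as a threshold function here */
        if (c->tokens > 0)
            c->weight = c->bw_share * c->prio_factor;  /* high priority */
        else
            c->weight = c->bw_share;                   /* low priority */
    }

On each dequeue, the packet’s length would be subtracted from the owning 
class’s tokens, so traffic sustained above the threshold drives the bucket 
negative and demotes the class until it backs off.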

So high-priority traffic gets high priority - but only if it limits itself to a 
reasonable bandwidth.  Above that bandwidth, it gets low priority, but is still 
able to use the full shaped bandwidth if nobody else contends for it.  And 
(unlike, say, HFSC) we need precisely two parameters per class to do this, both 
specified as ratios rather than hard bandwidth numbers: a bandwidth share 
(which determines both the shaper setting and the low-priority-mode DRR 
weighting) and a priority factor (which determines the high-priority-mode DRR 
weighting).  So if those knobs end up being exposed to userspace, they’ll be 
easier to understand and thus use correctly.
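
To put some assumed numbers on it: with the shaper set to 20 Mbit/s, a class 
configured with a bandwidth share of 1/4 and a priority factor of 4 would 
carry a 4x-weighted DRR quantum while it stays under its 5 Mbit/s threshold, 
drop back to its plain 1/4 weighting above that, and still be free to use the 
whole 20 Mbit/s when nothing else is queued.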

All of this feeds my main goal with Diffserv, which is to start giving 
applications natural incentives to mark their traffic appropriately.  Each 
class has both an advantage and a tradeoff which must be accepted to realise 
that advantage.  If you need absolutely minimal latency, you can choose a 
high-priority class, but you’ll have to be frugal about bandwidth.  If you need 
maximum throughput, you’ll have to put up with reduced priority compared to 
latency-sensitive traffic.  And if you want to be altruistic, you can choose to 
mark your stuff as bulk, background traffic, and it’ll be treated accordingly.  
All of this is in accordance with existing RFCs.

A small caveat: cake is not designed for wifi.  It’s designed for links that 
can at least be treated as full-duplex to a close approximation.  Shared-medium 
links *can* behave like that, if they’re shaped to a miserly enough degree, but 
we really need something different for wifi - although several of cake’s 
components and ideas could be used in such a qdisc.

Roll on cake3.

 - Jonathan Morton
