I applied these patch series, and either I goofed (possible), or subsequent updates to the various trees between the time they came out and the time I started trying them broke things again.
It fails on linux 3.0.9, 3.1.3, and 3.1.4 with errors applying patches to
various mtd partitions. A typical error (3.0.9):

Applying patch platform/416-mtd_api_tl_mr3x20.patch
patching file arch/mips/ar71xx/mach-tl-mr3x20.c
Hunk #1 FAILED at 34.
Hunk #2 FAILED at 61.
2 out of 2 hunks FAILED -- rejects in file arch/mips/ar71xx/mach-tl-mr3x20.c
Patch platform/416-mtd_api_tl_mr3x20.patch does not apply (enforce with -f)
make[4]: *** [/home/cero1/src/cerowrt/build_dir/linux-ar71xx_generic/linux-3.0.9/.quilt_checked] Error 1
make[4]: Leaving directory `/home/cero1/src/cerowrt/target/linux/ar71xx'
make[3]: *** [compile] Error 2
make[3]: Leaving directory `/home/cero1/src/cerowrt/target/linux'
make[2]: *** [target/linux/compile] Error 2
make[2]: Leaving directory `/home/cero1/src/cerowrt'
make[1]: *** [/home/cero1/src/cerowrt/staging_dir/target-mips_r2_uClibc-0.9.32/stamp/.target_compile] Error 2
make[1]: Leaving directory `/home/cero1/src/cerowrt'
make: *** [world] Error 2

On Sun, Nov 27, 2011 at 7:36 PM, Dave Taht <dave.t...@gmail.com> wrote:
> On Sun, Nov 27, 2011 at 6:17 PM, Outback Dingo <outbackdi...@gmail.com> wrote:
>> On Sun, Nov 27, 2011 at 11:52 AM, Otto Solares Cabrera <so...@guug.org> wrote:
>>> On Sat, Nov 26, 2011 at 10:37:33PM -0500, Outback Dingo wrote:
>>>> On Sat, Nov 26, 2011 at 10:13 PM, Hartmut Knaack <knaac...@gmx.de> wrote:
>>>> > This patch brings support for kernel version 3.1 to the ar71xx platform.
>>>> > It is based on Otto Estuardo Solares Cabrera's linux-3.0 patches, with
>>>> > some changes to keep up with recent filename changes in the kernel.
>>>> > Minimum kernel version seems to be 3.1.1, otherwise one of the generic
>>>> > patches will fail. Successfully tested with kernel 3.1.2 on a WR1043ND.
>>>> > Kernel version in the Makefile still needs to be adjusted manually.
>>>>
>>>> I'll get onto testing these also.
>>>
>>> It works for me on the wrt160nl with Linux-3.1.3. Thx Hartmut!
>>
>> Also working on WNDR3700v2 and a variety of Ubiquiti gear.... nice....
>> Thanks both of you.
>
> My thanks as well, although I haven't had time to do a build yet. If
> anyone is interested in byte queue limits, the patches I was attempting
> to backport to 3.1 before taking off for the holiday, including a
> modified ag71xx driver, are at:
>
> http://huchra.bufferbloat.net/~cero1/bql/
>
> Regrettably they didn't quite compile before I left for the holiday, and
> I'm going to have to rebase cerowrt and rebuild (I'm still grateful!),
> so I figure (and hope!) that one of you folks will beat me to getting
> BQL working before I get back to the office on Tuesday.
>
> A plug:
>
> Byte queue limits hold great promise for beating bufferbloat and for
> getting tc's shapers and schedulers to work properly again, at least
> on ethernet.
>
> By holding down the amount of outstanding data the device driver has
> queued, byte queue limits finally give all the QoS and shaping tools we
> know and love a chance to work again. You can retain a large hw tx ring
> - so, as an example, with a 6k byte queue limit you could have 4 large
> packets in the buffer, or 93 ack packets - and this lets you manage the
> bandwidth via tools higher in the stack, as either takes about the same
> amount of time to transmit, without compromising line-rate performance...
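To make the driver side of that concrete: a NIC driver reports to BQL the
bytes it hands to the hardware and the bytes the hardware has finished
with, and the stack stops/wakes the queue around a byte budget rather
than a descriptor count. Here is a minimal sketch of that hookup, using
the netdev_sent_queue()/netdev_completed_queue() helper names; the exact
names and signatures in Tom's posted series may differ, and
foo_start_xmit()/foo_tx_complete() are made-up placeholders, not code
from the ag71xx patch:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Hypothetical transmit path: after the skb is posted to the hw TX ring,
 * tell BQL how many bytes are now outstanding in the hardware. */
static netdev_tx_t foo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
        /* ... write skb into the hardware descriptor ring here ... */
        netdev_sent_queue(dev, skb->len);
        return NETDEV_TX_OK;
}

/* Hypothetical TX-completion handler (e.g. called from NAPI poll):
 * report what the hardware finished so BQL can recompute its byte limit
 * and re-wake the queue if it had been stopped against that limit. */
static void foo_tx_complete(struct net_device *dev, unsigned int pkts,
                            unsigned int bytes)
{
        netdev_completed_queue(dev, pkts, bytes);
}

Because the queue is throttled on bytes rather than descriptors, the 4
large packets and the 93 acks in the example above cost about the same,
and the shapers above the driver see a roughly constant drain time.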
> The current situation is: we often have hw tx rings of 64 or higher,
> which translates out to 96k in flight, meaning that (as already
> demonstrated) with this patch working, you can improve network
> responsiveness by a factor of at least ten, perhaps as much as 100.
> (TCP's response to buffering is quadratic, not linear, but there are
> other variables, so... a factor of 10 sounds good, doesn't it?)
>
> From Tom Herbert's announcement (there was much feedback on netdev, so
> I would expect another revision to follow):
>
> Changes from last version:
> - Rebase to 3.2
> - Added CONFIG_BQL and CONFIG_DQL
> - Added some cache alignment in struct dql, to split read only and
>   writeable elements, and to split those elements written on transmit
>   from those written at transmit completion (suggested by Eric).
> - Split out adding xps_queue_release as its own patch.
> - Some minor performance changes, use likely and unlikely for some
>   conditionals.
> - Cleaned up some "show" functions for bql (pointed out by Ben).
> - Changed netdev_tx_completed_queue to check xoff, check availability,
>   and then check xoff again. This is to prevent potential race
>   conditions with netdev_sent_queue (as Ben pointed out).
> - Did some more testing trying to evaluate the overhead of BQL in the
>   transmit path. I see about 1-3% degradation in CPU utilization and
>   maximum pps when BQL is enabled. Any ideas to beat this down as much
>   as possible would be appreciated!
> - Added a high versus low priority traffic test to the results below.
>
> ----
>
> This patch series implements byte queue limits (bql) for NIC TX queues.
>
> Byte queue limits are a mechanism to limit the size of the transmit
> hardware queue on a NIC by number of bytes. The goal of these byte
> limits is to reduce latency (HOL blocking) caused by excessive queuing
> in hardware (aka buffer bloat) without sacrificing throughput.
>
> Hardware queuing limits are typically specified in terms of a number of
> hardware descriptors, each of which has a variable size. The size of
> individual queued items can vary over a very wide range. For instance,
> with the e1000 NIC the size could range from 64 bytes to 4K (with TSO
> enabled). This variability makes it next to impossible to choose a
> single queue limit that prevents starvation and provides the lowest
> possible latency.
>
> The objective of byte queue limits is to set the limit to be the
> minimum needed to prevent starvation between successive transmissions to
> the hardware. The latency between two transmissions can be variable in a
> system. It is dependent on interrupt frequency, NAPI polling latencies,
> scheduling of the queuing discipline, lock contention, etc. Therefore we
> propose that byte queue limits should be dynamic and change in
> accordance with the networking stack latencies a system encounters. BQL
> should not need to take the underlying link speed as input; it should
> automatically adjust to whatever the speed is (even if that in itself is
> dynamic).
>
> Patches to implement this:
> - Dynamic queue limits (dql) library. This provides the general
>   queuing algorithm.
> - netdev changes that use dql to support byte queue limits.
> - Support in drivers for byte queue limits.
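Underneath those netdev hooks, as I read the description, the dql library
just tracks queued vs. completed byte counts and grows or shrinks the
limit based on what it observes between transmit completions. A rough
sketch of the intended usage pattern - the dql_queued()/dql_avail()/
dql_completed() names and prototypes here are my reading of the dql
library, so check the actual patches for the real interface:

#include <linux/dynamic_queue_limits.h>

/* Hypothetical per-TX-queue state kept by the netdev layer or driver. */
static struct dql tx_dql;

/* Enqueue side: account the bytes just handed to the hardware and stop
 * the queue once the dynamically computed limit has been reached. */
static void tx_account_queued(unsigned int bytes)
{
        dql_queued(&tx_dql, bytes);
        if (dql_avail(&tx_dql) < 0) {
                /* stop the tx queue here (netif_tx_stop_queue() or similar) */
        }
}

/* Completion side: account the bytes the hardware finished. This is
 * where the limit is recomputed, so it stays just large enough to keep
 * the NIC busy between completions (no starvation) and no larger. */
static void tx_account_completed(unsigned int bytes)
{
        dql_completed(&tx_dql, bytes);
        /* re-wake the queue if it was stopped and dql_avail() is now >= 0 */
}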
> The effects of BQL are demonstrated in the benchmark results below.
>
> --- High priority versus low priority traffic:
>
> In this test, 100 netperf TCP_STREAMs were started to saturate the link.
> A single instance of a netperf TCP_RR was run with high priority set.
> Queuing discipline is pfifo_fast, NIC is e1000 with TX ring size set to
> 1024. tps for the high priority RR is listed.
>
> No BQL, tso on:  3000-3200K bytes in queue, 36 tps
> BQL, tso on:     156-194K bytes in queue, 535 tps
> No BQL, tso off: 453-454K bytes in queue, 234 tps
> BQL, tso off:    66K bytes in queue, 914 tps
>
> --- Various RR sizes
>
> These tests were done running 200 streams of netperf RR tests. The
> results demonstrate the reduction in queuing and also illustrate
> the overhead due to BQL (at small RR sizes).
>
> 140000 rr size
> BQL:    80-215K bytes in queue, 856 tps, 3.26% cpu
> No BQL: 2700-2930K bytes in queue, 854 tps, 3.71% cpu
>
> 14000 rr size
> BQL:    25-55K bytes in queue, 8500 tps
> No BQL: 1500-1622K bytes in queue, 8523 tps, 4.53% cpu
>
> 1400 rr size
> BQL:    20-38K bytes in queue, 86582 tps, 7.38% cpu
> No BQL: 29-117K bytes in queue, 85738 tps, 7.67% cpu
>
> 140 rr size
> BQL:    1-10K bytes in queue, 320540 tps, 34.6% cpu
> No BQL: 1-13K bytes in queue, 323158 tps, 37.16% cpu
>
> 1 rr size
> BQL:    0-3K bytes in queue, 338811 tps, 41.41% cpu
> No BQL: 0-3K bytes in queue, 339947 tps, 42.36% cpu
>
> So the amount of queuing in the NIC can be reduced by 90% or more.
> Accordingly, the latency for high priority packets in the presence
> of low priority bulk throughput traffic can be reduced by 90% or more.
>
> Since BQL accounting is in the transmit path for every packet, and the
> function to recompute the byte limit is run once per transmit
> completion, there will be some overhead in using BQL. So far, I've seen
> the overhead to be in the range of 1-3% for CPU utilization and maximum
> pps.
>
> --
> Dave Täht
> SKYPE: davetaht
> US Tel: 1-239-829-5608
> FR Tel: 0638645374
> http://www.bufferbloat.net

--
Dave Täht
SKYPE: davetaht
US Tel: 1-239-829-5608
FR Tel: 0638645374
http://www.bufferbloat.net

_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel