On Tue, 2007-09-04 at 18:25 +0200, Patrick McHardy wrote:
> Jesper Dangaard Brouer wrote:
> > On Sun, 2007-09-02 at 23:16 +0200, Patrick McHardy wrote:
> >
> >>Jesper Dangaard Brouer wrote:
> >>
> >>>On Sun, 2 Sep 2007, Patrick McHardy wrote:
> >>>
> >>>Lets focus on the general case, where the functionality actually is
> >>>needed right away.
> >>>
> >>>In the general case:
> >>>
> >>>- The rate table needs to be aligned (cell_align=-1).
> >>>  (currently, we miscalculates up to 7 bytes on every lookup)
> >>
> >>We will always do that, thats a consequence of storing the
> >>transmission times for multiples of 8b.
> >
> >
> > The issue is that we use the lower boundary for calculating the transmit
> > cost. Thus, a 15 bytes packet only have a transmit cost of 8 bytes.
>
> I believe this is something that should be fixed anyway,
> its better to overestimate than underestimate to stay
> in control of the queue.

Well, I have attached a patch that uses the upper boundary instead.
The patch uses the cell_align feature.

The patch itself is very simple, but figuring out what happens to the
rtab array requires a little illustration:

Illustrating the rate table array:

 Legend description
   rtab[x]   : Array index x of rtab[x]
   xmit_sz   : Transmit size contained in rtab[x] (normally a transmit time)
   maps[a-b] : Packet sizes from a to b, will map into rtab[x]

Current/old rate table mapping (cell_log:3):
 rtab[0]:=xmit_sz:0   maps[0-7]
 rtab[1]:=xmit_sz:8   maps[8-15]
 rtab[2]:=xmit_sz:16  maps[16-23]
 rtab[3]:=xmit_sz:24  maps[24-31]
 rtab[4]:=xmit_sz:32  maps[32-39]
 rtab[5]:=xmit_sz:40  maps[40-47]
 rtab[6]:=xmit_sz:48  maps[48-55]

New rate table mapping, with kernel cell_align support:
 rtab[0]:=xmit_sz:8   maps[0-8]
 rtab[1]:=xmit_sz:16  maps[9-16]
 rtab[2]:=xmit_sz:24  maps[17-24]
 rtab[3]:=xmit_sz:32  maps[25-32]
 rtab[4]:=xmit_sz:40  maps[33-40]
 rtab[5]:=xmit_sz:48  maps[41-48]
 rtab[6]:=xmit_sz:56  maps[49-56]

New TC util on a kernel WITHOUT support for cell_align:
 rtab[0]:=xmit_sz:8   maps[0-7]
 rtab[1]:=xmit_sz:16  maps[8-15]
 rtab[2]:=xmit_sz:24  maps[16-23]
 rtab[3]:=xmit_sz:32  maps[24-31]
 rtab[4]:=xmit_sz:40  maps[32-39]
 rtab[5]:=xmit_sz:48  maps[40-47]
 rtab[6]:=xmit_sz:56  maps[48-55]

Notice that without the kernel cell_align feature, the cell boundaries
are only off by one byte: a packet whose size is an exact multiple of 8
is charged for the next cell up. That should be acceptable when somebody
uses a new TC util on an old kernel.

> We could additionally make the
> rate tables more finegrained (optionally).

That is actually already possible with the approach used to handle
overflow of the rate table ("TSO" large packet support). By setting
cell_log=0 and letting the overflow code handle the rest, we get a very
fine-grained lookup.

> >>>- The existing tc overhead calc can be made more accurate.
> >>>  (by adding overhead before doing the lookup, instead of the
> >>>  current solution where the rate table is modified with its
> >>>  limited resolution)
> >>
> >>Please demonstrate this with patches (one for the overhead
> >>calculation, one for the cell_align thing), then we can
> >>continue this discussion.
> >
> >
> > I have attached a patch for the overhead calculation.

Attached is a patch that uses "the cell_align thing".

> Thanks, I probably won't get to looking into this until
> after the netfilter workshop next week.

Okay, but I'll see you at the workshop, so I might bug you there ;-)

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk

commit 9a21e8bd56a5f057fc9f605e061c22d264ec27ef
Author: Jesper Dangaard Brouer <[EMAIL PROTECTED]>
Date:   Wed Sep 5 15:24:51 2007 +0200

    [IPROUTE2]: Change the rate table calc of transmit cost to use upper bound value.

    Patrick McHardy, Cite: 'its better to overestimate than underestimate
    to stay in control of the queue'.

    Illustrating the rate table array:

     Legend description
       rtab[x]   : Array index x of rtab[x]
       xmit_sz   : Transmit size contained in rtab[x] (normally transmit time)
       maps[a-b] : Packet sizes from a to b, will map into rtab[x]

    Current/old rate table mapping (cell_log:3):
     rtab[0]:=xmit_sz:0   maps[0-7]
     rtab[1]:=xmit_sz:8   maps[8-15]
     rtab[2]:=xmit_sz:16  maps[16-23]
     rtab[3]:=xmit_sz:24  maps[24-31]
     rtab[4]:=xmit_sz:32  maps[32-39]
     rtab[5]:=xmit_sz:40  maps[40-47]
     rtab[6]:=xmit_sz:48  maps[48-55]

    New rate table mapping, with kernel cell_align support.
     rtab[0]:=xmit_sz:8   maps[0-8]
     rtab[1]:=xmit_sz:16  maps[9-16]
     rtab[2]:=xmit_sz:24  maps[17-24]
     rtab[3]:=xmit_sz:32  maps[25-32]
     rtab[4]:=xmit_sz:40  maps[33-40]
     rtab[5]:=xmit_sz:48  maps[41-48]
     rtab[6]:=xmit_sz:56  maps[49-56]

    New TC util on a kernel WITHOUT support for cell_align
     rtab[0]:=xmit_sz:8   maps[0-7]
     rtab[1]:=xmit_sz:16  maps[8-15]
     rtab[2]:=xmit_sz:24  maps[16-23]
     rtab[3]:=xmit_sz:32  maps[24-31]
     rtab[4]:=xmit_sz:40  maps[32-39]
     rtab[5]:=xmit_sz:48  maps[40-47]
     rtab[6]:=xmit_sz:56  maps[48-55]

    Signed-off-by: Jesper Dangaard Brouer <[EMAIL PROTECTED]>

diff --git a/tc/tc_core.c b/tc/tc_core.c
index c713a18..752b07c 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -84,11 +84,12 @@ int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mt
 		cell_log++;
 	}
 	for (i=0; i<256; i++) {
-		unsigned sz = (i<<cell_log);
+		unsigned sz = ((i+1)<<cell_log);
 		if (sz < mpu)
 			sz = mpu;
 		rtab[i] = tc_calc_xmittime(bps, sz);
 	}
+	r->cell_align=-1; // Due to the sz calc
 	r->cell_log=cell_log;
 	return cell_log;
 }

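The three mappings listed in the commit message above can be reproduced
with a small standalone sketch. It is not part of the patch. It works on
sizes instead of transmit times, and the lookup in slot_for() is an
assumption about how a kernel with cell_align support would index the
table (add cell_align to the packet length, clamp at zero, then shift by
cell_log). All function names are made up for the illustration.

```c
#include <stdio.h>

#define CELL_LOG 3	/* 8-byte cells, as in the tables above */

/* Old table: slot i holds the lower bound of its cell. */
static unsigned old_entry(unsigned i)
{
	return i << CELL_LOG;
}

/* New table: slot i holds the upper bound of its cell. */
static unsigned new_entry(unsigned i)
{
	return (i + 1) << CELL_LOG;
}

/* Assumed lookup: add cell_align, clamp at 0, shift by cell_log. */
static unsigned slot_for(unsigned pkt_len, int cell_align)
{
	int slot = (int)pkt_len + cell_align;

	if (slot < 0)
		slot = 0;
	return (unsigned)slot >> CELL_LOG;
}

/* Size charged for a packet under each of the three mappings. */
static unsigned charged_old(unsigned len)
{
	return old_entry(slot_for(len, 0));
}

static unsigned charged_new_aligned(unsigned len)
{
	return new_entry(slot_for(len, -1));
}

static unsigned charged_new_unaligned(unsigned len)
{
	return new_entry(slot_for(len, 0));
}

/* Print one comparison row for a given packet length. */
static void print_row(unsigned len)
{
	printf("len=%u old=%u new+align=%u new,no align=%u\n", len,
	       charged_old(len), charged_new_aligned(len),
	       charged_new_unaligned(len));
}
```

Calling print_row(15) shows the case from the discussion: the old table
charges 8 bytes for a 15-byte packet, while both new variants charge 16.
The two new variants differ only at exact multiples of 8, for example
print_row(8), where the unaligned lookup charges 16 instead of 8.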
commit 29044ac37e30d9662ad1bb83290a007c492ad7b2
Author: Jesper Dangaard Brouer <[EMAIL PROTECTED]>
Date:   Wed Sep 5 10:47:47 2007 +0200

    [IPROUTE2]: Cleanup: tc_calc_rtable().

    Change tc_calc_rtable() to take a tc_ratespec struct as an
    argument. (cell_log still needs to be passed on as a parameter,
    because -1 indicate that the cell_log needs to be computed by the
    function.).

    Signed-off-by: Jesper Dangaard Brouer <[EMAIL PROTECTED]>

diff --git a/tc/m_police.c b/tc/m_police.c
index 5d2528b..acdfd22 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -263,22 +263,20 @@ int act_parse_police(struct action_util *a,int *argc_p, char ***argv_p, int tca_
 	}
 
 	if (p.rate.rate) {
-		if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+		p.rate.mpu = mpu;
+		if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu) < 0) {
 			fprintf(stderr, "TBF: failed to calculate rate table.\n");
 			return -1;
 		}
 		p.burst = tc_calc_xmittime(p.rate.rate, buffer);
-		p.rate.cell_log = Rcell_log;
-		p.rate.mpu = mpu;
 	}
 	p.mtu = mtu;
 	if (p.peakrate.rate) {
-		if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+		p.peakrate.mpu = mpu;
+		if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu) < 0) {
 			fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
 			return -1;
 		}
-		p.peakrate.cell_log = Pcell_log;
-		p.peakrate.mpu = mpu;
 	}
 
 	tail = NLMSG_TAIL(n);
diff --git a/tc/q_cbq.c b/tc/q_cbq.c
index f2b4ce8..df98312 100644
--- a/tc/q_cbq.c
+++ b/tc/q_cbq.c
@@ -137,12 +137,11 @@ static int cbq_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
 	if (allot < (avpkt*3)/2)
 		allot = (avpkt*3)/2;
 
-	if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, allot, mpu)) < 0) {
+	r.mpu = mpu;
+	if (tc_calc_rtable(&r, rtab, cell_log, allot) < 0) {
 		fprintf(stderr, "CBQ: failed to calculate rate table.\n");
 		return -1;
 	}
-	r.cell_log = cell_log;
-	r.mpu = mpu;
 
 	if (ewma_log < 0)
 		ewma_log = TC_CBQ_DEF_EWMA;
@@ -336,12 +335,11 @@ static int cbq_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
 		unsigned pktsize = wrr.allot;
 		if (wrr.allot < (lss.avpkt*3)/2)
 			wrr.allot = (lss.avpkt*3)/2;
-		if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, pktsize, mpu)) < 0) {
+		r.mpu = mpu;
+		if (tc_calc_rtable(&r, rtab, cell_log, pktsize) < 0) {
 			fprintf(stderr, "CBQ: failed to calculate rate table.\n");
 			return -1;
 		}
-		r.cell_log = cell_log;
-		r.mpu = mpu;
 	}
 	if (ewma_log < 0)
 		ewma_log = TC_CBQ_DEF_EWMA;
diff --git a/tc/q_htb.c b/tc/q_htb.c
index b579ebe..cca77fa 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -212,19 +212,17 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
 	opt.ceil.mpu = mpu;
 	opt.rate.mpu = mpu;
 
-	if ((cell_log = tc_calc_rtable(opt.rate.rate, rtab, cell_log, mtu, mpu)) < 0) {
+	if (tc_calc_rtable(&opt.rate, rtab, cell_log, mtu) < 0) {
 		fprintf(stderr, "htb: failed to calculate rate table.\n");
 		return -1;
 	}
 	opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
-	opt.rate.cell_log = cell_log;
 
-	if ((ccell_log = tc_calc_rtable(opt.ceil.rate, ctab, cell_log, mtu, mpu)) < 0) {
+	if (tc_calc_rtable(&opt.ceil, ctab, ccell_log, mtu) < 0) {
 		fprintf(stderr, "htb: failed to calculate ceil rate table.\n");
 		return -1;
 	}
 	opt.cbuffer = tc_calc_xmittime(opt.ceil.rate, cbuffer);
-	opt.ceil.cell_log = ccell_log;
 
 	tail = NLMSG_TAIL(n);
 	addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index 1fc05f4..c7b4f0f 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -170,21 +170,20 @@ static int tbf_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
 		opt.limit = lim;
 	}
 
-	if ((Rcell_log = tc_calc_rtable(opt.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+	opt.rate.mpu = mpu;
+	if (tc_calc_rtable(&opt.rate, rtab, Rcell_log, mtu) < 0) {
 		fprintf(stderr, "TBF: failed to calculate rate table.\n");
 		return -1;
 	}
 	opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
-	opt.rate.cell_log = Rcell_log;
-	opt.rate.mpu = mpu;
+
 	if (opt.peakrate.rate) {
-		if ((Pcell_log = tc_calc_rtable(opt.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+		opt.peakrate.mpu = mpu;
+		if (tc_calc_rtable(&opt.peakrate, ptab, Pcell_log, mtu) < 0) {
 			fprintf(stderr, "TBF: failed to calculate peak rate table.\n");
 			return -1;
 		}
 		opt.mtu = tc_calc_xmittime(opt.peakrate.rate, mtu);
-		opt.peakrate.cell_log = Pcell_log;
-		opt.peakrate.mpu = mpu;
 	}
 
 	tail = NLMSG_TAIL(n);
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 1ab0ba0..c713a18 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -69,10 +69,11 @@ unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
    rtab[pkt_len>>cell_log] = pkt_xmit_time
  */
 
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
-		   unsigned mpu)
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu)
 {
 	int i;
+	unsigned bps = r->rate;
+	unsigned mpu = r->mpu;
 
 	if (mtu == 0)
 		mtu = 2047;
@@ -88,6 +89,7 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
 			sz = mpu;
 		rtab[i] = tc_calc_xmittime(bps, sz);
 	}
+	r->cell_log=cell_log;
 	return cell_log;
 }
 
diff --git a/tc/tc_core.h b/tc/tc_core.h
index a139da6..e98a7b4 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -13,7 +13,7 @@ long tc_core_time2ktime(long time);
 long tc_core_ktime2time(long ktime);
 unsigned tc_calc_xmittime(unsigned rate, unsigned size);
 unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks);
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu);
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu);
 
 int tc_setup_estimator(unsigned A, unsigned time_const, struct tc_estimator *est);
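
The calling convention after this cleanup can be tried outside the tc
tree with a rough standalone model. The sketch below is not the iproute2
code. struct ratespec_sketch and xmittime_sketch() are hypothetical
stand-ins for struct tc_ratespec and tc_calc_xmittime(); the stand-in
returns microseconds rather than scheduler ticks. The loop that derives
cell_log when -1 is passed and the zero-rate guard are assumptions added
for the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct tc_ratespec: only the fields used here. */
struct ratespec_sketch {
	unsigned char	cell_log;
	unsigned short	mpu;
	uint32_t	rate;	/* bytes per second */
};

/* Hypothetical stand-in for tc_calc_xmittime(): microseconds for sz bytes. */
static uint32_t xmittime_sketch(uint32_t bps, uint32_t sz)
{
	return (uint32_t)(((uint64_t)sz * 1000000u) / bps);
}

/*
 * Model of the refactored function: rate and mpu are read from *r,
 * and the cell_log that was used is written back into *r.
 * Passing cell_log == -1 asks the function to compute it from mtu.
 */
static int calc_rtable_sketch(struct ratespec_sketch *r, uint32_t *rtab,
			      int cell_log, unsigned mtu)
{
	unsigned bps = r->rate;
	unsigned mpu = r->mpu;
	int i;

	if (bps == 0)		/* sketch-only guard against division by zero */
		return -1;
	if (mtu == 0)
		mtu = 2047;

	if (cell_log < 0) {
		/* Assumed rule: smallest shift that fits mtu into 256 slots. */
		cell_log = 0;
		while ((mtu >> cell_log) > 255)
			cell_log++;
	}
	for (i = 0; i < 256; i++) {
		unsigned sz = (unsigned)i << cell_log;

		if (sz < mpu)
			sz = mpu;
		rtab[i] = xmittime_sketch(bps, sz);
	}
	r->cell_log = (unsigned char)cell_log;
	return cell_log;
}
```

A caller now fills in rate and mpu on the struct first, passes the
struct, and no longer copies the returned cell_log back by hand, which is
the pattern each call site in the patch above was converted to.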