On Wed, 7 Oct 2009, rihad wrote:

> Robert Watson wrote:
>> snapshot of the top -SH output in the steady state? Let top run for a
>> few minutes and then copy/paste the first 10-20 lines into an e-mail.
>
> Sure. Mind you: now there are only 1800 entries in each of the two ipfw
> tables, so any drops have stopped. But it only takes another 200-300
> entries to start dropping.
>
>> Could you do the same in the net.isr.direct=1 configuration so we can
>> compare?
>
> net.isr.direct=1:

So it seems that CPU exhaustion is likely not the source of the drops. What I was looking for in both configurations was any sign of an individual thread approaching 80% utilization, since under peak load such a thread might hit 100% and cause packet loss for that reason.
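For reference, net.isr.direct is a runtime sysctl on 7.x, so the two configurations can be compared without a reboot. A minimal sketch of the procedure, nothing specific to your setup:

    # flip between deferred (0) and direct (1) dispatch at runtime
    sysctl net.isr.direct=1
    # then watch per-thread CPU use; the worry is any single thread
    # pinned near 100%
    top -SH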

The statistic you're monitoring has a couple of possible interpretations, but the most likely one is that the output queue on the network interface you're transmitting on is overfilling. There are in turn various possible reasons for that, but the two most common would be (a quick way to confirm the queue drops is sketched after the list):

(1) Average load is exceeding the transmit capacity of the driver/hardware
    pipeline -- the pipe is just too small.

(2) Peak capacity (burstiness) is exceeding the transmit capacity of the
    driver/hardware pipeline.
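A quick way to confirm which queue is dropping, as a sketch (counter locations vary somewhat by release and driver, so treat these as starting points):

    # per-interface statistics, including the send-queue drop column
    netstat -id
    # netisr input queue drops (mainly relevant with net.isr.direct=0)
    sysctl net.inet.ip.intr_queue_drops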

The questions that Luigi and others have been asking about your dummynet configuration are largely aimed at determining whether burstiness introduced by dummynet could be responsible for that. dummynet releases queued packets on timer ticks, so suggestions like increasing the timer resolution are intended to spread that injection out: more frequent ticks mean smaller bursts, reducing the peaks that occur when multiple queues inject packets at once in a burst that exceeds the combined depth of the hardware descriptor ring and the software transmit queue.
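dummynet's own per-pipe counters are also worth a glance; a generic sketch, since I don't know your pipe numbers:

    # per-pipe/queue statistics, including packets dropped by dummynet
    # itself when its queues overflow
    ipfw pipe show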

The two solutions, then, are (a) to increase the timer resolution significantly so that packets are injected in smaller bursts, and (b) to increase the queue capacities. The hardware queue limits likely can't be raised without new hardware, but the ifnet transmit queue sizes can be increased. Raising the timer resolution is almost certainly not a bad idea in your configuration, although it does require a reboot, as you have observed.
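Concretely, both knobs are loader tunables, so a sketch for /boot/loader.conf; the values are illustrative guesses, not recommendations tuned for your load:

    # /boot/loader.conf
    kern.hz="2000"              # finer timer granularity (default 1000), so
                                # dummynet injects smaller, more frequent bursts
    net.link.ifqmaxlen="512"    # default depth for software ifnet transmit
                                # queues (normally 50); drivers that size
                                # if_snd from it get deeper queues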

On a side note: one other possible interpretation of that statistic is that you're seeing fragmentation problems. This is usually unlikely in forwarding scenarios. However, it wouldn't hurt to make sure you have LRO turned off on the network interfaces you're using, assuming the driver supports it at all: LRO coalesces received frames into segments larger than the MTU, which causes trouble when they have to be transmitted again on the forwarding path.
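Checking and disabling it is per-interface; bce0/bce1 below simply mirror the interfaces in your top output, and the -lro flag only exists where the driver supports LRO at all:

    ifconfig bce0           # if enabled, LRO appears in the options= line
    ifconfig bce0 -lro      # disable LRO; repeat for the other interface
    ifconfig bce1 -lro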

Robert N M Watson
Computer Laboratory
University of Cambridge


last pid: 92152; load averages: 0.99, 1.18, 1.15 up 1+01:42:28 14:53:09
162 processes: 9 running, 136 sleeping, 17 waiting
CPU:  2.1% user,  0.0% nice,  5.4% system,  7.0% interrupt, 85.5% idle
Mem: 1693M Active, 1429M Inact, 447M Wired, 197M Cache, 214M Buf, 170M Free
Swap: 2048M Total, 12K Used, 2048M Free

 PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  12 root       171 ki31     0K    16K CPU6   6  24.3H 100.00% idle: cpu6
  13 root       171 ki31     0K    16K CPU5   5  23.8H 95.95% idle: cpu5
  14 root       171 ki31     0K    16K CPU4   4  23.4H 93.12% idle: cpu4
  16 root       171 ki31     0K    16K CPU2   2  23.0H 90.19% idle: cpu2
  11 root       171 ki31     0K    16K CPU7   7  24.2H 87.26% idle: cpu7
  15 root       171 ki31     0K    16K CPU3   3  22.8H 86.18% idle: cpu3
  18 root       171 ki31     0K    16K RUN    0  20.6H 84.96% idle: cpu0
  17 root       171 ki31     0K    16K CPU1   1 933:23 47.85% idle: cpu1
  29 root       -68    -     0K    16K WAIT   1 522:02 46.88% irq256: bce0
 465 root       -68    -     0K    16K -      7  55:15 12.65% dummynet
  31 root       -68    -     0K    16K WAIT   2  57:29  4.74% irq257: bce1
  21 root       -44    -     0K    16K WAIT   0  34:55  4.64% swi1: net
  19 root       -32    -     0K    16K WAIT   4  51:41  3.96% swi4: clock sio
  30 root       -64    -     0K    16K WAIT   6   5:43  0.73% irq16: mfi0


Almost 2000 entries in the table, traffic load = 420-430 Mbit/s; drops haven't started yet.

Previously, with net.isr.direct=0:


155 processes: 10 running, 129 sleeping, 16 waiting
CPU:  2.4% user,  0.0% nice,  2.0% system,  9.3% interrupt, 86.2% idle
Mem: 1691M Active, 1491M Inact, 454M Wired, 130M Cache, 214M Buf, 170M Free
Swap: 2048M Total, 12K Used, 2048M Free

 PID USERNAME   PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  15 root       171 ki31     0K    16K CPU3   3  22.4H 97.85% idle: cpu3
  14 root       171 ki31     0K    16K CPU4   4  23.0H 96.29% idle: cpu4
  12 root       171 ki31     0K    16K CPU6   6  23.8H 94.58% idle: cpu6
  16 root       171 ki31     0K    16K CPU2   2  22.5H 90.72% idle: cpu2
  13 root       171 ki31     0K    16K CPU5   5  23.4H 90.58% idle: cpu5
  18 root       171 ki31     0K    16K RUN    0  20.3H 85.60% idle: cpu0
  17 root       171 ki31     0K    16K CPU1   1 910:03 78.37% idle: cpu1
  11 root       171 ki31     0K    16K CPU7   7  23.8H 65.62% idle: cpu7
  21 root       -44    -     0K    16K CPU7   7  19:03 48.34% swi1: net
  29 root       -68    -     0K    16K WAIT   1 515:49 19.63% irq256: bce0
  31 root       -68    -     0K    16K WAIT   2  56:05  5.52% irq257: bce1
  19 root       -32    -     0K    16K WAIT   5  50:05  3.86% swi4: clock sio
 983 flowtools   44    0 12112K  6440K select 0  13:20  0.15% flow-capture
 465 root       -68    -     0K    16K -      3  51:19  0.00% dummynet
   3 root        -8    -     0K    16K -      1   7:41  0.00% g_up
   4 root        -8    -     0K    16K -      2   7:14  0.00% g_down
  30 root       -64    -     0K    16K WAIT   6   5:30  0.00% irq16: mfi0
