Robert Watson wrote:
On Wed, 7 Oct 2009, rihad wrote:
rihad wrote:
I've yet to test whether this direct=0 setting helps with the extensive dummynet drops.
Ooops... After a couple of minutes, suddenly:
net.inet.ip.intr_queue_drops: 1284
Bumped the queue length (net.inet.ip.intr_queue_maxlen) up a bit.
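(For reference, the knobs involved look roughly like this; the 1000 below is only an illustrative value, not a recommendation:)

    # direct=1 processes inbound packets in the ithread; direct=0 defers
    # them to the netisr thread via the IP input queue
    sysctl net.isr.direct=0
    # drops from that queue, and the knob that sizes it
    sysctl net.inet.ip.intr_queue_drops
    sysctl net.inet.ip.intr_queue_maxlen=1000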
Yes, I was going to suggest that moving to deferred dispatch has
probably simply moved the drops to a new spot, the queue between the
ithreads and the netisr thread. In your setup, how many network
interfaces are in use, and what drivers?
bce -- the Broadcom NetXtreme II (BCM5706/BCM5708) PCI/PCIe Gigabit
Ethernet adapter driver, with device bce compiled into a 7.1-RELEASE-p8
kernel.
Two network cards: bce0 carries ~400-500 Mbit/s of input and bce1 the
output, i.e. the box acts as a smart router. It has two quad-core CPUs.
Now the probability of drops (as monitored by netstat -s's "output
packets dropped due to no bufs, etc." counter) is definitely a function
of both traffic load and the number of entries in an ipfw table. I've
just decreased the size of the two tables from ~2600 to ~1800 entries
each and the drops instantly went away, even though the traffic passing
through the box didn't decrease; it even increased a bit, since fewer
clients are now being shaped (luckily "ipfw pipe tablearg" passes
packets that fail the table lookup through untouched), roughly as
sketched below.
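(The setup looks roughly like this; the table number, prefix, pipe
number and bandwidth are made up for illustration, not the real config:)

    # a client prefix in table 1; the table value (10) names the pipe
    ipfw table 1 add 192.0.2.0/24 10
    # the pipe itself, with some bandwidth cap
    ipfw pipe 10 config bw 512Kbit/s
    # shape matching traffic through the pipe chosen by tablearg; packets
    # whose address isn't in the table don't match and pass on untouched
    ipfw add 1000 pipe tablearg ip from any to 'table(1)' out
    # the drop counter being watched
    netstat -s -p ip | grep 'output packets dropped'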
If what's happening is that you're maxing out a CPU, then moving to
multiple netisrs might help if your card supports generating flow IDs,
but most lower-end cards don't. I have patches to generate those flow
IDs in software rather than in hardware, but there are downsides to
doing so, not least that it takes the cache-line misses on the packet
that generally make up a lot of the cost of processing it.
My experience with most reasonable cards, though, is that letting them
do the work distribution with RSS and using multiple ithreads performs
better than software work distribution on current systems.
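(A rough way to check whether a card is actually spreading the work is
the per-source interrupt counters; on a multi-queue card doing RSS, each
queue typically appears as its own MSI-X vector:)

    # one line per interrupt source; with RSS you'd expect several
    # irqNNN: <iface> vectors all accumulating counts
    vmstat -i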
So should we prefer a bunch of expensive, quality 10-gigabit cards? Are
there any you would recommend?
Someone has probably asked for this already, but -- could you send a
snapshot of the top -SH output in the steady state? Let top run for a
few minutes and then copy/paste the first 10-20 lines into an e-mail.
Sure. Mind you, there are now only ~1800 entries in each of the two ipfw
tables, so the drops have stopped. But it only takes another 200-300
entries for them to start again.
155 processes: 10 running, 129 sleeping, 16 waiting
CPU: 2.4% user, 0.0% nice, 2.0% system, 9.3% interrupt, 86.2% idle
Mem: 1691M Active, 1491M Inact, 454M Wired, 130M Cache, 214M Buf, 170M Free
Swap: 2048M Total, 12K Used, 2048M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
15 root 171 ki31 0K 16K CPU3 3 22.4H 97.85% idle: cpu3
14 root 171 ki31 0K 16K CPU4 4 23.0H 96.29% idle: cpu4
12 root 171 ki31 0K 16K CPU6 6 23.8H 94.58% idle: cpu6
16 root 171 ki31 0K 16K CPU2 2 22.5H 90.72% idle: cpu2
13 root 171 ki31 0K 16K CPU5 5 23.4H 90.58% idle: cpu5
18 root 171 ki31 0K 16K RUN 0 20.3H 85.60% idle: cpu0
17 root 171 ki31 0K 16K CPU1 1 910:03 78.37% idle: cpu1
11 root 171 ki31 0K 16K CPU7 7 23.8H 65.62% idle: cpu7
21 root -44 - 0K 16K CPU7 7 19:03 48.34% swi1: net
29 root -68 - 0K 16K WAIT 1 515:49 19.63% irq256: bce0
31 root -68 - 0K 16K WAIT 2 56:05 5.52% irq257: bce1
19 root -32 - 0K 16K WAIT 5 50:05 3.86% swi4: clock sio
983 flowtools 44 0 12112K 6440K select 0 13:20 0.15% flow-capture
465 root -68 - 0K 16K - 3 51:19 0.00% dummynet
3 root -8 - 0K 16K - 1 7:41 0.00% g_up
4 root -8 - 0K 16K - 2 7:14 0.00% g_down
30 root -64 - 0K 16K WAIT 6 5:30 0.00% irq16: mfi0