Robert Watson wrote:
> Suggestions like increasing timer resolution are intended to spread out
> the injection of packets by dummynet to attempt to reduce the peaks of
> burstiness that occur when multiple queues inject packets in a burst
> that exceeds the queue depth supported by combined hardware descriptor
> rings and software transmit queue.
Raising HZ from 1000 to 2000 has helped. There are now 200-300 global
drops/s, as opposed to 300-1000 with HZ=1000. Or maybe changing
net.isr.direct from 1 to 0 helped. Or maybe raising hash_size from 64
to 256. Or maybe...
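To put rough numbers on the burst-size argument: if dummynet releases
packets once per tick, the average number of packets it injects per tick
scales as 1/HZ. A minimal Python sketch, assuming a purely illustrative
100 Mbit/s aggregate shaped rate and 1500-byte packets (neither figure is
measured on this box):

```python
def packets_per_tick(rate_bps, pkt_bytes, hz):
    """Average packets released per timer tick for a given shaped rate."""
    packets_per_second = rate_bps / (8 * pkt_bytes)
    return packets_per_second / hz


# Assumed, illustrative traffic figures (not taken from this system).
RATE_BPS = 100_000_000
PKT_BYTES = 1500

for hz in (1000, 2000, 4000):
    tick_ms = 1000 / hz  # granularity is 1s/HZ, expressed in milliseconds
    burst = packets_per_tick(RATE_BPS, PKT_BYTES, hz)
    print(f"HZ={hz}: tick={tick_ms:.2f} ms, ~{burst:.1f} packets per tick")
```

Doubling HZ halves the average burst, which is consistent with the drop
rate falling, but the sketch says nothing about the extra clock interrupt
overhead that comes with it.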
> The two solutions, then are (a) to increase the timer resolution
> significantly so that packets are injected in smaller bursts
But isn't there a risk that this actually makes things worse? From
/sys/conf/NOTES:
# The granularity of operation is controlled by the kernel option HZ whose
# default value (1000 on most architectures) means a granularity of 1ms
# (1s/HZ). Historically, the default was 100, but finer granularity is
# required for DUMMYNET and other systems on modern hardware. There are
# reasonable arguments that HZ should, in fact, be 100 still; consider,
# that reducing the granularity too much might cause excessive overhead in
# clock interrupt processing, potentially causing ticks to be missed and thus
# actually reducing the accuracy of operation.
> and (b) increase the queue capacities. The hardware queue limits likely
> can't be raised w/o new hardware, but the ifnet transmit queue sizes can
> be increased.
Can someone please say how to increase the "ifnet transmit queue sizes"?
> Timer resolution going up is almost certainly not a bad idea in your
> configuration, although does require a reboot as you have observed.
OK, I'll try HZ=4000, but this box also runs some required services
(flowtools, radius, mysql and a perl app).
> On a side note: one other possible interpretation of that statistic is
> that you're seeing fragmentation problems. Usually in forwarding
> scenarios this is unlikely. However, it wouldn't hurt to make sure you
> have LRO turned off on the network interfaces you're using, assuming
> it's supported by the driver.
I don't think fragments are the problem. The numbers are too small ;-)
$ netstat -s|fgrep fragment
5318 fragments received
147 fragments dropped (dup or out of space)
5157 fragments dropped after timeout
4088 output datagrams fragmented
8180 fragments created
0 datagrams that can't be fragmented
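To make "too small" concrete, here is a rough Python sketch that parses
the counters above and compares the fragment drops with the global drop
rate. It assumes the counters are cumulative (since boot or the last
reset) and uses 200 drops/s, the low end of the range I reported above;
the helper name parse_counters is just for illustration:

```python
NETSTAT_OUTPUT = """\
5318 fragments received
147 fragments dropped (dup or out of space)
5157 fragments dropped after timeout
4088 output datagrams fragmented
8180 fragments created
0 datagrams that can't be fragmented
"""


def parse_counters(text):
    """Map each counter label to its integer value."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        value, label = line.split(None, 1)
        counters[label] = int(value)
    return counters


counters = parse_counters(NETSTAT_OUTPUT)

# Sum both "fragments dropped ..." counters.
fragment_drops = sum(
    value for label, value in counters.items()
    if label.startswith("fragments dropped")
)

# Low end of the observed global drop rate (assumption: 200 drops/s).
OBSERVED_DROPS_PER_SEC = 200
seconds_equivalent = fragment_drops / OBSERVED_DROPS_PER_SEC

print(f"total fragment drops: {fragment_drops}")
print(f"equivalent to {seconds_equivalent:.1f} s of global drops")
```

All fragment drops ever counted add up to well under a minute's worth of
the global drops, so fragmentation can't account for them.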
LRO is not listed among the interface options, so I guess it's off:
options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
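The same check can be scripted. This is only a sketch that parses the
options line as printed above; the function name parse_options is made up
for illustration, and it only tells you which capabilities are currently
enabled, not which ones the driver supports:

```python
import re

IFCONFIG_OPTIONS = (
    "options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,"
    "JUMBO_MTU,VLAN_HWCSUM,TSO4>"
)


def parse_options(line):
    """Return (hex mask as int, set of enabled option names)."""
    match = re.search(r"options=([0-9a-f]+)<([^>]*)>", line)
    if match is None:
        raise ValueError("no options= field found")
    mask = int(match.group(1), 16)
    names = set(match.group(2).split(",")) if match.group(2) else set()
    return mask, names


mask, names = parse_options(IFCONFIG_OPTIONS)
print("LRO enabled:", "LRO" in names)
```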
_______________________________________________
freebsd-net@freebsd.org mailing list