On Wed, 7 Oct 2009, rihad wrote:
Suggestions like increasing the timer resolution are intended to spread out
dummynet's packet injection, to reduce the peaks of burstiness that occur
when multiple queues inject packets in a burst that exceeds the combined
depth of the hardware descriptor rings and the software transmit queue.
Raising HZ from 1000 to 2000 has helped. There are now 200-300 global
drops/s, as opposed to 300-1000 with HZ=1000. Or maybe changing net.isr.direct
from 1 to 0 helped. Or maybe raising hash_size from 64 to 256. Or maybe...
Or maybe other random factors such as traffic load corresponding to major
sports events, etc. :-)
It's also possible that combining multiple changes cancels out the effect of
one or another change. Given the rather large number of possible
combinations of things to try, I'd suggest being fairly strategic in how you
try them. Just the original config plus a significant HZ increase is probably
the best starting point. Changing hash_size is really about reducing CPU use,
so if, on the whole, you're not getting close to the capacity of a core for
any given thread involved in the work, it may not make much difference
(tuning these data structures is a bit of a black art).
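For reference, a sketch of how that tuning is usually applied (assuming the
stock dummynet sysctl name; it sets the default hash-table size used for new
dynamic queues):

$ sysctl net.inet.ip.dummynet.hash_size=256

It can also be set per pipe/queue with the "buckets" keyword in the ipfw
configuration.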
The two solutions, then, are (a) to increase the timer resolution
significantly so that packets are injected in smaller bursts
But isn't it bad that raising it can actually make things worse? From /sys/conf/NOTES:
# The granularity of operation is controlled by the kernel option HZ whose
# default value (1000 on most architectures) means a granularity of 1ms
# (1s/HZ). Historically, the default was 100, but finer granularity is
# required for DUMMYNET and other systems on modern hardware. There are
# reasonable arguments that HZ should, in fact, be 100 still; consider,
# that reducing the granularity too much might cause excessive overhead in
# clock interrupt processing, potentially causing ticks to be missed and thus
# actually reducing the accuracy of operation.
Right: we fire the timer on every CPU at 1/HZ seconds, which means quite a lot
of work being done. On systems where timers are proportionally more expensive
-- when using hardware virtualization, for example -- we do recommend tuning
the timers down. And our boot loader will actually do it for you: we
auto-detect vmware, parallels, kqemu, virtualbox, etc., and adjust the timer
rate from 1000 to 100 during boot.
That said, in your configuration I see little argument for a lower timer rate:
you need to burst packets at frequent intervals or risk overfilling queues,
and the overhead of additional timer ticks on your system shouldn't be too
bad, as you have both very fast hardware and a lot of idle time.
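To put rough, purely illustrative numbers on that (not taken from your setup):
shaping 100 Mbit/s of 1500-byte packets means roughly 8,000 packets/s. At
HZ=1000 dummynet gets a shot every 1ms and must inject ~8 packets per tick,
while at HZ=4000 each tick carries ~2, so the worst-case burst when many pipes
fire in the same tick shrinks by the same factor.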
I would suggest making just the HZ -> 4000 change for now and see how it goes.
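For completeness: since HZ is a boot-time tunable, the change goes in
/boot/loader.conf and takes effect on the next reboot:

kern.hz="4000"

The running value can be confirmed afterwards with "sysctl kern.clockrate".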
and (b) increase the queue capacities. The hardware queue limits likely
can't be raised w/o new hardware, but the ifnet transmit queue sizes can be
increased.
Can someone please say how to increase the "ifnet transmit queue sizes"?
Unfortunately, I fear that this is driver-specific, and in the case of bce
requires a recompile. In the driver init code in if_bce, the following code
appears:
ifp->if_snd.ifq_drv_maxlen = USABLE_TX_BD;
IFQ_SET_MAXLEN(&ifp->if_snd, ifp->if_snd.ifq_drv_maxlen);
IFQ_SET_READY(&ifp->if_snd);
USABLE_TX_BD evaluates to an architecture-specific value due to the varying
page size. You might just try forcing it to 1024.
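In other words, an untested sketch of the change, simply replacing the
computed value with a constant:

ifp->if_snd.ifq_drv_maxlen = 1024;	/* was USABLE_TX_BD */
IFQ_SET_MAXLEN(&ifp->if_snd, ifp->if_snd.ifq_drv_maxlen);
IFQ_SET_READY(&ifp->if_snd);

followed by a rebuild of the kernel or the if_bce module.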
Raising the timer resolution is almost certainly not a bad idea in your
configuration, although it does require a reboot, as you have observed.
OK, I'll try HZ=4000, but there are some required services
(flowtools/radius/mysql/a perl app) also running on this box.
That should be fine.
On a side note: one other possible interpretation of that statistic is that
you're seeing fragmentation problems. Usually in forwarding scenarios this
is unlikely. However, it wouldn't hurt to make sure you have LRO turned
off on the network interfaces you're using, assuming it's supported by the
driver.
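Something like the following, assuming the driver exposes the flag (the
interface name here is just a guess):

$ ifconfig bce0 -lro

After that, "ifconfig bce0" should no longer show LRO in the options field.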
I don't think fragments are the problem. The numbers are too small ;-)
$ netstat -s|fgrep fragment
5318 fragments received
147 fragments dropped (dup or out of space)
5157 fragments dropped after timeout
4088 output datagrams fragmented
8180 fragments created
0 datagrams that can't be fragmented
There's no such option as LRO shown, so I guess it's off:
options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
That probably rules that out as a source of problems, then.
Robert