Debugging em(4) driver
Good afternoon,

I am trying to run down the root cause of a link failure between two of my HP ProLiant DL350s. pciconf shows them as the 82571EB chip; these are on a 4-port card in the HP. We are doing some routing code in the kernel and have a call to our entry function in the forwarding path in ip_input(). We have WITNESS and INVARIANTS enabled in our kernel configuration. Our current testbed has the em3 ports of these two HPs connected via an Ethernet crossover cable. I am generating traffic using 'iperf -u -b 20M', and these links show up as 1000Mbps, so I should not be saturating the interface. The traffic is fine for a while, but then we stop seeing any traffic at all. A ping started before generating the traffic stops almost as soon as the iperf traffic begins. I cannot find any error messages indicating that there is a problem with em(4). Is there anything I can use to debug this issue? If not, then I guess I will need to put some debugging into the xmit path of em(4).

Another issue that sometimes raises its head, especially if I have a lot of printf's occurring on a per-packet basis, is a double-deallocation panic from uma_dbg_free(). When this has occurred, I have already freed the packet using m_freem() (since we don't want the packet to be forwarded), but then it looks like em(4) gets an interrupt and frees the packet again at em_init_locked()+0x91f, which may be in the receive handler code.

Any and all help/ideas will be appreciated.

Output of 'ifconfig em3':

em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c6:aa
        inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

Output of 'netstat -I em3':

Name  Mtu   Network        Address             Ipkts  Ierrs  Opkts  Oerrs  Coll
em3   1500                 00:1f:29:5f:c6:aa   11099  89322  11298      0     0
em3   1500  172.16.13.0    172.16.13.30        11096      -  11296      -     -

pciconf -lv shows:

e...@pci0:21:0:0: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82571EB Gigabit Ethernet Controller (Copper)'
    class    = network
    subclass = ethernet
e...@pci0:21:0:1: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82571EB Gigabit Ethernet Controller (Copper)'
    class    = network
    subclass = ethernet
e...@pci0:22:0:0: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82571EB Gigabit Ethernet Controller (Copper)'
    class    = network
    subclass = ethernet
e...@pci0:22:0:1: class=0x02 card=0x704b103c chip=0x10bc8086 rev=0x06 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82571EB Gigabit Ethernet Controller (Copper)'
    class    = network
    subclass = ethernet

Thanks,
Patrick
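On the uma_dbg_free() panic specifically: that signature (an explicit m_freem() plus a second free from the driver's receive path) usually points at an ownership mixup in the hook's return convention rather than at em(4) itself. A minimal sketch of the convention in question, with a hypothetical hook name and a placeholder drop test (this is not the actual entry function from the post):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

static int
drop_this_packet(struct mbuf *m)
{
	/* placeholder policy: real code applies its own routing decision */
	return (m->m_pkthdr.len == 0);
}

/*
 * Hypothetical hook called from ip_input() in the forwarding path.
 * The contract must be unambiguous about mbuf ownership: if the hook
 * consumes the packet, the caller must never touch the mbuf again.
 */
static int
my_forward_hook(struct mbuf **mp)
{
	if (drop_this_packet(*mp)) {
		m_freem(*mp);	/* we own the chain: free it exactly once */
		*mp = NULL;	/* so the caller cannot reuse the pointer */
		return (1);	/* consumed: ip_input() must stop here */
	}
	return (0);		/* not consumed: normal input path continues */
}

If the caller keeps a stale pointer to the freed chain and processing continues, the mbuf can be recycled onto a receive ring and freed a second time, which is exactly the double deallocation uma_dbg_free() reports under INVARIANTS.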
Re: Debugging em(4) driver
On 11/13/2010 02:27 PM, Ryan Stone wrote:
> It looks to me that you're getting a ton of input drops. That's
> presumably the cause of your issue. You can get the em driver to print
> debug information to the console by running:
> # sysctl dev.em.3.stats=1
> # sysctl dev.em.3.debug=1
> The output should be available in dmesg and /var/log/messages.
> Hopefully that can shed some light on the nature of the drops.

Ryan,

Thanks for the tip. But I see I forgot to mention this was FreeBSD 8.0. The em(4) driver is actually the one found in FreeBSD 8.1, as we needed the AltQ fixes. However, I do not see these sysctls in the code or when I do a 'sysctl dev.em.3'. It looks like there was a change between 8.0 and 8.1? I now see an if_lem.c which has the sysctl you are referring to. Here is the output of my sysctl:

npxk3# sysctl dev.em.3.stats=1
sysctl: unknown oid 'dev.em.3.stats'
npxk3# sysctl dev.em.3
dev.em.3.%desc: Intel(R) PRO/1000 Network Connection 7.0.5
dev.em.3.%driver: em
dev.em.3.%location: slot=0 function=1
dev.em.3.%pnpinfo: vendor=0x8086 device=0x10bc subvendor=0x103c subdevice=0x704b class=0x02
dev.em.3.%parent: pci22
dev.em.3.nvm: -1
dev.em.3.rx_int_delay: 0
dev.em.3.tx_int_delay: 66
dev.em.3.rx_abs_int_delay: 66
dev.em.3.tx_abs_int_delay: 66
dev.em.3.rx_processing_limit: 100
dev.em.3.link_irq: 0
dev.em.3.mbuf_alloc_fail: 0
dev.em.3.cluster_alloc_fail: 0
dev.em.3.dropped: 0
dev.em.3.tx_dma_fail: 0
dev.em.3.fc_high_water: 30720
dev.em.3.fc_low_water: 29220
dev.em.3.mac_stats.excess_coll: 0
dev.em.3.mac_stats.symbol_errors: 0
dev.em.3.mac_stats.sequence_errors: 0
dev.em.3.mac_stats.defer_count: 0
dev.em.3.mac_stats.missed_packets: 0
dev.em.3.mac_stats.recv_no_buff: 0
dev.em.3.mac_stats.recv_errs: 0
dev.em.3.mac_stats.crc_errs: 0
dev.em.3.mac_stats.alignment_errs: 0
dev.em.3.mac_stats.coll_ext_errs: 0
dev.em.3.mac_stats.rx_overruns: 0
dev.em.3.mac_stats.watchdog_timeouts: 0
dev.em.3.mac_stats.xon_recvd: 0
dev.em.3.mac_stats.xon_txd: 0
dev.em.3.mac_stats.xoff_recvd: 0
dev.em.3.mac_stats.xoff_txd: 0
dev.em.3.mac_stats.total_pkts_recvd: 58365716
dev.em.3.mac_stats.good_pkts_recvd: 58365716
dev.em.3.mac_stats.bcast_pkts_recvd: 9
dev.em.3.mac_stats.mcast_pkts_recvd: 0
dev.em.3.mac_stats.rx_frames_64: 16
dev.em.3.mac_stats.rx_frames_65_127: 5612
dev.em.3.mac_stats.rx_frames_128_255: 10355
dev.em.3.mac_stats.rx_frames_256_511: 29103556
dev.em.3.mac_stats.rx_frames_512_1023: 6633
dev.em.3.mac_stats.rx_frames_1024_1522: 29239544
dev.em.3.mac_stats.good_octets_recvd: 0
dev.em.3.mac_stats.good_octest_txd: 0
dev.em.3.mac_stats.total_pkts_txd: 165551
dev.em.3.mac_stats.good_pkts_txd: 165551
dev.em.3.mac_stats.bcast_pkts_txd: 8
dev.em.3.mac_stats.mcast_pkts_txd: 2
dev.em.3.mac_stats.tx_frames_64: 19
dev.em.3.mac_stats.tx_frames_65_127: 5573
dev.em.3.mac_stats.tx_frames_128_255: 10348
dev.em.3.mac_stats.tx_frames_256_511: 3308
dev.em.3.mac_stats.tx_frames_512_1023: 6680
dev.em.3.mac_stats.tx_frames_1024_1522: 139623
dev.em.3.mac_stats.tso_txd: 0
dev.em.3.mac_stats.tso_ctx_fail: 0
dev.em.3.interrupts.asserts: 0
dev.em.3.interrupts.rx_pkt_timer: 0
dev.em.3.interrupts.rx_abs_timer: 0
dev.em.3.interrupts.tx_pkt_timer: 0
dev.em.3.interrupts.tx_abs_timer: 0
dev.em.3.interrupts.tx_queue_empty: 0
dev.em.3.interrupts.tx_queue_min_thresh: 0
dev.em.3.interrupts.rx_desc_min_thresh: 0
dev.em.3.interrupts.rx_overrun: 0
dev.em.3.host.breaker_tx_pkt: 0
dev.em.3.host.host_tx_pkt_discard: 0
dev.em.3.host.rx_pkt: 0
dev.em.3.host.breaker_rx_pkts: 0
dev.em.3.host.breaker_rx_pkt_drop: 0
dev.em.3.host.tx_good_pkt: 0
dev.em.3.host.breaker_tx_pkt_drop: 0
dev.em.3.host.rx_good_bytes: 0
dev.em.3.host.tx_good_bytes: 0
dev.em.3.host.length_errors: 0
dev.em.3.host.serdes_violation_pkt: 0
dev.em.3.host.header_redir_missed: 0

Thanks,
Patrick
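Since the 8.0-era em(4) driver lacks those stats sysctls, the same counters can be watched with stock FreeBSD tools while the test runs (a sketch of what to check, not em(4)-specific commands):

npxk3# netstat -w 1 -I em3        # per-second Ipkts/Ierrs/Opkts/Oerrs for em3
npxk3# netstat -s -p ip           # IP-layer counters, e.g. output drops
npxk3# vmstat -i | grep em3       # interrupt rate on the em3 vector

Comparing Ierrs growth against the iperf send rate shows whether the receive side is dropping on the wire-facing ring or further up the stack.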
Re: routed source code
On 11/13/2010 05:23 PM, Milen Dzhumerov wrote:
> Hi all,
>
> We're investigating some ways to perform symbolic execution of
> distributed systems and we're looking for real-world programs to test.
> The "routed" daemon[1] which is included with FreeBSD seemed like a good
> candidate and I was wondering whether anyone can point me to its
> implementation location in the source code repositories.
>
> Thanks,
> Milen

Milen,

routed resides in /sbin, so look in /usr/src/sbin/routed.

Patrick
Re: Debugging em(4) driver
On 11/13/2010 09:08 PM, Jack Vogel wrote:
> The stats changed quite a bit for 8.1; they are much more informative
> now, and they can be collected from anywhere, not just the console. I
> don't quite understand what you are trying to do: debug an em problem,
> or just debug a problematic situation by using em?? Mike is right, the
> driver in HEAD has some significant fixes and would be the best thing
> to use.

We were noticing a lot of slowdown on the link between the two HPs. I would start a normal ping between the two boxes, start the traffic generation, and I would see the pings completely stop, even though I could see I was still getting traffic on the em(4) interface. Once the pings stopped, we eventually started seeing failures in our network app that needs to query its peer: it would time out trying to send a notification message to its peer. I wanted to see if there was a failure somewhere in the interface layer, such as dropped packets. Plus I wanted to ensure the ring buffer em(4) was using wasn't starving for packets.

Patrick
Re: Setting up a running FreeBSD/PCBSD system to enter kgdb on panic
On 4/5/11 10:38 AM, fbsdm...@dnswatch.com wrote:
> On Mon, April 4, 2011 5:07 pm, Eitan Adler wrote:
>> On Mon, Apr 4, 2011 at 7:35 PM, David Somayajulu wrote:
>>> Hi All,
>>> Is there some way I can set up a running FreeBSD - (I use PCBSD7.2) - to
>>> break into kgdb when the system panics? I am trying to get a stack
>>> trace when "Fatal trap 12: page fault while in kernel mode" happens.
>>
>> debug.debugger_on_panic=1
>
> Does this line go in C:\Windows\system32\win.ini?

No, it's a sysctl. Issue it as either root or via 'sudo':

% sudo sysctl debug.debugger_on_panic=1

This assumes your kernel has been built with DDB/KDB enabled.

Patrick
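For completeness, a sketch of what "built with DDB/KDB enabled" involves: the standard kernel config options, plus making the sysctl persistent across reboots.

# in the kernel configuration file
options KDB          # kernel debugger framework
options DDB          # interactive in-kernel debugger
options GDB          # remote kgdb backend (serial/firewire)

# /etc/sysctl.conf
debug.debugger_on_panic=1

With those in place, a panic drops to the ddb> prompt on the console, from which a backtrace ('bt') or a remote kgdb session can be started.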
Intel Pro/1000 PT Quad Port Bypass Server Adapter
All,

We have a requirement for fail-to-wire that we are meeting by using these types of bypass NICs. We have some from Silicom, where they provided us a modified em(4) driver, but now we have a few NICs coming that are straight from Intel. However, the website doesn't list the correct driver(s) for this card. Will the current (or even the HEAD) em(4) driver work for this type of NIC? I don't see anything in the sysctls for enabling/disabling the bypass mode.

Thanks for any help,
Patrick
Usage of IFQ_DEQUEUE vs IFQ_DRV_DEQUEUE
Can somebody confirm my assumption on the following: if I am supporting ALTQ in a driver, then I should use the IFQ_DRV_DEQUEUE() macro; if I am not supporting ALTQ, then is it okay to use the IFQ_DEQUEUE() macro? If not, what's the difference?

Slightly confused...

Thanks,
Patrick
Re: Usage of IFQ_DEQUEUE vs IFQ_DRV_DEQUEUE
On 8/24/11 11:42 AM, Sergey Kandaurov wrote:
> On 24 August 2011 22:12, Patrick Mahan wrote:
>> Can somebody confirm my assumption on the following:
>>
>> If I am supporting ALTQ in a driver, then I should use the
>> IFQ_DRV_DEQUEUE() macro. If I am not supporting ALTQ then
>> is it okay to use the IFQ_DEQUEUE() macro? If not, what's the
>> difference?
>>
>> Slightly confused...
>
> Just in case, have you read man 9 altq? It has a good description of
> these macros.

Sergey,

That is exactly what I was looking for. Don't know how I missed that (I sometimes get caught up in looking at the source).

Thanks,
Patrick
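For anyone finding this thread later, a minimal sketch of the pattern altq(9) describes (the driver name and start routine below are hypothetical): IFQ_DRV_DEQUEUE() is discipline-aware, handing the decision to the attached ALTQ discipline when one is active and falling back to the driver-managed queue otherwise, whereas plain IFQ_DEQUEUE() always operates on the default ifnet send queue.

#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/socket.h>
#include <net/if.h>
#include <net/if_var.h>

/* Hypothetical transmit-start routine for an ALTQ-capable driver. */
static void
mydrv_start_locked(struct ifnet *ifp)
{
	struct mbuf *m;

	while (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) {
		IFQ_DRV_DEQUEUE(&ifp->if_snd, m);  /* discipline-aware dequeue */
		if (m == NULL)
			break;
		/* ...program the hardware TX descriptor ring with m... */
		m_freem(m);	/* placeholder: real code frees on TX completion */
	}
}

An ALTQ-capable driver must use the IFQ_DRV_* variants consistently; mixing in IFQ_DEQUEUE() would bypass the classifier and defeat the configured queues.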
Any plans to upgrade the tftp client and server images for FreeBSD?
Not sure if this is the correct list, but I am working as part of a kernel team that is using FreeBSD 8.0 for its base OS. We have had an ongoing issue with our bootloader (u-boot): it is unable to tftp from the tftp server running on our FreeBSD server. We traced the issue down to the tftp code in u-boot using the 'blksize' option and not handling the option NAK correctly. Since we didn't want to require a change in the bootloader, it was instead decided to fix the tftp server to support RFC 2348. After looking around the internet, we found that the tftp server under NetBSD did support RFC 2348. This made it an easy port: a one-line change to the usr.bin/tftp/Makefile and a slight change to libexec/tftpd.c (changed the name of an internal function from 'sendfile' back to 'xmitfile'). It has been working just fine for us.

So I have been tasked with asking if the FreeBSD developers would like this code for future inclusion (or one of the current developers could just grab it from NetBSD). Reading the website, it seems that to contribute we need to be running -CURRENT, which is not currently possible (among other reasons, we are using 8.0; this is actually a recent upgrade, as we were previously using FreeBSD 6.2). So if this is something that could be useful, I have the code and a patch to modify the original NetBSD code to contribute. Also, if it has already been done, then I was not able to find it (I tried the CVS and SVN web source browsers and did not see any changes related to adding RFC 2348 support).

Thanks for listening,
Patrick
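For context, the negotiation RFC 2348 adds (on top of the RFC 2347 option mechanism) looks roughly like this; the filename and block size below are illustrative, not taken from the setup described above:

client -> server:  RRQ "uImage", mode "octet", option blksize=1468
server -> client:  OACK blksize=1468           (server accepts the option)
client -> server:  ACK block 0                 (acknowledges the OACK)
server -> client:  DATA block 1 (1468 bytes of payload)
...

A server that predates RFC 2347/2348 is expected to ignore the unknown option and simply answer with a plain 512-byte DATA block 1; the u-boot bug described above was in handling that fallback and the option-refusal case.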
Re: Anon port selection
See inline -

Janne Huttunen wrote:
> Hi!
>
> The selection of an anonymous port in FreeBSD seems to act a bit weird
> (bug?). This was first observed in actual use on FreeBSD 6.2, but I
> have verified that it behaves the same on a December snapshot of
> CURRENT too.
>
> 1. A process creates a UDP socket and sends a packet from it (at which
>    point a local port is assigned for it).
> 2. Another process creates a UDP socket, sets SO_REUSEADDR (or
>    SO_REUSEPORT) and sends a packet from it (at which point a local
>    port is assigned for it).
>
> Every now and then it happens that the second process gets the same
> local port as the first one. If the second process doesn't set the
> socket option, this won't happen. Note however, that the first process
> does not have to cooperate in any way, i.e. it does not set any
> options.
>
> Now, I'm fairly newbie when it comes to the FreeBSD IP stack, but it
> seems to me that this phenomenon is caused by the code in
> in_pcbconnect_setup(). If the local port is zero, in_pcbbind_setup()
> is called to select a port. That routine is called with the local
> address set to the source address selected for the outgoing packet,
> but when the port has been selected, it is committed with INADDR_ANY
> as the local address. Then when the second process in
> in_pcbbind_setup() tries to check if the port is already in use, it
> won't match the INADDR_ANY and assigns the same port again.

Well, it has been almost 20 years since I first ran across this issue, and I was told back then that it was "as designed". I believe you will see that this only happens when INADDR_ANY is in effect. If instead you use a specific IP address as your source, it should not happen. I have not had a chance to really go over the FreeBSD TCP/IP stack since the beginnings of FreeBSD back in the early 90's (we were using basically the same code for our product on a different architecture). As an example of what the person was explaining, he pointed to the BIND code, which expressly binds to each interface IP address instead of to INADDR_ANY, to prevent snooping.

I apologize if I am somewhat off base, having only re-entered playing with FreeBSD in the last few months.

Patrick
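To make that workaround concrete, a sketch of binding the sending socket to a specific source address before the first send (standard sockets API; the helper name and address argument are illustrative). With a concrete local address on record, the in-use check in in_pcbbind_setup() has something other than INADDR_ANY to match against:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

int
make_udp_socket(const char *ifaddr)
{
	struct sockaddr_in sin;
	int s;

	if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = inet_addr(ifaddr);  /* specific source IP */
	sin.sin_port = 0;                         /* kernel picks the port */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		return (-1);
	return (s);
}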
Issues with em(4) device under FreeBSD 8.0
All,

I have seen a few mentions on the mailing lists of issues with em(4) and FreeBSD 8.0 with regard to throughput. We are also seeing similar issues on HP ProLiant systems with the HP GE interfaces. Previously we were running FreeBSD 6.2, and iperf was showing ~900 Mbits/sec between two directly connected systems. After the upgrade, iperf only shows around ~350 Mbits/sec. This seems only to be happening on the HPs. When we upgraded another x86 box (privately built), we saw ~900 Mbits/sec even to one of the HP systems. I haven't seen anything yet to account for this behavior. Has anyone else seen similar issues?

Thanks,
Patrick
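For context, numbers like the above come from a plain iperf TCP run between the directly connected boxes; a typical invocation (the address is from the testbed in the earlier posts, and the durations are arbitrary):

receiver:  iperf -s
sender:    iperf -c 172.16.13.30 -t 60 -i 10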
freebsd-net@freebsd.org
> [12] http://caia.swin.edu.au/newtcp/tools/caia_modularcc_v0.9.4_9.x.r203910.patch
> [13] http://caia.swin.edu.au/newtcp/tools/modularcc-readme-0.9.4.txt

I believe these are incorrect. I find these documents at the following URLs:

[12] http://caia.swin.edu.au/urp/newtcp/tools/caia_modularcc_v0.9.4_9.x.r203910.patch
[13] http://caia.swin.edu.au/urp/newtcp/tools/modularcc-readme-0.9.4.txt

Thanks,
Patrick
Multicast under FBSD 8.0
All,

Hoping for a little insight, as I am not a user of multicast, nor do I know much about the servers that use it. In my day job, I am helping with moving my company's product from FreeBSD 6.2 (i386) to FreeBSD 8.0 (amd64). One of the daemons wants to use the 224.0.0.9 (routed? RIP?) multicast group. The problem is this worked fine on 6.2, but when we moved to 8.0 the daemon started reporting "Network unreachable" errors when it was trying to send a packet out to the multicast group. I tracked it down to the following in the routing table:

% netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            10.10.1.1          UG          0        0   bce0
10.10.0.0/16       link#5             U           3     1253   bce0
...
224.0.0.2          127.0.0.1          UH          0        0    lo0
224.0.0.9          127.0.0.1          UH          0        0    lo0

Notice that 224.0.0.9 has a route pointing to the loopback interface, even though the code uses the IP_MULTICAST_IF socket option to specify the interface. If this entry does not exist, or points to a true physical interface, then there is no issue. I did some research on this and found this code all changed in in_pcb.c as part of revision 105629 for FreeBSD 7.2, but I don't understand the change and why the loopback was no longer allowed. I get asked daily by the developers of this daemon for the reason, so I was hoping to get some enlightenment here.

Thanks for listening,
Patrick
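For reference, the relevant piece of what the daemon does is the standard IP_MULTICAST_IF setsockopt, sketched below (the helper name is illustrative), and the stale route can be cleared by hand:

#include <sys/socket.h>
#include <netinet/in.h>

/* Pin outgoing multicast on socket s to the interface owning ifaddr. */
static int
set_mcast_if(int s, struct in_addr ifaddr)
{
	return (setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
	    &ifaddr, sizeof(ifaddr)));
}

Removing the stale host route so the socket option can take effect:

# route delete 224.0.0.9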
Re: Multicast under FBSD 8.0
Pierre,

The RIP source is all the BSD boxes in the current broadcast domain that run our product. The app does pick which interface to send the message out on and sets that using the appropriate multicast setsockopt options. It is built for 8.0 (or rather, it is built using the 8.0 toolchain :-)). However, I believe the loopback route was something that was automatically configured for each box under 6.2 as part of the normal configuration. We have since removed it, and the app is now working fine. I was being asked why the change occurred, and since multicast is not something I am familiar with, I turned to this list.

Thanks,
Patrick

Pierre Lamy wrote:
> Multicast traffic doesn't get routed in a traditional sense; it sort
> of gets repackaged for delivery to requesting recipients. And 224/24
> should never get retransmitted, it's for within a broadcast domain
> only. Is the RIP source the BSD box itself? If so, the app should
> determine what interfaces to send on, and then use that. Can you
> recompile the daemon for 8?
>
> Pierre
>
> Patrick Mahan wrote:
>> [original message quoted above, snipped]
Looking for some education on ALTQ
I am the first to admit I don't understand ALTQ and its impact on QoS, but that said, I am trying to learn. I have the following three systems, all running FreeBSD 8.0-p2 Release. I am attempting to learn how AltQ can be used to prioritize traffic and set up bandwidth pipelines. At the end of this message is the topology and all (I hope) the relevant info for someone to help me understand what is happening.

I have set up AltQ on em0 on NPX3 with a queue set to run at 1.9 Mbps. However, in testing by generating UDP traffic on NPX4 and having it received on NPX3, I am seeing what I believe to be really low throughput scores for queue test7788 (see the pf.conf below). pfctl -vv -s queue shows the bandwidth starts at 4.84 Kb/s, not the 1.9 Mbps I was expecting.

Am I
1. not driving the datastream high enough to eat 1.9 Mbps (what should I run as my iperf -b value)?
2. misconfiguring AltQ? or
3. is there a bug in AltQ?

Thanks for the education -

Patrick

Network topology:

 +------+          +------+
 |      |          |      |
 | NPX4 |          | NPX3 |
 | (em1)+==========+(em2) |
 |      |          |      |
 |      |          |(em0) |
 +------+          +--+---+
                      I
                      I
                      I
                   +--+---+
                   |(em0) |
                   |      |
                   | NPX1 |
                   |      |
                   +------+

NPX4:
em1: 172.16.34.40/24
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c3:b8
        inet 172.16.34.40 netmask 0xffffff00 broadcast 172.16.34.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX3:
em2: 172.16.34.30/24
em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c6:ab
        inet 172.16.34.30 netmask 0xffffff00 broadcast 172.16.34.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
em0: 172.16.13.30/24
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c6:a9
        inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX1:
em0: 172.16.13.10/24
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1c:c4:47:1a:35
        inet 172.16.13.10 netmask 0xffffff00 broadcast 172.16.13.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX4 IPv4 routing table:

npx4# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs       Use  Netif Expire
default            10.10.1.1          UG          0       248   bce0
10.10.0.0/16       link#5             U           4   1000928   bce0
10.10.20.44        link#5             UHS         0    296522    lo0
122.16.1.0/24      122.16.2.11        UGS         0     40392    em3
122.16.1.3         122.16.2.11        UGHS        0    251513    em3
122.16.2.0/24      122.16.2.3         U           0         3    em3
122.16.2.3         link#4             UHS         0         0    lo0
127.0.0.0/8        127.0.0.1          UR          0         0    lo0
127.0.0.1          link#8             UH          0  45442955    lo0
172.16.13.0/24     172.16.34.30       UGS         0   1754682    em1
172.16.24.0/24     172.16.24.40       U           0   7805039    em0
172.16.24.40       link#1             UHS         0         0    lo0
172.16.34.0/24     172.16.34.40       U           0     10425    em1
172.16.34.40       link#2             UHS         0         0    lo0

NPX3 IPv4 routing table:

npx3# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs       Use  Netif Expire
default            10.10.1.1          UG          0         0   bce0
10.10.0.0/16       link#5             U           5     21972   bce0
10.10.20.43        link#5             UHS         0      5113    lo0
127.0.0.0/8        127.0.0.1          UR          0         0    lo0
127.0.0.1          link#9             UH          0    310084    lo0
172.16.13.0/24     link#1             U           0   1754798    em0
172.16.13.30       link#1             UHS         0         0    lo0
172.16.23.0/24     link#2             U           0       138    em1
172.16.23.30
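The pf.conf referenced above did not survive in this archive. Based on the queue name and the pfctl output in the follow-up posts, it would have been something like the following sketch (the pass rule is illustrative; since test7788 is the default queue here, unmatched traffic lands in it anyway):

altq on em0 cbq bandwidth 1Gb queue { test7788 }
queue test7788 bandwidth 1.90Mb cbq(default)

pass out on em0 inet proto udp all keep state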
AltQ throughput issues (long message)
All,

I am looking for (again) some understanding of AltQ and how it works w.r.t. packet throughput. I posted earlier this month regarding how to initially configure AltQ (thanks to everyone's help) and now have it working over the em(4) driver on a FreeBSD 8.0 platform (HP DL350 G5). I had to bring the em(4) driver over from the 8-STABLE branch, but it is working just fine so far (I needed to add drbr_needs_enqueue() to if_var.h).

I have now gone back to trying to set up one queue with a bandwidth of 1900 Kb/s (1.9 Mb/s). I ran a test with 'iperf' using UDP and setting the bandwidth to 25 Mb/s. I then ran a test setting the queue bandwidth to 20 Mb/s and running 'iperf' again using UDP and 25 Mb/s bandwidth. In both cases, the throughput only seems to be 89% of the requested throughput.

Test 1: AltQ queue bandwidth 1.9 Mb/s, iperf -b 25M

pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
  [ pkts:      28298  bytes:   42771988  dropped pkts:      0 bytes:         0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
  [ measured:   140.8 packets/s, 1.70Mb/s ]
queue test7788 on em0 bandwidth 1.90Mb cbq( default )
  [ pkts:      28298  bytes:   42771988  dropped pkts: 397077 bytes: 600380424 ]
  [ qlength:  50/ 50  borrows:      0  suspends:   3278 ]
  [ measured:   140.8 packets/s, 1.70Mb/s ]

iperf reported:

[ ID] Interval       Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  3] 0.0-200.4 sec  39.7 MBytes  1.66 Mbits/sec  6.998 ms  397190/425533 (93%)

Test 2: AltQ queue bandwidth 20 Mb/s, iperf -b 25M

pfctl -vv -s queue reported:

queue root_em0 on em0 bandwidth 1Gb priority 0 cbq( wrr root ) {test7788}
  [ pkts:     356702  bytes:  539329126  dropped pkts:      0 bytes:         0 ]
  [ qlength:   0/ 50  borrows:      0  suspends:      0 ]
  [ measured:  1500.2 packets/s, 18.15Mb/s ]
queue test7788 on em0 bandwidth 20Mb cbq( default )
  [ pkts:     356702  bytes:  539329126  dropped pkts: 149198 bytes: 225587376 ]
  [ qlength:  46/ 50  borrows:      0  suspends:  39629 ]
  [ measured:  1500.2 packets/s, 18.15Mb/s ]

iperf reported:

[ ID] Interval       Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  3] 0.0-240.0 sec  505 MBytes   17.6 Mbits/sec  0.918 ms  150584/510637 (29%)

Why can AltQ not drive it at full bandwidth? This is just some preliminary testing, but I want to scale this up to use all available AltQ CBQ queues for various operations. As always, my knowledge is increased when I ask questions on this list.

Thanks,
Patrick

=== Test Results ===

Network topology:

 +------+          +------+
 |      |          |      |
 | NPX8 |          | NPX3 |
 | (em1)+==========+(em3) |
 |      |          |      |
 |      |          |(em0) |
 +------+          +--+---+
                      I
                      I
                      I
                   +--+---+
                   |(em0) |
                   |      |
                   | NPX6 |
                   |      |
                   +------+

NPX8:
em1: 172.16.38.80/24
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1c:c4:48:93:10
        inet 172.16.38.80 netmask 0xffffff00 broadcast 172.16.38.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX3:
em3: 172.16.38.30/24
em3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c6:aa
        inet 172.16.38.30 netmask 0xffffff00 broadcast 172.16.38.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
em0: 172.16.13.30/24
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1f:29:5f:c6:a9
        inet 172.16.13.30 netmask 0xffffff00 broadcast 172.16.13.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX6:
em0: 172.16.13.60/24
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:1c:c4:48:95:d1
        inet 172.16.13.60 netmask 0xffffff00 broadcast 172.16.13.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

NPX8 IPv4 routing table:

npx8# netstat -nr
Routing tables

Internet:
Destination        Gateway            Flags    Refs       Use  Ne
Re: AltQ throughput issues (long message)
See my responses inline -

PLM

On 07/30/2010 04:30 PM, Luigi Rizzo wrote:
> On Fri, Jul 30, 2010 at 04:07:04PM -0700, Patrick Mahan wrote:
>> All,
>>
>> I am looking for (again) some understanding of AltQ and how it works
>> w.r.t. packet throughput. [...]
>> In both cases, the throughput only seems to be 89% of the requested
>> throughput.
>
> part of it can be explained because AltQ counts the whole packet
> (e.g. 1514 bytes for a full frame) whereas iperf only considers the
> UDP payload (e.g. 1470 bytes in your case).

Okay, but that only accounts for 3%, and I am seeing around 11%. Any idea what might account for the remaining 8%?

> The other thing you should check is whether there is any extra traffic
> going through the interface that competes for the bottleneck bandwidth.
> You have such huge drop rates in your tests that i would not be
> surprised if you had ICMP packets going around trying to slow down
> the sender.

No extra traffic. All machines have a bce0 (not shown in the diagram) that acts as the management port on a 10.10.0.0 network. Neither NPX8 nor NPX6 has forwarding enabled. That said, NPX3, where AltQ is enabled, does have IP forwarding enabled, but it should not be trying to route any packets incoming on bce0; I will need to confirm that. Will running pfctl for a few seconds before starting iperf be enough? Should I create a second queue to capture these possible 'extra' packets?

As for the em(4) interfaces, they are all connected directly to each other using Cat 6 crossover cables; no hub/switch is involved.

Where do you see the drops? If you are looking at the end of the pfctl output, that is probably occurring after iperf has finished its run. I have noticed that the queue bandwidth numbers sharply decline after iperf has finished.

> BTW have you tried dummynet in your config?

How would you suggest using dummynet? Is it workable for a QoS solution?
Thanks as always,

Patrick

> cheers
> luigi
>
> [quoted test results and topology snipped; they duplicate the original
> message above]
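Two follow-ups on the numbers in this exchange. First, the payload-vs-frame accounting Luigi describes works out as 1470/1514 = 0.971, so a queue shaped to 1.90 Mb/s of frames should still deliver about 1.90 x 0.971 = 1.84 Mb/s of UDP payload; iperf measured 1.66 Mb/s, leaving roughly the 8-10% shortfall discussed above unexplained by framing alone.

Second, on the dummynet question: a sketch of equivalent shaping with ipfw(8)/dummynet(4), assuming ipfw is enabled on the box (standard syntax; the rule number, queue depth, and UDP match are illustrative):

# kldload dummynet
# ipfw pipe 1 config bw 1900Kbit/s queue 50
# ipfw add 100 pipe 1 udp from any to any out xmit em0

A dummynet pipe enforces a hard bandwidth cap per matching flow set, so it is usable for simple rate limiting; CBQ-style class hierarchies with borrowing remain ALTQ territory.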