Re: dummynet dropping too many packets
rihad wrote:
> The change definitely helped! There are now more than 3200 users online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 once or twice) consistent drops per second as opposed to several hundred up to 1000-1500 packets dropped per second before the rebuild. What's interesting is that the drops now began only after the ipfw table had around 3000 entries, not 2000 like before, so the change definitely helped. Just how high can maxlen be? Should I try 2048? 4096?

Is HZ still 4000?
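For reference, a minimal sketch of how to check what the running kernel was actually built with before deciding whether to change HZ; both OIDs are stock FreeBSD sysctls, nothing here is specific to this box:

    sysctl kern.clockrate   # shows the effective hz/tick/profhz/stathz values
    sysctl kern.hz          # the compiled-in (or loader-tuned) HZ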
Re: dummynet dropping too many packets
On Sat, 17 Oct 2009, rihad wrote:
> P.S.: BTW, there's a small admin-type inconsistency in FreeBSD 7.1: /etc/rc.firewall gets executed before values set by /etc/sysctl.conf are in effect, so "queue 2000" isn't allowed in ipfw pipe rules (as net.inet.ip.dummynet.pipe_slot_limit is only 100 by default), so the rules are silently failing without any trace in the log files - I only saw the errors at the console.

This is awkward to fix for sysctls, because the firewall module may not be loaded until the firewall stage of the boot process, so the sysctl wouldn't take effect (and perhaps this is what you're seeing, in fact?). Some sysctls have associated loader tunables, which you can set in /boot/loader.conf (and which affect configuration when the module is loaded), but it looks like that isn't true for net.inet.ip.dummynet.pipe_slot_limit.

Robert N M Watson
Computer Laboratory
University of Cambridge
Re: dummynet dropping too many packets
On Sat, 17 Oct 2009, rihad wrote:
>> Just rebooted with the "ifp->if_snd.ifq_drv_maxlen = 1024;" kernel, all ok so far. There's currently only 1000 or so entries in the ipfw table and around 350-400 net mbps load, so I'll wait a few hours for the numbers to grow to >2000 and 460-480 respectively and see if the drops still occur.
>
> The change definitely helped! There are now more than 3200 users online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 once or twice) consistent drops per second as opposed to several hundred up to 1000-1500 packets dropped per second before the rebuild. What's interesting is that the drops now began only after the ipfw table had around 3000 entries, not 2000 like before, so the change definitely helped. Just how high can maxlen be? Should I try 2048? 4096?

Sure, those should both be safe to use in your configuration, although as the numbers get higher, potential kernel memory use increases, as does the risk of starvation for clusters. Keep an eye on "netstat -m" errors to see if you are reaching configured resource limits (which you've probably increased already).

Robert
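A small sketch of the kind of monitoring being suggested here, using only stock FreeBSD tools and sysctls; the grep pattern is just an example:

    netstat -m                      # mbuf/cluster usage and "requests denied" counters
    sysctl kern.ipc.nmbclusters     # configured cluster limit
    vmstat -z | grep -i mbuf        # per-zone allocation failures, if any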
Re: kern/137776: [rum] panic in rum(4) driver on 8.0-BETA2
The following reply was made to PR kern/137776; it has been noted by GNATS.

From: "O.Herold"
To: bug-follo...@freebsd.org, f...@freebsd.org
Subject: Re: kern/137776: [rum] panic in rum(4) driver on 8.0-BETA2
Date: Sat, 17 Oct 2009 11:38:35 +0200

Hi,

there is a fix for this kind of bug. I tried it myself (FreeBSD 8.0 RC1) and it works like a charm. I had a stable connection without any panic (the first one since using the if_rum driver in FreeBSD; see the PRs) for several hours while downloading and installing different packages on a new system.

http://lists.freebsd.org/pipermail/freebsd-current/2009-October/012659.html

It would be nice to see this fix in stable; I think it's too late for the release.

Cheers,
Oliver Herold

--
F!XMBR: http://www.fixmbr.de
Re: dummynet dropping too many packets
Julian Elischer wrote:
> rihad wrote:
>> The change definitely helped! There are now more than 3200 users online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 once or twice) consistent drops per second as opposed to several hundred up to 1000-1500 packets dropped per second before the rebuild. What's interesting is that the drops now began only after the ipfw table had around 3000 entries, not 2000 like before, so the change definitely helped. Just how high can maxlen be? Should I try 2048? 4096?
>
> Is HZ still 4000?

No, I've set it to 2000 as per recommendations for HZ in NOTES. Should I try 4000? 6000? 8000? Or maybe just increase the bce queue length and rebuild? :)
Re: dummynet dropping too many packets
rihad wrote:
> Just rebooted with the "ifp->if_snd.ifq_drv_maxlen = 1024;" kernel, all ok so far. There's currently only 1000 or so entries in the ipfw table and around 350-400 net mbps load, so I'll wait a few hours for the numbers to grow to >2000 and 460-480 respectively and see if the drops still occur.

I'm not sure of anything now... It's 7 p.m. here, and during this busy time of day in terms of network use there are 350-500, up to 600, drops per second at around 530-550 mbps net load. This is roughly equivalent to 2-7 mbps dropped on output. It might be better than before. The next thing I'll try is bce queue maxlen 1024 -> 2048, and HZ 2000 again "back" to 4000.
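A minimal sh sketch (not from the thread) for watching the drop rate directly instead of sampling counters by hand; it assumes the "output packets dropped" line of netstat -s is the counter of interest:

    prev=$(netstat -s -p ip | awk '/output packets dropped/ {print $1}')
    while sleep 1; do
        cur=$(netstat -s -p ip | awk '/output packets dropped/ {print $1}')
        echo "$(date +%T)  drops/sec: $((cur - prev))"
        prev=$cur
    done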
Re: dummynet dropping too many packets
Robert Watson wrote:
> On Sat, 17 Oct 2009, rihad wrote:
>> P.S.: BTW, there's a small admin-type inconsistency in FreeBSD 7.1: /etc/rc.firewall gets executed before values set by /etc/sysctl.conf are in effect, so "queue 2000" isn't allowed in ipfw pipe rules (as net.inet.ip.dummynet.pipe_slot_limit is only 100 by default), so the rules are silently failing without any trace in the log files - I only saw the errors at the console.
>
> This is awkward to fix for sysctls, because the firewall module may not be loaded until the firewall stage of the boot process, so the sysctl wouldn't take effect (and perhaps this is what you're seeing, in fact?).

Well, my kernel is built with IPFIREWALL enabled, so the ipfw module isn't needed and doesn't get loaded automatically. I still think it's the order of execution that matters. For now I've worked around the problem by setting the sysctls explicitly in /etc/rc.firewall right before configuring the pipes:

/sbin/sysctl net.inet.ip.dummynet.hash_size=512
/sbin/sysctl net.inet.ip.dummynet.pipe_slot_limit=2000

and commenting them out in /etc/sysctl.conf with an XXX. Now I see that this is also why setting net.inet.ip.dummynet.hash_size in sysctl.conf had no effect on the hash table size at the time the pipes were created.

> Some sysctls have associated loader tunables, which you can set in /boot/loader.conf (and affect configuration when the module is loaded), but it looks like that isn't true for net.inet.ip.dummynet.pipe_slot_limit.
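For reference, a sketch of what that workaround looks like in context; the two sysctl lines are the ones from the message, ${fwcmd} is the variable /etc/rc.firewall already uses for the ipfw binary, and the pipe number and bandwidth are hypothetical examples:

    # in /etc/rc.firewall, raise the dummynet limits before any pipe is configured
    /sbin/sysctl net.inet.ip.dummynet.hash_size=512
    /sbin/sysctl net.inet.ip.dummynet.pipe_slot_limit=2000

    # only then are queue sizes above the 100-slot default accepted
    ${fwcmd} pipe 1 config bw 512Kbit/s queue 2000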
Re: dummynet dropping too many packets
On 2009-Oct-04 18:47:23 +0500, rihad wrote:
> Hi, we have around 500-600 mbit/s traffic flowing through a 7.1R Dell PowerEdge w/ 2 GigE bce cards. There are currently around 4 thousand ISP users online limited by dummynet pipes of various speeds. According to netstat -s output around 500-1000 packets are being dropped every second (this accounts for wasting around 7-12 mbit/s worth of traffic according to systat -ifstat):

This has been a most interesting thread. A couple of comments:

Traffic shaping only works cleanly on TCP flows - UDP has no feedback mechanism and so will not automatically throttle to fit into the available bandwidth, potentially leading to high packet drops within dummynet. Is it possible that some of your customers are heavily using UDP? Have you tried allowing just UDP traffic to bypass the pipes to see if this has any effect on the drop rate?

The pipe lists you posted showed that virtually all the packet drops are associated with one or two IP addresses. If this is really true, rather than a measurement artifact, you might find it useful to tcpdump those addresses and see if there's anything unusual in the data being passed. Also, if you monitor the pipe lists following a cold start, do those addresses appear early and just not show any packet loss until the total number of users builds up, or do they not appear until later and immediately show packet loss?

Looking at how 'output packets dropped due to no bufs, etc.' is counted (ipstat.ips_odropped), if you run 'netstat -id', do you see a large number of drops on bce1 (consistent with the "output packets dropped" counts) or not? This will help narrow down the codepath being followed by dropped packets.

Since the problem only appears to manifest when table(0) exceeds 2000 entries, have you considered splitting (at least temporarily) that table (and possibly table(2)) into two (eg table(0) and table(4))? This would help rule out an (unlikely) problem with table sizes. Doing so would require the application to split the users across both tables (eg round-robin or based on one of the bits in the IP address) and then duplicating the relevant ipfw rules - eg:

01060 pipe tablearg ip from any to table(0) out recv bce0 xmit bce1
01061 pipe tablearg ip from any to table(4) out recv bce0 xmit bce1
01070 allow ip from any to table(0) out recv bce0 xmit bce1
01071 allow ip from any to table(4) out recv bce0 xmit bce1

(And I agree that re-arranging rules to reduce the number of repeated tests should improve ipfw efficiency.)

The symptoms keep making me think "lock contention" - but I'm not sure how to measure that cheaply (AFAIK, LOCK_PROFILING is comparatively expensive).

Finally, are you running i386 or amd64?

--
Peter Jeremy
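A minimal sh sketch of the suggested split, assuming a hypothetical users.txt holding "address pipe-number" pairs and using the low bit of the last octet to pick the table:

    while read addr pipe; do
        last=${addr##*.}
        if [ $((last % 2)) -eq 0 ]; then
            ipfw table 0 add "$addr" "$pipe"
        else
            ipfw table 4 add "$addr" "$pipe"
        fi
    done < users.txt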
Re: dummynet dropping too many packets
rihad wrote:
> Julian Elischer wrote:
>> rihad wrote:
>>> The change definitely helped! There are now more than 3200 users online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 once or twice) consistent drops per second as opposed to several hundred up to 1000-1500 packets dropped per second before the rebuild. What's interesting is that the drops now began only after the ipfw table had around 3000 entries, not 2000 like before, so the change definitely helped. Just how high can maxlen be? Should I try 2048? 4096?
>>
>> Is HZ still 4000?
>
> No, I've set it to 2000 as per recommendations for HZ in NOTES. Should I try 4000? 6000? 8000? Or maybe just increase the bce queue length and rebuild? :)

You could try combinations.
Re: Native support for AutoIP (aka LLA, RFC 3927).
On Fri, Oct 16, 2009 at 4:38 PM, Martin Garon wrote:
> Hi,
>
> I need to implement AutoIP in my embedded FW that uses a snapshot of the FreeBSD 4.4 network stack.
>
> I could not find any support for it in the latest development cvs tree. Any chance it is somewhere that I missed?
>
> If there is no support, could anyone suggest a good approach to this? I am thinking of porting libpcap in order to access the data link layer to intercept/inject some ARP packets.
>
> All comments welcomed,

Check out the Avahi implementation of IPv4 Link-Local (RFC 3927), in ports under net/avahi-autoipd.

Good luck,
Dave H
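For reference, building it from the ports tree is the usual recipe; the port path is the one named above:

    cd /usr/ports/net/avahi-autoipd && make install clean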
Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]
Hi Robert,

Apologies for not getting back to this earlier.

On Mon, Oct 12, 2009 at 6:46 AM, Robert N. M. Watson wrote:
> Looks like a NULL pointer dereference, so perhaps a more traditional bug -- could you convert ifindex_alloc_locked+0x71 to a line of code? You can do this using kgdb on the kernel symbols file, perhaps "l *ifindex_alloc_locked+0x71".

It is the for loop in the ifindex_alloc_locked() function:

for (idx = 1; idx <= V_if_index; idx++)

idx is a local variable, so I figured V_if_index is what is causing the page fault. It does look like a NULL pointer dereference - I see that V_if_index comes from that vnet instance's value and uses the macro VNET_VNET_PTR() down the chain. Since the call chain is coming from a new thread, cbb_event_thread, I believe that this thread's vnet context needs to be set using CURVNET_SET().

I'll try this tomorrow, but if you think I'm not on the right track or want me to try something else, please let me know.

Many thanks,
Harsha
Re: dummynet dropping too many packets
Peter Jeremy wrote:
> On 2009-Oct-04 18:47:23 +0500, rihad wrote:
>> Hi, we have around 500-600 mbit/s traffic flowing through a 7.1R Dell PowerEdge w/ 2 GigE bce cards. There are currently around 4 thousand ISP users online limited by dummynet pipes of various speeds. According to netstat -s output around 500-1000 packets are being dropped every second (this accounts for wasting around 7-12 mbit/s worth of traffic according to systat -ifstat):
>
> This has been a most interesting thread. A couple of comments:
>
> Traffic shaping only works cleanly on TCP flows - UDP has no feedback mechanism and so will not automatically throttle to fit into the available bandwidth, potentially leading to high packet drops within dummynet. Is it possible that some of your customers are heavily using UDP? Have you tried allowing just UDP traffic to bypass the pipes to see if this has any effect on drop rate?

We only process inbound traffic, and anyway this problem couldn't be related, because net.inet.ip.dummynet.io_pkt_drop normally doesn't keep pace with netstat -s's "output packets dropped" (e.g. right now the former is only 1048, while the latter is as much as 1272587).

> The pipe lists you posted showed that virtually all the packet drops are associated with one or two IP addresses. If this is really true,

Not really. There were only a few hundred of the several thousand online users in the list. Besides, those drops are within sane limits (as reported by the io_pkt_drop sysctl); it's netstat -s's output packet drops that matter.

> Also, if you monitor the pipe lists following a cold start, do those addresses appear early and just not show any packet loss until the total number of users builds up or do they not appear until later and immediately show packet loss?

io_pkt_drop may rise at certain well-defined moments, like when turning dummynet on (by deleting the "allow ip from any to any" line before the pipes), and it may rise for certain heavy downloaders, but the value is normally negligible.

> Looking at how 'output packets dropped due to no bufs, etc.' is counted (ipstat.ips_odropped), if you run 'netstat -id', do you see a large number of drops on bce1 (consistent with the "output packets dropped" counts) or not? This will help narrow down the codepath being followed by dropped packets.

netstat -id: yup, it's comparable:

bce0 1500 00:1d:09:2a:06:7f 5518562854 0 14327023 0 0 0
bce1 1500 00:1d:09:xx:xx:xx 144918 0 5498628928 0 0 1135438

netstat -s:

1272587 output packets dropped due to no bufs, etc.

> Since the problem only appears to manifest when table(0) exceeds 2000 entries, have you considered splitting (at least temporarily) that table (and possibly table(2)) into two (eg table(0) and table(4))? This would help rule out an (unlikely) problem with table sizes. Doing so would require the application to split the users across both tables (eg round-robin or based on one of the bits in the IP address) and then duplicating the relevant ipfw rules - eg:
>
> 01060 pipe tablearg ip from any to table(0) out recv bce0 xmit bce1
> 01061 pipe tablearg ip from any to table(4) out recv bce0 xmit bce1
> 01070 allow ip from any to table(0) out recv bce0 xmit bce1
> 01071 allow ip from any to table(4) out recv bce0 xmit bce1

Around 3000 now (and around 480-500 mbps), as I've set the queue length in bce to 1024 and rebuilt the kernel. I'm going to increase that a bit again. I really think it's the dummynet burstiness, not table size per se, that results in the drops, and the amount of burstiness depends on the number of "online" users. A command as simple as "ipfw table 0 flush" stops all drops instantly, while still letting that traffic pass through as is (thank God). It's quite easy for me to simulate the split in two with some shell scripting, without touching any code, but I don't think it's the table sizes. I'll try that in case increasing the bce maxlen value doesn't help, though, so thank you.

> (And I agree that re-arranging rules to reduce the number of repeated tests should improve ipfw efficiency.)
>
> The symptoms keep making me think "lock contention" - but I'm not sure how to measure that cheaply (AFAIK, LOCK_PROFILING is comparatively expensive).
>
> Finally, are you running i386 or amd64?
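A sketch of the shell-only split mentioned above, assuming "ipfw table 0 list" prints one "address value" pair per line and that the table 4 pipe/allow rules from Peter's example are already in place:

    # move every other entry of table 0 into table 4 on the running box
    i=0
    ipfw table 0 list | while read addr arg; do
        i=$((i + 1))
        [ $((i % 2)) -eq 0 ] || continue
        ipfw table 4 add "$addr" "$arg"
        ipfw table 0 delete "$addr"
    done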
Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]
Harsha wrote:
> Hi Robert, Apologies for not getting back to this earlier.
>
> On Mon, Oct 12, 2009 at 6:46 AM, Robert N. M. Watson wrote:
>> Looks like a NULL pointer dereference, so perhaps a more traditional bug -- could you convert ifindex_alloc_locked+0x71 to a line of code? You can do this using kgdb on the kernel symbols file, perhaps "l *ifindex_alloc_locked+0x71".
>
> It is the for loop in the ifindex_alloc_locked() function:
>
> for (idx = 1; idx <= V_if_index; idx++)
>
> idx is a local variable, so I figured V_if_index is what is causing the page fault. It does look like a NULL pointer dereference - I see that V_if_index comes from that vnet instance's value and uses the macro VNET_VNET_PTR() down the chain. Since the call chain is coming from a new thread, cbb_event_thread, I believe that this thread's vnet context needs to be set using CURVNET_SET().

But only if you have options VIMAGE defined; if not, then CURVNET_SET() is a NOP.

> I'll try this tomorrow, but if you think I'm not on the right track or want me to try something else, please let me know.
>
> Many thanks,
> Harsha