Re: dummynet dropping too many packets

2009-10-17 Thread Julian Elischer

rihad wrote:



The change definitely helped! There are now more than 3200 users online, 
460-500 mbps net traffic load, and normally 10-60 (up to 150 once or 
twice) consistent drops per second, as opposed to several hundred, up to 
1000-1500, packets dropped per second before the rebuild. What's 
interesting is that the drops now began only after the ipfw table had 
around 3000 entries, not 2000 like before. Just how high can maxlen be? 
Should I try 2048? 4096?


Is HZ still 4000?


Re: dummynet dropping too many packets

2009-10-17 Thread Robert Watson


On Sat, 17 Oct 2009, rihad wrote:

P.S.: BTW, there's a small admin-type inconsistency in FreeBSD 7.1: 
/etc/rc.firewall gets executed before the values set in /etc/sysctl.conf 
take effect, so "queue 2000" isn't allowed in ipfw pipe rules 
(net.inet.ip.dummynet.pipe_slot_limit is only 100 by default), and the 
rules fail silently without any trace in the log files - I only saw the 
errors at the console.


This is awkward to fix for sysctls, because the firewall module may not be 
loaded until the firewall stage of the boot process, so the sysctl wouldn't 
take effect (and perhaps this is what you're seeing, in fact?).


Some sysctls have associated loader tunables, which you can set in 
/boot/loader.conf (and affect configuration when the module is loaded), but it 
looks like that isn't true for net.inet.ip.dummynet.pipe_slot_limit.
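
For reference, a minimal sketch of how loader tunables are set (kern.hz and 
kern.ipc.nmbclusters are long-standing tunables; whether any given sysctl has 
a loader-tunable twin has to be checked per variable):

# /boot/loader.conf -- read by the loader, before any rc script runs
kern.hz="2000"                  # timer frequency; a classic loader tunable
kern.ipc.nmbclusters="65536"    # mbuf cluster limit; also settable here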


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: dummynet dropping too many packets

2009-10-17 Thread Robert Watson


On Sat, 17 Oct 2009, rihad wrote:

Just rebooted with the "ifp->if_snd.ifq_drv_maxlen = 1024;" kernel, all ok 
so far. There's currently only 1000 or so entries in the ipfw table and 
around 350-400 net mbps load, so I'll wait a few hours for the numbers to 
grow to >2000 and 460-480 respectively and see if the drops still occur.


The change definitely helped! There are now more than 3200 users online, 
460-500 mbps net traffic load, and normally 10-60 (up to 150 once or twice) 
consistent drops per second, as opposed to several hundred, up to 1000-1500, 
packets dropped per second before the rebuild. What's interesting is that 
the drops now began only after the ipfw table had around 3000 entries, not 
2000 like before. Just how high can maxlen be? Should I try 2048? 4096?


Sure, those should both be safe to use in your configuration, although as the 
numbers get higher, potential kernel memory use increases, as does the risk of 
cluster starvation.  Keep an eye on "netstat -m" errors to see if you are 
reaching configured resource limits (which you've probably increased already).
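
A quick way to watch for that (a sketch; the exact counter wording varies 
slightly between releases):

# non-zero "denied" counts mean allocations failed against the limits
netstat -m | grep denied
# the cluster limit itself (also settable as a loader tunable)
sysctl kern.ipc.nmbclusters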


Robert


Re: kern/137776: [rum] panic in rum(4) driver on 8.0-BETA2

2009-10-17 Thread O.Herold
The following reply was made to PR kern/137776; it has been noted by GNATS.

From: "O.Herold" 
To: bug-follo...@freebsd.org, f...@freebsd.org
Cc:  
Subject: Re: kern/137776: [rum] panic in rum(4) driver on 8.0-BETA2
Date: Sat, 17 Oct 2009 11:38:35 +0200

 Hi,
 
 there is a fix for this kind of bug. I tried it myself (FreeBSD 8.0 RC1)
 and it works like a charm. I had a stable connection without any panic
 for several hours (the first time since using the if_rum driver in
 FreeBSD; see the PRs) while downloading and installing different
 packages on a new system.
 
 http://lists.freebsd.org/pipermail/freebsd-current/2009-October/012659.html
 
 It would be nice to see this fix in stable; I think it's too late for
 the release.
 
 Cheers, Oliver Herold
 
 -- 
 F!XMBR: http://www.fixmbr.de


Re: dummynet dropping too many packets

2009-10-17 Thread rihad

Julian Elischer wrote:

rihad wrote:



The change definitely helped! There are now more than 3200 users 
online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 
once or twice) consistent drops per second, as opposed to several 
hundred, up to 1000-1500, packets dropped per second before the rebuild. 
What's interesting is that the drops now began only after the ipfw 
table had around 3000 entries, not 2000 like before. Just how high can 
maxlen be? Should I try 2048? 4096?


Is HZ still 4000?

No, I've set it to 2000 as per the recommendations for HZ in NOTES. Should 
I try 4000? 6000? 8000? Or maybe just increase the bce queue length and 
rebuild? :)
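
For context, the knob being discussed is set in the kernel configuration 
file; a sketch (the values are the ones tried in this thread, not 
recommendations):

options HZ=4000    # dummynet drains its pipes once per clock tick, so a
                   # higher HZ means smaller, smoother bursts per tick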




Re: dummynet dropping too many packets

2009-10-17 Thread rihad

rihad wrote:
Just rebooted with the "ifp->if_snd.ifq_drv_maxlen = 1024;" kernel, all 
ok so far. There's currently only 1000 or so entries in the ipfw table 
and around 350-400 net mbps load, so I'll wait a few hours for the 
numbers to grow to >2000 and 460-480 respectively and see if the drops 
still occur.




I'm not sure of anything now... It's 7 p.m. here, and during this busy 
time of day in terms of network use there are 350-500, up to 600, drops 
per second at around 530-550 mbps net load. This is roughly equivalent 
to 2-7 mbps dropped on output. It might be better than before. The next 
thing I'll try is raising the bce queue maxlen from 1024 to 2048, and 
HZ from 2000 back to 4000.



Re: dummynet dropping too many packets

2009-10-17 Thread rihad

Robert Watson wrote:


On Sat, 17 Oct 2009, rihad wrote:

P.S.: BTW, there's a small admin-type inconsistency in FreeBSD 7.1: 
/etc/rc.firewall gets executed before the values set in /etc/sysctl.conf 
take effect, so "queue 2000" isn't allowed in ipfw pipe rules 
(net.inet.ip.dummynet.pipe_slot_limit is only 100 by default), and the 
rules fail silently without any trace in the log files - I only 
saw the errors at the console.


This is awkward to fix for sysctls, because the firewall module may not 
be loaded until the firewall stage of the boot process, so the sysctl 
wouldn't take effect (and perhaps this is what you're seeing, in fact?).


Well, my kernel is built with IPFIREWALL enabled, so the ipfw module is 
unneeded and doesn't get loaded automatically. I still think it's the 
order of execution that matters.
For now I've worked around the problem by setting the sysctls explicitly 
in /etc/rc.firewall right before configuring the pipes:

/sbin/sysctl net.inet.ip.dummynet.hash_size=512
/sbin/sysctl net.inet.ip.dummynet.pipe_slot_limit=2000

and commenting them out in /etc/sysctl.conf with an XXX.

Now I see that this is also the reason why setting 
net.inet.ip.dummynet.hash_size in sysctl.conf had no effect on the hash 
table size at the time the pipes were created.


Some sysctls have associated loader tunables, which you can set in 
/boot/loader.conf (and affect configuration when the module is loaded), 
but it looks like that isn't true for net.inet.ip.dummynet.pipe_slot_limit.


Robert N M Watson
Computer Laboratory
University of Cambridge






Re: dummynet dropping too many packets

2009-10-17 Thread Peter Jeremy
On 2009-Oct-04 18:47:23 +0500, rihad  wrote:
>Hi, we have around 500-600 mbit/s traffic flowing through a 7.1R Dell 
>PowerEdge w/ 2 GigE bce cards. There are currently around 4 thousand ISP 
>users online limited by dummynet pipes of various speeds. According to 
>netstat -s output around 500-1000 packets are being dropped every second 
>(this accounts for wasting around 7-12 mbit/s worth of traffic according 
>to systat -ifstat):

This has been a most interesting thread.  A couple of comments:

Traffic shaping only works cleanly on TCP flows - UDP has no feedback
mechanism and so will not automatically throttle to fit into the
available bandwidth, potentially leading to high packet drops within
dummynet.  Is it possible that some of your customers are heavily
using UDP?  Have you tried allowing just UDP traffic to bypass the
pipes to see if this has any effect on drop rate?
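
One way to test that suggestion (a sketch; the rule number and interfaces 
are taken from the rules quoted later in this thread, and should be adapted):

# let UDP bypass the pipes entirely, ahead of the pipe rules
ipfw add 01050 allow udp from any to any out recv bce0 xmit bce1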

The pipe lists you posted showed that virtually all the packet drops
are associated with one or two IP addresses.  If this is really true,
rather than a measurement artifact, you might find it useful to
tcpdump those addresses and see if there's anything unusual in the
data being passed.  Also, if you monitor the pipe lists following a
cold start, do those addresses appear early and just not show any
packet loss until the total number of users builds up or do they not
appear until later and immediately show packet loss?

Looking at how 'output packets dropped due to no bufs, etc.' is
counted (ipstat.ips_odropped), if you run 'netstat -id', do you see a
large number of drops on bce1 (consistent with the "output packets
dropped" counts) or not?  This will help narrow down the codepath
being followed by dropped packets.

Since the problem only appears to manifest when table(0) exceeds 2000
entries, have you considered splitting (at least temporarily) that
table (and possibly table(2)) into two (eg table(0) and table(4))?
This would help rule out an (unlikely) problem with table sizes.
Doing so would require the application to split the users across both
tables (eg round-robin or based on one of the bits in the IP address)
and then duplicating the relevant ipfw rules - eg:

01060 pipe tablearg ip from any to table(0) out recv bce0 xmit bce1
01061 pipe tablearg ip from any to table(4) out recv bce0 xmit bce1
01070 allow ip from any to table(0) out recv bce0 xmit bce1
01071 allow ip from any to table(4) out recv bce0 xmit bce1
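
A rough sh sketch of the round-robin split (assuming /32 entries with a 
numeric pipe number as the tablearg; untested):

ipfw table 0 list | while read addr arg; do
    last=${addr##*.}; last=${last%%/*}      # last octet of the address
    if [ $((last % 2)) -eq 0 ]; then        # even hosts move to table 4
        ipfw table 4 add $addr $arg
        ipfw table 0 delete $addr
    fi
done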

(And I agree that re-arranging rules to reduce the number of repeated
tests should improve ipfw efficiency).

The symptoms keep making me think "lock contention" - but I'm not sure
how to measure that cheaply (AFAIK, LOCK_PROFILING is comparatively
expensive).
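
For anyone who wants to try it anyway, a rough sketch of the workflow 
(sysctl names as of 7.x/8.x; worth double-checking on your release):

# kernel config
options LOCK_PROFILING
# at runtime: enable, run the workload for a while, disable, inspect
sysctl debug.lock.prof.enable=1
sleep 60
sysctl debug.lock.prof.enable=0
sysctl debug.lock.prof.stats | head -40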

Finally, are you running i386 or amd64?

-- 
Peter Jeremy




Re: dummynet dropping too many packets

2009-10-17 Thread Julian Elischer

rihad wrote:

Julian Elischer wrote:

rihad wrote:



The change definitely helped! There are now more than 3200 users 
online, 460-500 mbps net traffic load, and normally 10-60 (up to 150 
once or twice) consistent drops per second, as opposed to several 
hundred, up to 1000-1500, packets dropped per second before the 
rebuild. What's interesting is that the drops now began only after 
the ipfw table had around 3000 entries, not 2000 like before. Just 
how high can maxlen be? Should I try 2048? 4096?


Is HZ still 4000?

No, I've set it to 2000 as per the recommendations for HZ in NOTES. Should 
I try 4000? 6000? 8000? Or maybe just increase the bce queue length and 
rebuild? :)


You could try combinations.





Re: Native support for AutoIP (aka LLA, RFC 3927).

2009-10-17 Thread David Horn
On Fri, Oct 16, 2009 at 4:38 PM, Martin Garon  wrote:
> Hi,
>
>
>
> I need to implement AutoIP in my embedded FW that uses a snapshot of the
> FreeBSD 4.4 network stack.
>
>
>
> I could not find any support for it in the latest development cvs tree. Any
> chance it is somewhere that I missed?
>
>
>
> If there is no support, could anyone suggest a good approach to this? I am
> thinking of porting libpcap in order to access the data link layer to
> intercept/inject some ARP packets.
>
>
>
> All comments welcomed,
>

Check out the Avahi implementation of IPv4 Link-Local (RFC 3927).  It is in
ports under net/avahi-autoipd.
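
A minimal usage sketch, assuming the port is installed and the interface 
is em0:

# claim a 169.254/16 link-local address on em0 and go into the background
avahi-autoipd -D em0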

Good Luck

-_Dave H


Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]

2009-10-17 Thread Harsha
Hi Robert,

Apologies for not getting back to you earlier.

On Mon, Oct 12, 2009 at 6:46 AM, Robert N. M. Watson
 wrote:
>
> Looks like a NULL pointer dereference, so perhaps a more traditional bug --
> could you convert ifindex_alloc_locked+0x71 to a line of code? You can do
> this using kgdb on the kernel symbols file, perhaps "l
> *ifindex_alloc_locked+0x71".
It is the for loop in the ifindex_alloc_locked() function:
 for (idx = 1; idx <= V_if_index; idx++)

idx is a local variable, so I figured it is V_if_index that is
causing the page fault. It does look like a NULL pointer dereference - I
see that V_if_index comes from that vnet instance's value and uses
the macro VNET_VNET_PTR() down the chain. Since the call chain is
coming from a new thread, cbb_event_thread, I believe that this
thread's vnet context needs to be set using CURVNET_SET().
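
Roughly the pattern being proposed - a sketch, with the caveat (raised in a 
follow-up below) that CURVNET_SET() is a no-op unless the kernel has 
options VIMAGE:

/* in the kthread body, before touching V_* virtualized globals */
CURVNET_SET(vnet0);    /* assumption: the default vnet is the right one */
/* ... code that eventually reaches ifindex_alloc_locked() ... */
CURVNET_RESTORE();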

I'll try this tomorrow, but if you think I'm not on the right track or
want me to try something else please let me know.

Many thanks,
Harsha


Re: dummynet dropping too many packets

2009-10-17 Thread rihad

Peter Jeremy wrote:

On 2009-Oct-04 18:47:23 +0500, rihad  wrote:

Hi, we have around 500-600 mbit/s traffic flowing through a 7.1R
Dell PowerEdge w/ 2 GigE bce cards. There are currently around 4
thousand ISP users online limited by dummynet pipes of various
speeds. According to netstat -s output around 500-1000 packets are
being dropped every second (this accounts for wasting around 7-12
mbit/s worth of traffic according to systat -ifstat):


This has been a most interesting thread.  A couple of comments:

Traffic shaping only works cleanly on TCP flows - UDP has no feedback
mechanism and so will not automatically throttle to fit into the
available bandwidth, potentially leading to high packet drops within
dummynet.  Is it possible that some of your customers are heavily
using UDP?  Have you tried allowing just UDP traffic to bypass the
pipes to see if this has any effect on drop rate?

We only process inbound traffic, and anyway this problem couldn't be
related, because net.inet.ip.dummynet.io_pkt_drop normally doesn't keep
pace with netstat -s's "output packets dropped" counter (e.g. right now
the former is only 1048, while the latter is as much as 1272587).
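
A simple way to compare the two counters side by side (a sketch):

# sample both drop counters once a second
while sleep 1; do
    sysctl -n net.inet.ip.dummynet.io_pkt_drop
    netstat -s -p ip | grep 'output packets dropped'
done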


The pipe lists you posted showed that virtually all the packet drops 
are associated with one or two IP addresses.  If this is really true,

Not really. There were only a few hundred of the several thousand online 
users in the list. Besides, those drops are within sane limits (as 
measured by the io_pkt_drop sysctl); it's netstat -s's output packet 
drops that matter.


Also, if you monitor the pipe lists following a 
cold start, do those addresses appear early and just not show any 
packet loss until the total number of users builds up or do they not 
appear until later and immediately show packet loss?


io_pkt_drop may rise at certain well-defined periods, like when turning 
dummynet on (by deleting the "allow ip from any to any" line before the 
pipes), and it may rise for certain heavy downloaders, but the value is 
normally negligible.


Looking at how 'output packets dropped due to no bufs, etc.' is 
counted (ipstat.ips_odropped), if you run 'netstat -id', do you see a 
large number of drops on bce1 (consistent with the "output packets 
dropped" counts) or not?  This will help narrow down the codepath 
being followed by dropped packets.

netstat -id:
Yup, it's comparable:

Name  Mtu   Address            Ipkts       Ierrs  Opkts       Oerrs  Coll  Drop
bce0  1500  00:1d:09:2a:06:7f  5518562854  0      14327023    0      0     0
bce1  1500  00:1d:09:xx:xx:xx  144918      0      5498628928  0      0     1135438

netstat -s:
1272587 output packets dropped due to no bufs, etc.



Since the problem only appears to manifest when table(0) exceeds 2000
entries, have you considered splitting (at least temporarily) that
table (and possibly table(2)) into two (eg table(0) and table(4))?
This would help rule out an (unlikely) problem with table sizes.
Doing so would require the application to split the users across both
tables (eg round-robin or based on one of the bits in the IP address)
and then duplicating the relevant ipfw rules - eg:

01060 pipe tablearg ip from any to table(0) out recv bce0 xmit bce1
01061 pipe tablearg ip from any to table(4) out recv bce0 xmit bce1
01070 allow ip from any to table(0) out recv bce0 xmit bce1
01071 allow ip from any to table(4) out recv bce0 xmit bce1



Around 3000 entries now (and around 480-500 mbps), as I've set the queue 
length in bce to 1024 and rebuilt the kernel. I'm going to increase that 
a bit again. I really think it's dummynet burstiness, not table size per 
se, that results in the drops, and the amount of burstiness depends on 
the number of "online" users. A command as simple as "ipfw table 0 
flush" stops all drops instantly, while still allowing that traffic to 
pass through as is (thank God). It's quite easy for me to simulate the 
split in two with some shell scripting, without touching any code, but I 
don't think it's the table sizes. I'll try that if increasing the bce 
maxlen value doesn't help, though, so thank you.



(And I agree that re-arranging rules to reduce the number of repeated
tests should improve ipfw efficiency).

The symptoms keep making me think "lock contention" - but I'm not
sure how to measure that cheaply (AFAIK, LOCK_PROFILING is
comparatively expensive).

Finally, are you running i386 or amd64?





Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]

2009-10-17 Thread Julian Elischer

Harsha wrote:

Hi Robert,

Apologies for not getting back to you earlier.

On Mon, Oct 12, 2009 at 6:46 AM, Robert N. M. Watson
 wrote:

Looks like a NULL pointer dereference, so perhaps a more traditional bug --
could you convert ifindex_alloc_locked+0x71 to a line of code? You can do
this using kgdb on the kernel symbols file, perhaps "l
*ifindex_alloc_locked+0x71".

It is the for loop in the ifindex_alloc_locked() function:
 for (idx = 1; idx <= V_if_index; idx++)

idx is a local variable, so I figured it is V_if_index that is
causing the page fault. It does look like a NULL pointer dereference - I
see that V_if_index comes from that vnet instance's value and uses
the macro VNET_VNET_PTR() down the chain. Since the call chain is
coming from a new thread, cbb_event_thread, I believe that this
thread's vnet context needs to be set using CURVNET_SET().


But only if you have options VIMAGE defined; if not, CURVNET_SET()
is a no-op.




I'll try this tomorrow, but if you think I'm not on the right track or
want me to try something else please let me know.

Many thanks,
Harsha

