Re: bridge and stp defaults
On Dec 14, 2007 1:24 AM, Niki Denev <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Is there a reason that when adding member ports to a bridge stp is not
> enabled by default on them?
> Wouldn't it be more intuitive to be enabled by default these days?

There are several reasons not to enable STP on a bridge port unless
you're absolutely aware of what's happening here:

http://unilans.net/phrack/61/p61-0x0c_Fun_with_Spanning_Tree_Protocol.txt

> Regards,
> Niki

--
"UNIX is basically a simple operating system, but you have to be a
genius to understand the simplicity."  Dennis Ritchie
WOL support in Broadcom 5721 (57XX)
Hi All,

Just a check whether WOL is supported in the Broadcom drivers.  Sorry in
case this does not interest you.

I was just checking whether we have WOL support in the Broadcom drivers.
I had a look at the current source and could not find the support.  Is
this in the list of todo's??  Can this feature not be supported due to
design issues?  Is somebody trying this out somewhere?

Please do copy me on the reply as I am not subscribed to the list.

--
Thanks and Best Regards,
KK
Re: Deadlock in the routing code
Julian Elischer wrote:
> Gleb Smirnoff wrote:
> >On Thu, Dec 13, 2007 at 10:33:25AM -0800, Julian Elischer wrote:
> >J> Maxime Henrion wrote:
> >J> > Replying to myself on this one, sorry about that.
> >J> > I said in my previous mail that I didn't know yet what process was
> >J> > holding the lock of the rtentry that the routed process is dealing
> >J> > with in rt_setgate(), and I just could verify that it is held by
> >J> > the swi1: net thread.
> >J> > So, in a nutshell:
> >J> > - The routed process does its business on the routing socket, that
> >J> >   ends up calling rt_setgate().  While in rt_setgate() it drops the
> >J> >   lock on its rtentry in order to call rtalloc1().  At this point,
> >J> >   the routed process holds the gateway route (rtalloc1() returns it
> >J> >   locked), and it now tries to re-lock the original rtentry.
> >J> > - At the same time, the swi net thread calls arpresolve() which ends
> >J> >   up calling rt_check().  Then rt_check() locks the rtentry, and
> >J> >   tries to lock the gateway route.
> >J> > A classical case of deadlock with mutexes because of different
> >J> > locking order.  Now, it's not obvious to me how to fix it :-).
> >J>
> >J> On failure to re-lock, the routed call to rt_setgate should completely
> >J> abort and restart from scratch, releasing all locks it has on the way
> >J> out.
> >
> >Do you suggest mtx_trylock?
>
> I think that would be the cleanest way..

So, here's what I've got.  I have yet to test it at all, I hope that I'll
be able to do so today, or tomorrow.  Any input appreciated.

Cheers,
Maxime

diff -Nru /sys/net/route.c net/route.c
--- /sys/net/route.c	Tue Oct 30 19:07:54 2007
+++ net/route.c	Mon Dec 17 11:05:56 2007
@@ -996,6 +996,7 @@
 	struct radix_node_head *rnh = rt_tables[dst->sa_family];
 	int dlen = SA_SIZE(dst), glen = SA_SIZE(gate);
 
+again:
 	RT_LOCK_ASSERT(rt);
 
 	/*
@@ -1029,7 +1030,16 @@
 			RT_REMREF(rt);
 			return (EADDRINUSE); /* failure */
 		}
-		RT_LOCK(rt);
+		/*
+		 * Try to reacquire the lock on rt, and if it fails,
+		 * clean state and restart from scratch.
+		 */
+		ok = RT_TRYLOCK(rt);
+		if (!ok) {
+			RTFREE_LOCKED(gwrt);
+			RT_LOCK(rt);
+			goto again;
+		}
 		/*
 		 * If there is already a gwroute, then drop it. If we
 		 * are asked to replace route with itself, then do
diff -Nru /sys/net/route.h net/route.h
--- /sys/net/route.h	Tue Apr  4 22:07:23 2006
+++ net/route.h	Fri Dec 14 11:47:48 2007
@@ -289,6 +289,7 @@
 #define	RT_LOCK_INIT(_rt) \
 	mtx_init(&(_rt)->rt_mtx, "rtentry", NULL, MTX_DEF | MTX_DUPOK)
 #define	RT_LOCK(_rt)		mtx_lock(&(_rt)->rt_mtx)
+#define	RT_TRYLOCK(_rt)		mtx_trylock(&(_rt)->rt_mtx)
 #define	RT_UNLOCK(_rt)		mtx_unlock(&(_rt)->rt_mtx)
 #define	RT_LOCK_DESTROY(_rt)	mtx_destroy(&(_rt)->rt_mtx)
 #define	RT_LOCK_ASSERT(_rt)	mtx_assert(&(_rt)->rt_mtx, MA_OWNED)
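For readers skimming the thread, the lock ordering described above can be
condensed into a timeline like this (an illustrative sketch of one
interleaving, not the actual route.c code paths):

    routed, in rt_setgate()                   swi1: net, in rt_check()
    -----------------------                   ------------------------
    RT_LOCK(rt)
    RT_UNLOCK(rt)                             RT_LOCK(rt)
    gwrt = rtalloc1(gate, ...)  /* locked */
    RT_LOCK(rt)       <- blocks on swi        RT_LOCK(gwrt)  <- blocks on routed

The patch breaks the cycle by turning the second RT_LOCK(rt) into
RT_TRYLOCK(rt): if the lock cannot be taken immediately, gwrt is released
and the whole operation restarts at the "again" label, so neither thread
ends up sleeping while holding a lock the other one needs.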
Re: Added native socks support to libc in FreeBSD 7
John E Hein wrote:
> Raffaele De Lorenzo wrote at 14:39 +0100 on Dec 10, 2007:
>  > You can see in the port-tree my project "csocks" and
>  > http://csocks.altervista.org.
>
> Thanks for letting us know about your project.  Here are just a few
> comments.
>
> Why don't you provide the source code in the port?  For an open source,
> security sensitive project such as this, I think that's important for
> users to gain confidence in it.
>
> As far as putting the code in the base FreeBSD, that's a pretty large
> hurdle.  The FreeBSD maintainers tend to put something in base only
> after a significant part of the user base uses it, and it has become
> the [or a] de facto preferred implementation of some industry standard.
> SOCKS is a standard, but the csocks implementation is not (yet).
> Continue to adhere to RFCs and grow your user base, and perhaps
> inclusion in FreeBSD's base system will happen organically.
>
> For things to go into the base system ...
>
> 1) The software (and its developers) need a proven track record (which
>    you can gain by getting a large user base in ports).  Personally, I
>    hadn't heard about your SOCKS implementation until this week.
>
> 2) A significant number of FreeBSD users can't do without it.  Now, this
>    is quite subjective.  In some sense, people can't do without a web
>    browser in this day and age, but there's no browser in the FreeBSD
>    base system.  Of course, comparing firefox to csocks is not fair.
>    Maybe grep is a better comparison.  Web browsers are monstrous.
>
> 3) There is a significant benefit to having it tightly integrated with
>    the base system (as opposed to a more loose integration in the ports
>    tree).  Wireless LAN is perhaps a good example here (and for #2 for
>    that matter).  Not everyone needs it, but when you do it is good to
>    have it in the base system where it is given system level
>    architecture love and care.
>
> 4) You need someone with commit privs to shepherd this thing along
>    _and_ agreement from lots of other people (including FreeBSD's core).
>    Hint: the freebsd-arch list is often a good place to discuss
>    additions to the FreeBSD base.
>
> 5) Lots of other criteria (both implied and explicitly documented) that
>    I'll not go into further (everyone together: "Hear, Hear").
>
> Note that the larger the base system becomes, the harder it is to
> maintain it well as a core, well integrated body of work.  And once it
> is in the base, more people are now automatically signed on to maintain
> it (indirectly)...  not just you anymore.  When someone makes a change
> to the base tcp implementation, for instance, they have to make sure it
> also doesn't break the shiny new socks code now in the base system as
> well.  This probably won't be a significant burden in this particular
> case, but it's something that people have to consider.
>
> As far as your specific patch to add socks support to libc ...  Why not
> just make a patch that puts it in src/lib/libsocks?  And a binary in
> src/usr.bin/csocks (that does the LD_PRELOAD dance to preload libsocks)?
> Why does it have to be in libc?
>
> I don't speak for the FreeBSD project, but that's a few of my thoughts
> after looking at your implementation...  which I did since it tickled
> my curiosity.  Keep up the good work.

Hi,

many thanks for your interest.  Socks is a protocol used (in my
experience) a lot in some banks for security reasons, so it has a large
impact on network security.  Recent versions of IBM's AIX OS introduced
native socks support.
The IBM socks implementation is inside the AIX libc (AIX 4 already has a
socks5 library in libc.a), so there are no external socks libraries to
preload; to socksify an application you just insert a socks rule in a
particular configuration file (the default is "/etc/socks5c.conf").  The
AIX native socks mode is much appreciated by users, so my idea to add
native socks support inside libc in FreeBSD (which I think is a very
good, secure OS!) is motivated by these considerations.

This is a comparative table, "AIX SOCKS" vs. "CSOCKS":

The IBM AIX socks implementation:

1) doesn't support Socks V4
2) doesn't support GSS-API Authentication
3) supports IPv6
4) doesn't support Socks V5 User Authentication
5) doesn't support Socks under UDP
6) supports simple Socks V5 connect and bind
7) the configuration file doesn't support detailed rules (you cannot
   specify the port and the protocol to socksify... for details see
   http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/a_doc_lib/files/aixfiles/socks5c.conf.htm)

The CSOCKS socks implementation:

1) supports Socks V4 Connect and Bind
2) supports Socks V5 Connect and Bind
3) supports the simple Socks V5 User Authentication method
4) supports Socks V5 under UDP
5) the configuration file supports detailed rules (see:
   http://csocks.altervista.org/doc.htm)
6) doesn't support IPv6 (under development)
7) doesn't support GSS-API Authentication (under development)
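The "LD_PRELOAD dance" John suggests as an alternative to patching libc
is worth illustrating.  The sketch below is not csocks code; it only
shows the generic interposition pattern a hypothetical libsocks.so could
use to wrap connect(2), with the policy check stubbed out so the example
is self-contained:

#include <dlfcn.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <stddef.h>

/*
 * Hypothetical policy check -- a real libsocks would parse its rules
 * file here and decide whether this destination must go through the
 * SOCKS server.  Stubbed to "no" so the sketch builds standalone.
 */
static int
socks_wanted(const struct sockaddr *name)
{
	(void)name;
	return (0);
}

/*
 * Interposed connect(2).  When the shared object is preloaded with
 * LD_PRELOAD, the dynamic linker resolves connect() here first;
 * RTLD_NEXT finds the real libc version.
 */
int
connect(int s, const struct sockaddr *name, socklen_t namelen)
{
	static int (*real_connect)(int, const struct sockaddr *, socklen_t);

	if (real_connect == NULL)
		real_connect = (int (*)(int, const struct sockaddr *,
		    socklen_t))dlsym(RTLD_NEXT, "connect");

	if (socks_wanted(name)) {
		/* A real implementation would run the SOCKS handshake
		 * against the proxy here instead of connecting directly. */
	}

	return (real_connect(s, name, namelen));
}

One would build this as a shared object (cc -shared -fPIC -o libsocks.so
socks.c) and run a program with LD_PRELOAD=./libsocks.so; libc itself
stays untouched, which is essentially John's point.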
Re: Packet loss every 30.999 seconds
One more comment on my last email...  The patch that I included is not
meant as a real fix - it is just a bandaid.  The real problem appears to
be that a very large number of vnodes (all of them?) are getting synced
(i.e. calling ffs_syncvnode()) every time.  This should normally only
happen for dirty vnodes.  I suspect that something is broken with this
check:

		if (vp->v_type == VNON || ((ip->i_flag &
		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
		    vp->v_bufobj.bo_dirty.bv_cnt == 0)) {
			VI_UNLOCK(vp);
			continue;
		}

...like the i_flag flags aren't ever getting properly cleared (or bv_cnt
is always non-zero).  ...but I don't have the time to chase this down.

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.
Re: Packet loss every 30.999 seconds
> While trying to diagnose a packet loss problem in a RELENG_6 snapshot
> dated November 8, 2007 it looks like I've stumbled across a broken
> driver or kernel routine which stops interrupt processing long enough
> to severely degrade network performance every 30.99 seconds.

I noticed this as well some time ago.  The problem has to do with the
processing (syncing) of vnodes.  When the total number of allocated
vnodes in the system grows to tens of thousands, the ~31 second periodic
sync process takes a long time to run.  Try this patch and let people
know if it helps your problem.  It will periodically wait for one tick
(1ms) every 500 vnodes of processing, which will allow other things to
run.

Index: ufs/ffs/ffs_vfsops.c
===================================================================
RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.290.2.16
diff -c -r1.290.2.16 ffs_vfsops.c
*** ufs/ffs/ffs_vfsops.c	9 Oct 2006 19:47:17 -0000	1.290.2.16
--- ufs/ffs/ffs_vfsops.c	25 Apr 2007 01:58:15 -0000
***************
*** 1109,1114 ****
--- 1109,1115 ----
  	int softdep_deps;
  	int softdep_accdeps;
  	struct bufobj *bo;
+ 	int flushed_count = 0;
  
  	fs = ump->um_fs;
  	if (fs->fs_fmod != 0 && fs->fs_ronly != 0) {	/* XXX */
***************
*** 1174,1179 ****
--- 1175,1184 ----
  			allerror = error;
  		vput(vp);
  		MNT_ILOCK(mp);
+ 		if (flushed_count++ > 500) {
+ 			flushed_count = 0;
+ 			msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1);
+ 		}
  	}
  	MNT_IUNLOCK(mp);
  	/*

-DG

David G. Lawrence
President
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.
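To put rough numbers on what the band-aid does (the figures here are
illustrative, not from DG's measurements): with 100,000 resident vnodes
the loop above yields 100,000 / 500 = 200 times per sync pass, and at
HZ=1000 each msleep() gives up the CPU for at least 1 ms, so the pass
takes on the order of 200 ms longer in wall-clock time.  The same amount
of vnode scanning still happens, but it no longer runs as one long
uninterrupted burst, so other threads get a chance to run in between.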
Current problem reports assigned to freebsd-net@FreeBSD.org
Current FreeBSD problem reports

Critical problems

S  Tracker      Resp.  Description
--------------------------------------------------------------------------
f  kern/115360  net    [ipv6] IPv6 address and if_bridge don't play well toge

1 problem total.

Serious problems

S  Tracker      Resp.  Description
--------------------------------------------------------------------------
a  kern/38554   net    changing interface ipaddress doesn't seem to work
s  kern/39937   net    ipstealth issue
f  kern/62374   net    panic: free: multiple frees
s  kern/81147   net    [net] [patch] em0 reinitialization while adding aliase
o  kern/92552   net    A serious bug in most network drivers from 5.X to 6.X
s  kern/95665   net    [if_tun] "ping: sendto: No buffer space available" wit
s  kern/105943  net    Network stack may modify read-only mbuf chain copies
o  kern/106316  net    [dummynet] dummynet with multipass ipfw drops packets
o  kern/108542  net    [bce]: Huge network latencies with 6.2-RELEASE / STABL
o  kern/110959  net    [ipsec] Filtering incoming packets with enc0 does not
o  kern/112528  net    [nfs] NFS over TCP under load hangs with "impossible p
o  kern/112686  net    [patm] patm driver freezes System (FreeBSD 6.2-p4) i38
o  kern/112722  net    IP v4 udp fragmented packet reject
o  kern/113457  net    [ipv6] deadlock occurs if a tunnel goes down while the
o  kern/113842  net    [ipv6] PF_INET6 proto domain state can't be cleared wi
o  kern/114714  net    [gre][patch] gre(4) is not MPSAFE and does not support
o  kern/114839  net    [fxp] fxp looses ability to speak with traffic
o  kern/115239  net    [ipnat] panic with 'kmem_map too small' using ipnat
o  kern/116077  net    6.2-STABLE panic during use of multi-cast networking c
o  kern/116172  net    Network / ipv6 recursive mutex panic
o  kern/116185  net    if_iwi driver leads system to reboot
o  kern/116328  net    [bge]: Solid hang with bge interface
o  kern/116747  net    [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile
o  kern/116837  net    ifconfig tunX destroy: panic
o  kern/117271  net    [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap
o  kern/117423  net    Duplicate IP on different interfaces
o  bin/117448   net    [carp] 6.2 kernel crash
o  kern/117717  net    [panic] Kernel panic with Bittorrent client.

28 problems total.

Non-critical problems

S  Tracker      Resp.  Description
--------------------------------------------------------------------------
o  conf/23063   net    [PATCH] for static ARP tables in rc.network
s  bin/41647    net    ifconfig(8) doesn't accept lladdr along with inet addr
o  kern/54383   net    [nfs] [patch] NFS root configurations without dynamic
s  kern/60293   net    FreeBSD arp poison patch
o  kern/95267   net    packet drops periodically appear
f  kern/95277   net    [netinet] [patch] IP Encapsulation mask_match() return
o  kern/100519  net    [netisr] suggestion to fix suboptimal network polling
o  kern/102035  net    [plip] plip networking disables parallel port printing
o  conf/102502  net    [patch] ifconfig name does't rename netgraph node in n
o  conf/107035  net    [patch] bridge interface given in rc.conf not taking a
o  kern/112654  net    [pcn] Kernel panic upon if_pcn module load on a Netfin
o  kern/114915  net    [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f
o  bin/116643   net    [patch] fstat(1): add INET/INET6 socket details as in
o  bin/117339   net    [patch] route(8): loading routing management commands
o  kern/118722  net    [tcp] Many old TCP connections in SYN_RCVD state
o  kern/118727  net    [ng] [patch] add new ng_pf module

16 problems total.
Re: WOL support in Broadcom 5721 (57XX)
Krishna Kumar wrote:
> Is this in the list of todo's??  Can this feature not be supported due
> to design issues?  Is somebody trying this out somewhere?  Please do
> copy me on the reply as I am not subscribed to the list.

Look in the freebsd-hackers@ mail archive.  In the previous month there
was a discussion about WOL support.  Look at these topics:

1. FreeBSD WOL sis on
2. How to add wake on lan support for your card

And as I remember, Sam Leffler has done some work on WOL support.

--
WBR, Andrey V. Elsukov
Re: bridge and stp defaults
On Dec 17, 2007 3:23 AM, Ivo Vachkov <[EMAIL PROTECTED]> wrote:
> On Dec 14, 2007 1:24 AM, Niki Denev <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Is there a reason that when adding member ports to a bridge stp is not
> > enabled by default on them?
> > Wouldn't it be more intuitive to be enabled by default these days?
>
> There are several reasons not to enable STP on a bridge port unless
> you're absolutely aware of what's happening here:
>
> http://unilans.net/phrack/61/p61-0x0c_Fun_with_Spanning_Tree_Protocol.txt
>
> > Regards,
> > Niki
>
> --
> "UNIX is basically a simple operating system, but you have to be a
> genius to understand the simplicity."  Dennis Ritchie

I was asking this question because all of the ethernet switches that I
have worked with (Cisco/3Com) have R/STP enabled by default (if they
support it, of course).
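For anyone who does want per-port STP, it is switched on when the member
is added (or later) with ifconfig; this example is adapted from
if_bridge(4), with the interface names as placeholders:

ifconfig bridge0 create
ifconfig bridge0 addm fxp0 stp fxp0 addm fxp1 stp fxp1 up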
using netgraph to create a pair of pseudo ethernet interface
Dear All,

Anyone know how to use netgraph to create a pair of pseudo ethernet
interfaces?  The packets go out from one and in the other.

Thanks a lot
Zhouyi Zhou
Re: Packet loss every 30.999 seconds
Back to back test with no ethernet switch between two em interfaces, same
result.  The receiving side has been up > 1 day and exhibits the problem.
These are also two different servers.  The small gettimeofday() syscall
tester also shows the same ~30 second pattern of high latency between
syscalls.

Receiver test application reports 3699 missed packets.

Sender netstat -i:

(before test)
em1  1500  00:04:23:cf:51:b7         20  0  15975785  0  0
em1  1500  10.1/24     10.1.0.237     -     15975801  -  -

(after test)
em1  1500  00:04:23:cf:51:b7         22  0  25975822  0  0
em1  1500  10.1/24     10.1.0.239     -     25975838  -  -

total IP packets sent during test = end - start
  25975838 - 15975801 = 10000037  (expected, 10,000,000 packet test + overhead)

Receiver netstat -i:

(before test)
em1  1500  00:04:23:c4:cc:89  15975785  0  21  0  0
em1  1500  10.1/24  10.1.0.1  15969626  -  19  -  -

(after test)
em1  1500  00:04:23:c4:cc:89  25975822  0  23  0  0
em1  1500  10.1/24  10.1.0.1  25965964  -  21  -  -

total ethernet frames received during test = end - start
  25975822 - 15975785 = 10000037  (as expected)

total IP packets processed during test = end - start
  25965964 - 15969626 = 9996338  (expecting 10000037)

Missed packets = expected - received
  10000037 - 9996338 = 3699

netstat -i accounts for the 3699 missed packets also reported by the
application.

Looking closer at the tester output again shows the periodic ~30 second
windows of packet loss.  There's a second problem here in that packets
are just disappearing before they make it to ip_input(), or there's a
dropped packets counter I've not found yet.

I can provide remote access to anyone who wants to take a look, this is
very easy to duplicate.  The ~1 day uptime before the behavior surfaces
is not making this easy to isolate.

--
mark

On Dec 17, 2007, at 12:43 AM, Jeremy Chadwick wrote:

> On Mon, Dec 17, 2007 at 12:21:43AM -0500, Mark Fullmer wrote:
>> While trying to diagnose a packet loss problem in a RELENG_6 snapshot
>> dated November 8, 2007 it looks like I've stumbled across a broken
>> driver or kernel routine which stops interrupt processing long enough
>> to severely degrade network performance every 30.99 seconds.  Packets
>> appear to make it as far as ether_input() then get lost.
>
> Are you sure this isn't being caused by something the switch is doing,
> such as MAC/ARP cache clearing or LACP?  I'm just speculating, but it
> would be worthwhile to remove the switch from the picture (crossover
> cable to the rescue).
>
> I know that at least in the case of fxp(4) and em(4), Jack Vogel does
> some thorough testing of throughput using a professional/high-end
> packet generator (some piece of hardware, I forget the name...)
>
> --
> | Jeremy Chadwick                                  jdc at parodius.com |
> | Parodius Networking                         http://www.parodius.com/ |
> | UNIX Systems Administrator                    Mountain View, CA, USA |
> | Making life hard for others since 1977.               PGP: 4BD6C0CB |
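The gettimeofday() syscall tester Mark mentions is not posted in the
thread; for anyone who wants to reproduce the measurement, a minimal
sketch of such a tool could look like this (the 1 ms reporting threshold
is an assumption):

#include <stdio.h>
#include <sys/time.h>

int
main(void)
{
	struct timeval prev, now;
	long delta;

	gettimeofday(&prev, NULL);
	for (;;) {
		gettimeofday(&now, NULL);
		delta = (now.tv_sec - prev.tv_sec) * 1000000L +
		    (now.tv_usec - prev.tv_usec);
		if (delta > 1000)	/* report gaps longer than 1 ms */
			printf("%ld.%06ld  gap %ld us\n",
			    (long)now.tv_sec, (long)now.tv_usec, delta);
		prev = now;
	}
	return (0);
}

Left running on the receiver, gaps in its output that line up with the
~31 second packet loss windows would point at the same stall the syncer
patch is meant to break up.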
Re: using netgraph to create a pair of pseudo ethernet interface
On Monday 17 December 2007 14:15:45 zhouyi zhou wrote:
> Dear All,
> Anyone know how to use netgraph to create a pair of pseudo ethernet
> interfaces?  The packets go out from one and in the other.

tpx32# ngctl mkpeer eiface ether ether
tpx32# ngctl mkpeer eiface ether ether
tpx32# ngctl l
There are 3 total nodes:
  Name: ngctl1446       Type: socket        ID: 0006   Num hooks: 0
  Name: ngeth1          Type: eiface        ID: 0005   Num hooks: 0
  Name: ngeth0          Type: eiface        ID: 0003   Num hooks: 0
tpx32# ngctl connect ngeth0: ngeth1: ether ether
tpx32# ngctl l
There are 3 total nodes:
  Name: ngctl1448       Type: socket        ID: 0008   Num hooks: 0
  Name: ngeth1          Type: eiface        ID: 0005   Num hooks: 1
  Name: ngeth0          Type: eiface        ID: 0003   Num hooks: 1
tpx32#
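From there the two interfaces are configured like any other; the
commands below are an illustrative continuation (the MAC and IP
addresses are made up, and the ng_eiface "set" message for assigning a
link-level address may vary by release -- see ng_eiface(4)):

tpx32# ngctl msg ngeth0: set 00:00:5e:00:01:00
tpx32# ngctl msg ngeth1: set 00:00:5e:00:01:01
tpx32# ifconfig ngeth0 10.0.0.1/24 up
tpx32# ifconfig ngeth1 10.0.0.2/24 up

Since both ends live on the same host, such a pair is typically hooked
up to other netgraph nodes or bridged rather than pinged across
directly.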
Re: Deadlock in the routing code
Maxime Henrion wrote:
> Julian Elischer wrote:
> > Gleb Smirnoff wrote:
> > > On Thu, Dec 13, 2007 at 10:33:25AM -0800, Julian Elischer wrote:
> > > J> Maxime Henrion wrote:
> > > J> > Replying to myself on this one, sorry about that.
> > > J> > I said in my previous mail that I didn't know yet what process
> > > J> > was holding the lock of the rtentry that the routed process is
> > > J> > dealing with in rt_setgate(), and I just could verify that it is
> > > J> > held by the swi1: net thread.
> > > J> > So, in a nutshell:
> > > J> > - The routed process does its business on the routing socket,
> > > J> >   that ends up calling rt_setgate().  While in rt_setgate() it
> > > J> >   drops the lock on its rtentry in order to call rtalloc1().  At
> > > J> >   this point, the routed process holds the gateway route
> > > J> >   (rtalloc1() returns it locked), and it now tries to re-lock
> > > J> >   the original rtentry.
> > > J> > - At the same time, the swi net thread calls arpresolve() which
> > > J> >   ends up calling rt_check().  Then rt_check() locks the
> > > J> >   rtentry, and tries to lock the gateway route.
> > > J> > A classical case of deadlock with mutexes because of different
> > > J> > locking order.  Now, it's not obvious to me how to fix it :-).
> > > J>
> > > J> On failure to re-lock, the routed call to rt_setgate should
> > > J> completely abort and restart from scratch, releasing all locks it
> > > J> has on the way out.
> > >
> > > Do you suggest mtx_trylock?
> >
> > I think that would be the cleanest way..
>
> So, here's what I've got.  I have yet to test it at all, I hope that
> I'll be able to do so today, or tomorrow.  Any input appreciated.
>
> Cheers,
> Maxime

this code is I think (from memory) called only from the user right?

it is possible that on failure to lock one might delay for 1 tick or
something..  (I don't have the code in front of me right now)

otherwise I think that might do the job..  more comments later.
Re: using netgraph to create a pair of pseudo ethernet interface
zhouyi zhou wrote:
> Anyone know how to use netgraph to create a pair of pseudo ethernet
> interfaces?  The packets go out from one and in the other.

man 4 ng_eiface, man 4 netgraph ?

--
Alexander Motin
Re: Packet loss every 30.999 seconds
Thanks.  Have a kernel building now.  It takes about a day of uptime
after reboot before I'll see the problem.

--
mark

On Dec 17, 2007, at 5:24 AM, David G Lawrence wrote:

>> While trying to diagnose a packet loss problem in a RELENG_6 snapshot
>> dated November 8, 2007 it looks like I've stumbled across a broken
>> driver or kernel routine which stops interrupt processing long enough
>> to severely degrade network performance every 30.99 seconds.
>
> I noticed this as well some time ago.  The problem has to do with the
> processing (syncing) of vnodes.  When the total number of allocated
> vnodes in the system grows to tens of thousands, the ~31 second
> periodic sync process takes a long time to run.  Try this patch and let
> people know if it helps your problem.  It will periodically wait for
> one tick (1ms) every 500 vnodes of processing, which will allow other
> things to run.
>
> Index: ufs/ffs/ffs_vfsops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/ufs/ffs/ffs_vfsops.c,v
> retrieving revision 1.290.2.16
> diff -c -r1.290.2.16 ffs_vfsops.c
> *** ufs/ffs/ffs_vfsops.c	9 Oct 2006 19:47:17 -0000	1.290.2.16
> --- ufs/ffs/ffs_vfsops.c	25 Apr 2007 01:58:15 -0000
> ***************
> *** 1109,1114 ****
> --- 1109,1115 ----
>   	int softdep_deps;
>   	int softdep_accdeps;
>   	struct bufobj *bo;
> + 	int flushed_count = 0;
>   
>   	fs = ump->um_fs;
>   	if (fs->fs_fmod != 0 && fs->fs_ronly != 0) {	/* XXX */
> ***************
> *** 1174,1179 ****
> --- 1175,1184 ----
>   			allerror = error;
>   		vput(vp);
>   		MNT_ILOCK(mp);
> + 		if (flushed_count++ > 500) {
> + 			flushed_count = 0;
> + 			msleep(&flushed_count, MNT_MTX(mp), PZERO, "syncw", 1);
> + 		}
>   	}
>   	MNT_IUNLOCK(mp);
>   	/*
>
> -DG
>
> David G. Lawrence
> President
> Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
> The FreeBSD Project - http://www.freebsd.org
> Pave the road of life with opportunities.
Re: Packet loss every 30.999 seconds
On Mon, 17 Dec 2007, David G Lawrence wrote:

>> While trying to diagnose a packet loss problem in a RELENG_6 snapshot
>> dated November 8, 2007 it looks like I've stumbled across a broken
>> driver or kernel routine which stops interrupt processing long enough
>> to severely degrade network performance every 30.99 seconds.

I see the same behaviour under a heavily modified version of FreeBSD-5.2
(except the period was 2 ms longer and the latency was 7 ms instead of
11 ms when numvnodes was at a certain value).  Now with numvnodes =
17500, the latency is 3 ms.

> I noticed this as well some time ago.  The problem has to do with the
> processing (syncing) of vnodes.  When the total number of allocated
> vnodes in the system grows to tens of thousands, the ~31 second
> periodic sync process takes a long time to run.  Try this patch and let
> people know if it helps your problem.  It will periodically wait for
> one tick (1ms) every 500 vnodes of processing, which will allow other
> things to run.

However, the syncer should be running at a relatively low priority and
not cause packet loss.  I don't see any packet loss even in ~5.2 where
the network stack (but not drivers) is still Giant-locked.

Other too-high latencies showed up:

- syscons LED setting and vt switching gives a latency of 5.5 msec
  because syscons still uses busy-waiting for setting LEDs :-(.  Oops, I
  do see packet loss -- this causes it under ~5.2 but not under -current.
  For the bge and/or em drivers, the packet loss shows up in netstat
  output as a few hundred errors for every LED setting on the receiving
  machine, while receiving tiny packets at the maximum possible rate of
  640 kpps.  sysctl is completely Giant-locked and so are upper layers
  of the network stack.  The bge hardware rx ring size is 256 in
  -current and 512 in ~5.2.  At 640 kpps, 512 packets take 800 us, so
  bge wants to call the upper layers with a latency of far below 800 us.
  I don't know exactly where the upper layers block on Giant.

- a user CPU hog process gives a latency of over 200 ms every half a
  second or so when the hog starts up, and 300-400 ms after the hog has
  been running for some time.  Two user CPU hog processes double the
  latency.  Reducing kern.sched.quantum from 100 ms to 10 ms and/or
  renicing the hogs don't seem to affect this.  Running the hogs at idle
  priority fixes this.  This won't affect packet loss, but it might
  affect user network processes -- they might need to run at real time
  priority to get low enough latency.  They might need to do this
  anyway -- a scheduling quantum of 100 ms should give a latency of
  100 ms per CPU hog quite often, though not usually since the hogs
  should never be preferred to a higher-priority process.

Previously I've used a less specialized clock-watching program to
determine the syscall latency.  It showed similar problems for CPU hogs.
I just remembered that I found the fix for these under ~5.2 -- remove a
local hack that sacrifices latency for reduced context switches between
user threads.  -current with SCHED_4BSD does this non-hackishly, but
seems to have a bug somewhere that gives a latency that is large enough
to be noticeable in interactive programs.

Bruce
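The priority changes Bruce mentions can be tried with the stock
idprio(1) and rtprio(1) utilities; the program names below are
placeholders for whatever hog or latency-sensitive process is being
tested:

idprio 31 ./cpuhog     # run the CPU hog in the idle class, lowest priority
rtprio 10 ./netapp     # run a latency-sensitive process in the real-time class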
Re: Packet loss every 30.999 seconds
On Mon, 17 Dec 2007, David G Lawrence wrote:

> One more comment on my last email...  The patch that I included is not
> meant as a real fix - it is just a bandaid.  The real problem appears
> to be that a very large number of vnodes (all of them?) are getting
> synced (i.e. calling ffs_syncvnode()) every time.  This should normally
> only happen for dirty vnodes.  I suspect that something is broken with
> this check:
>
>		if (vp->v_type == VNON || ((ip->i_flag &
>		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
>		    vp->v_bufobj.bo_dirty.bv_cnt == 0)) {
>			VI_UNLOCK(vp);
>			continue;
>		}

Isn't it just the O(N) algorithm with N quite large?  Under ~5.2, on a
2.2GHz A64 UP in 32-bit mode, I see a latency of 3 ms for 17500 vnodes,
which would be explained by the above (and the VI_LOCK() and loop
overhead) taking 171 ns per vnode.  I would expect it to take more like
20 ns per vnode for UP and 60 for SMP.

The comment before this code shows that the problem is known, and says
that a subroutine call cannot be afforded unless there is work to do,
but then the locking accesses look like subroutine calls, have
subroutine calls in their internals, and take longer than simple
subroutine calls in the SMP case even when they don't make subroutine
calls.  (IIRC, on A64 a minimal subroutine call takes 4 cycles while a
minimal locked instruction takes 18 cycles; subroutine calls are only
slow when their branches are mispredicted.)

Bruce
Re: Packet loss every 30.999 seconds
Bruce Evans wrote:
> On Mon, 17 Dec 2007, David G Lawrence wrote:
>
>> One more comment on my last email...  The patch that I included is not
>> meant as a real fix - it is just a bandaid.  The real problem appears
>> to be that a very large number of vnodes (all of them?) are getting
>> synced (i.e. calling ffs_syncvnode()) every time.  This should
>> normally only happen for dirty vnodes.  I suspect that something is
>> broken with this check:
>>
>>		if (vp->v_type == VNON || ((ip->i_flag &
>>		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
>>		    vp->v_bufobj.bo_dirty.bv_cnt == 0)) {
>>			VI_UNLOCK(vp);
>>			continue;
>>		}
>
> Isn't it just the O(N) algorithm with N quite large?  Under ~5.2, on a
> 2.2GHz A64 UP in 32-bit mode, I see a latency of 3 ms for 17500 vnodes,
> which would be explained by the above (and the VI_LOCK() and loop
> overhead) taking 171 ns per vnode.  I would expect it to take more like
> 20 ns per vnode for UP and 60 for SMP.
>
> The comment before this code shows that the problem is known, and says
> that a subroutine call cannot be afforded unless there is work to do,
> but then the locking accesses look like subroutine calls, have
> subroutine calls in their internals, and take longer than simple
> subroutine calls in the SMP case even when they don't make subroutine
> calls.  (IIRC, on A64 a minimal subroutine call takes 4 cycles while a
> minimal locked instruction takes 18 cycles; subroutine calls are only
> slow when their branches are mispredicted.)
>
> Bruce

Right, it's a non-optimal loop when N is very large, and that's a fairly
well understood problem.  I think what DG was getting at, though, is
that this massive flush happens every time the syncer runs, which
doesn't seem correct.  Sure, maybe you just rsynced 100,000 files 20
seconds ago, so the upcoming flush is going to be expensive.  But the
next flush 30 seconds after that shouldn't be just as expensive, yet it
appears to be so.  This is further supported by the original poster's
claim that it takes many hours of uptime before the problem becomes
noticeable.  If vnodes are never truly getting cleaned, or never getting
their flags cleared so that this loop knows that they are clean, then
it's feasible that they'll accumulate over time, keep on getting flushed
every 30 seconds, keep on bogging down the loop, and so on.

Scott