Greetings,

Thanks to everybody for the quick responses earlier. (I've also had another crack at my TMDA filter, so hopefully my reply address will work this time.)

Last time I forgot to mention that I was pulling the data files from a Compaq RAID system (ciss0: <HP Smart Array 6i>). I had a large number of files with random content, so there was a lot of waiting for disk. I've now set up an MFS with fewer files, and this seemed to bring back network stability.

I also adjusted the TCP windows (net.inet.tcp.sendspace=65536, net.inet.tcp.recvspace=65536), but once on the MFS I found no change when moving to the bigger window sizes (net.inet.tcp.sendspace=1024000, net.inet.tcp.recvspace=1024000).

I've found that the polling settings all seem to be tuned for 100Mb/s rather than gigabit, so I've edited /usr/src/sys/kern/kern_poll.c and increased the #define statements by a factor of at least 10:
Before:
#define MIN_POLL_BURST_MAX      10
#define MAX_POLL_BURST_MAX      1000
After:
#define MIN_POLL_BURST_MAX      1000
#define MAX_POLL_BURST_MAX      10000
Then set /etc/sysctl.conf to
--------------------
kern.polling.burst=5000
kern.polling.each_burst=1000
kern.polling.burst_max=8000
--------------------
Performance improved a lot, although I was still seeing kern.polling.short_ticks increasing rapidly. The comments in /usr/src/sys/kern/kern_poll.c say this means the poll rate is too high, so I dropped HZ from 15000 back to 10000, and the problem went away. The server under siege is now stable with 60 concurrent sessions, which it could not handle before. The processes also seem to be in "accept" rather than "lockf".
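As a rough sanity check on the 100Mb/s versus gigabit point, here is a small sketch that works out how many frames can arrive between two clock ticks at line rate. It assumes standard Ethernet framing overhead (8 bytes of preamble plus a 12-byte inter-frame gap on top of each frame), and the function name is made up for illustration. It is back-of-envelope arithmetic only, not a model of what kern_poll.c actually does.

```python
# Back-of-envelope: frames arriving per clock tick at Ethernet line rate.
# Assumption: each frame costs 20 extra bytes on the wire
# (8 bytes preamble/SFD + 12 bytes inter-frame gap).
WIRE_OVERHEAD_BYTES = 20


def frames_per_tick(link_bps, frame_bytes, hz):
    """Return the maximum number of frames that can arrive in one tick."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    frames_per_second = link_bps / bits_per_frame
    return frames_per_second / hz


if __name__ == "__main__":
    links = (("100Mb/s", 100_000_000), ("1Gb/s", 1_000_000_000))
    for label, bps in links:
        for frame in (64, 1518):        # minimum and maximum frame sizes
            for hz in (10000, 15000):   # the two HZ values tried above
                n = frames_per_tick(bps, frame, hz)
                print(f"{label:8} {frame:4}-byte frames, HZ={hz}: {n:7.1f} per tick")
```

With HZ=10000 this comes to roughly 149 minimum-size frames or about 8 full-size frames per tick at gigabit line rate, and a tenth of that at 100Mb/s, which gives a feel for how the per-tick work scales between the two link speeds.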
--------------------
last pid: 3469; load averages: 1.79, 1.70, 1.47 up 0+00:28:09 05:59:46
191 processes: 8 running, 183 sleeping
CPU states: 2.0% user, 0.0% nice, 32.6% system, 48.0% interrupt, 17.4% idle
Mem: 34M Active, 7180K Inact, 87M Wired, 29M Buf, 869M Free
Swap: 2023M Total, 2023M Free
PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
616 www        4    0  3420K  2152K sbwait 1   0:07  0.39%  0.39% httpd
3305 www        4    0  3432K  2160K accept 1   0:07  0.34%  0.34% httpd
690 www        4    0  3420K  2152K accept 1   0:06  0.34%  0.34% httpd
664 www        4    0  3436K  2172K accept 1   0:06  0.29%  0.29% httpd
633 www        4    0  3436K  2172K accept 1   0:06  0.29%  0.29% httpd
651 www        4    0  3436K  2172K RUN    1   0:06  0.24%  0.24% httpd
3390 www        4    0  3432K  2160K accept 0   0:05  0.24%  0.24% httpd
612 www        4    0  3436K  2172K accept 1   0:07  0.20%  0.20% httpd
631 www        4    0  3436K  2172K accept 1   0:07  0.20%  0.20% httpd
621 www        4    0  3436K  2172K accept 1   0:06  0.15%  0.15% httpd
697 www        4    0  3436K  2172K RUN    1   0:06  0.15%  0.15% httpd
3380 www        4    0  3432K  2160K sbwait 1   0:06  0.15%  0.15% httpd
3392 www        4    0  3432K  2160K accept 1   0:05  0.15%  0.15% httpd
3397 www        4    0  3432K  2160K RUN    1   0:05  0.15%  0.15% httpd
3376 www        4    0  3432K  2160K accept 1   0:05  0.15%  0.15% httpd
3383 www        4    0  3432K  2160K accept 1   0:05  0.15%  0.15% httpd
3315 www        4    0  3432K  2160K accept 0   0:07  0.10%  0.10% httpd
3309 www        4    0  3432K  2160K sbwait 1   0:07  0.10%  0.10% httpd
--------------------
This is another server under siege with the same configuration, but without the POLL_BURST_MAX tweaks and with HZ=15000.
--------------------
last pid: 24068; load averages: 13.54, 5.40, 4.63 up 0+02:59:04 17:19:11
233 processes: 4 running, 228 sleeping, 1 zombie
CPU states: 3.8% user, 0.0% nice, 31.8% system, 47.3% interrupt, 17.0% idle
Mem: 46M Active, 8396K Inact, 105M Wired, 48K Cache, 33M Buf, 838M Free
Swap: 2023M Total, 2023M Free
PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
4508 www        4    0  5040K  3256K sbwait 1   0:37  0.54%  0.54% httpd
4497 www        4    0  5040K  3256K sbwait 1   0:34  0.34%  0.34% httpd
4539 www        4    0  5040K  3256K sbwait 1   0:36  0.29%  0.29% httpd
4521 www       20    0  5040K  3256K lockf  1   0:34  0.29%  0.29% httpd
626 www        4    0  5040K  3252K sbwait 1   0:36  0.24%  0.24% httpd
4896 www       20    0  5040K  3256K lockf  1   0:35  0.24%  0.24% httpd
4522 www        4    0  5040K  3256K sbwait 0   0:34  0.24%  0.24% httpd
629 www       20    0  5040K  3252K lockf  1   0:35  0.20%  0.20% httpd
601 www        4    0  5040K  3252K sbwait 1   0:33  0.20%  0.20% httpd
600 www       20    0  5040K  3252K lockf  1   0:35  0.15%  0.15% httpd
674 www       20    0  5040K  3252K lockf  1   0:34  0.15%  0.15% httpd
4787 www        4    0  5040K  3256K sbwait 1   0:34  0.15%  0.15% httpd
669 www       20    0  5040K  3252K lockf  1   0:34  0.15%  0.15% httpd
4509 www       20    0  5040K  3256K lockf  1   0:32  0.15%  0.15% httpd
4486 www       20    0  5040K  3256K lockf  1   0:36  0.10%  0.10% httpd
4906 www       20    0  5040K  3256K lockf  1   0:36  0.10%  0.10% httpd
4542 www       20    0  5040K  3256K lockf  1   0:36  0.10%  0.10% httpd
607 www        4    0  5040K  3252K sbwait 1   0:35  0.10%  0.10% httpd
4510 www        4    0  5040K  3272K sbwait 1   0:35  0.10%  0.10% httpd
--------------------
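To compare the two listings without eyeballing them, a short script can tally the STATE column. The sketch below is only an illustration: the function name is made up, it assumes the 12-column SMP layout of top shown above (with the C column), and the sample text is a handful of lines copied from the two listings rather than the full output.

```python
from collections import Counter

# A few process lines copied from the two top listings above.
TUNED = """\
616 www        4    0  3420K  2152K sbwait 1   0:07  0.39%  0.39% httpd
3305 www        4    0  3432K  2160K accept 1   0:07  0.34%  0.34% httpd
651 www        4    0  3436K  2172K RUN    1   0:06  0.24%  0.24% httpd
"""
UNTUNED = """\
4508 www        4    0  5040K  3256K sbwait 1   0:37  0.54%  0.54% httpd
4521 www       20    0  5040K  3256K lockf  1   0:34  0.29%  0.29% httpd
4896 www       20    0  5040K  3256K lockf  1   0:35  0.24%  0.24% httpd
"""

# Assumed layout: PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
STATE_COLUMN = 6
MIN_FIELDS = 12


def count_states(top_output):
    """Count the STATE column over lines that look like process entries."""
    counts = Counter()
    for line in top_output.splitlines():
        fields = line.split()
        # Skip the summary and header lines: they either do not start
        # with a numeric PID or have fewer than 12 columns.
        if len(fields) >= MIN_FIELDS and fields[0].isdigit():
            counts[fields[STATE_COLUMN]] += 1
    return counts


if __name__ == "__main__":
    print("tuned:  ", dict(count_states(TUNED)))
    print("untuned:", dict(count_states(UNTUNED)))
```

Feeding it the complete listings instead of these excerpts would give the full accept/lockf/sbwait breakdown for each server.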

On both systems kern.polling.lost_polls is still increasing rapidly. I'm not sure what to do about this; any ideas?
--------------------
kern.polling.lost_polls: 9605569
--------------------
kern.polling.suspect is also increasing at a similar rate. I'm not sure what to do about this either.
------------------
kern.polling.suspect: 608527
------------------
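Since "increasing rapidly" is hard to compare between machines, one option is to take two samples of these counters a known number of seconds apart and work out the rate. The sketch below assumes the "name: value" format shown above; the function names are invented for illustration, collecting the samples is left out, and the second sample in the demo is a hypothetical number, not a measurement.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integer counters."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        # Ignore separator lines and anything without an integer value.
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def per_second(before, after, seconds):
    """Average increase per second for counters present in both samples."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in before
        if name in after
    }


if __name__ == "__main__":
    # First sample: the values quoted in this message.
    first = parse_sysctl(
        "kern.polling.lost_polls: 9605569\n"
        "kern.polling.suspect: 608527\n"
    )
    # Second sample: HYPOTHETICAL values, only to show the arithmetic.
    second = parse_sysctl(
        "kern.polling.lost_polls: 9615569\n"
        "kern.polling.suspect: 609527\n"
    )
    for name, rate in per_second(first, second, 10).items():
        print(f"{name}: {rate:.1f}/s")
```

Comparing the resulting per-second figures against the configured HZ on each machine would show whether the two counters grow at the same rate on both servers.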

Also, thanks for the info on the VLAN searching. I think the adjustment you suggested sounds good, but it's a bit out of my league. It seems there are still plenty of things to tweak in the kernel.

BTW, I'd be interested to know people's thoughts on multiple IP stacks on FreeBSD. It would be really cool to be able to give a jail its own IP stack bound to a VLAN interface. It could then be like a VRF on Cisco.
Regards,
Dave Seddon
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
