I have some more tests, but always the same.
Here is the top output, that i see last on terminal:
65 processes: 4 running, 144 sleeping, 17 waiting
CPU states: 0.0% user, 0.0% nice, 29.9% system, 7.8% interrupt, 62.3%
idle
Mem: 46M Active, 522M Inact, 350M Wired, 10M Cache, 109M Buf, 35M Free
Swap: 2048M Total, 228K Used, 2048M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
12 root 1 171 ki31 0K 16K RUN 0 692:10 61.96% idle: cpu0
11 root 1 171 ki31 0K 16K RUN 1 804:37 51.51% idle: cpu1
13 root 1 -44 - 0K 16K WAIT 0 325:41 13.18% swi1: net
42 root 1 -68 - 0K 16K - 0 12:26 12.74% dummynet
72368 ftp 1 4 0 10832K 2080K RUN 0 4:05 10.01% vsftpd
74488 ftp 1 4 0 10832K 2080K zfs:(& 0 0:31 9.62% vsftpd
73690 ftp 1 8 0 10832K 2072K nanslp 0 1:43 8.64% vsftpd
74752 ftp 1 8 0 10832K 2068K nanslp 1 0:09 6.06% vsftpd
74811 ftp 1 8 0 10832K 2072K nanslp 1 0:03 3.79% vsftpd
74869 ftp 1 4 0 10832K 2056K sbwait 1 0:00 3.39% vsftpd
As you can see, there is no high load, enought free memory and CPUs are
relative idle.
This is the output from nload -o 250000:
Device em0 [10.252.2.3] (1/1):
============================================================================================================================================
Incoming:
.. .| .|. . .. . ||...|.... .........|
. . .. Curr: 2519.37 kBit/s
####|||######.###################### ############... .....
..||... .|. ##|#|####. Avg: 2831.76 kBit/s
####################################|# .. |. |
######################|#######||############## Min: 141.74 kBit/s
#########################################.##.#|##############################################
Max: 6586.57 kBit/s
#############################################################################################
Ttl: 542932224.00 MByte
Outgoing:
.
.. # .. . . ##.. .... . .
# . # .#
## ## ### # .##.|| #########||
|###|#####.. |#.#.#|##
####||.######.###################### ############... | .
..# .| |. #########.
#################################### ###############.####|#
####|## ###|##########
####################################.# . .
##############################||##############
###################################### || ## #
############################################## Curr: 127802.60 kBit/s
######################################|## ##
#|############################################## Avg: 144300.05 kBit/s
#########################################|###################################################
Min: 3339.92 kBit/s
#############################################################################################
Max: 292397.75 kBit/s
#############################################################################################
Ttl: 14977425408.00 MByte
There is no rapid change of bw or anything.
Kris Kennaway wrote:
Adding back the mailing list so others can help.
matthew wrote:
Kris Kennaway wrote:
matthew wrote:
I have posted before that i have a stability issue with the 7.0 branch
on my servers. Tested on BETA2,BETA4,RC1,RC2,RELEASE
The original thread and my post with details is at:
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-12/msg00674.html
I was waiting for the 7.0-RELEASE, updated the whole servers, and
enabled dummynet again, but it always freezes after some minutes, 100%
reproducible.
I tested it also on a HP 140 G3 1U server, where 6.3 has absolutely no
problems, but the 7.0 branch keeps freezing.
Again, if it helps to solve this bug, i can rebuild the kernel with
debug symbols a take some screenshots :)
Please follow the steps at
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
Kris
_______________________________________________
I added to the kernel these options:
makeoptions DEBUG=-g # Build kernel with gdb(1)
debug symbols
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
options DIAGNOSTIC
options KDB
options DDB
options GDB
options SOCKBUF_DEBUG
But still, the server freezes withoutany debug message,
And you can't break to DDB? It is expected you will have to when you
encounter a deadlock.
No, i am unable to do anything, no input from keyboard, the only one
thing thats work, is to change the num lock indicator on the keyboard :)
> panic whatever,
but at leasts it dumps the crash kernel to /var/crash, but the debug
looks like this:
Dumping is an activity the occurs after panicking, so if you have a
dump then you must have a panic.
[EMAIL PROTECTED]:/usr/src/sys/amd64/compile/HANKA-debug# kgdb kernel.debug
/var/crash/vmcore.0
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for
details.
This GDB was configured as "amd64-marcel-freebsd".
Cannot access memory at address 0x57
(kgdb) backtrace
#0 0x0000000000000000 in ?? ()
Cannot access memory at address 0x0
(kgdb) I have some debug messages
at boot time:
This usually means you are not running the dump against the
kernel.debug corresponding exactly to the kernel that crashed.
Probably the dump you found was old.
Yes, i think it is possible that it is the previous kernel, without the
debug options.
Feb 28 17:01:29 hanka lock order reversal:
Feb 28 17:01:29 hanka 1st 0xffffffff80b0e1e8 PFil hook read/write
mutex (PFil hook read/write mutex) @ net/pfil.c:73
Feb 28 17:01:29 hanka 2nd 0xffffffff80b0f8e8 udp (udp) @
/usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:2970
Feb 28 17:01:29 hanka KDB: stack backtrace:
Feb 28 17:01:29 hanka db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Feb 28 17:01:29 hanka witness_checkorder() at witness_checkorder+0x605
Feb 28 17:01:29 hanka _mtx_lock_flags() at _mtx_lock_flags+0x75
Feb 28 17:01:29 hanka pf_socket_lookup() at pf_socket_lookup+0x241
Feb 28 17:01:29 hanka pf_test_udp() at pf_test_udp+0x890
Feb 28 17:01:29 hanka pf_test() at pf_test+0xeb2
Feb 28 17:01:29 hanka pf_check_in() at pf_check_in+0x2b
Feb 28 17:01:29 hanka pfil_run_hooks() at pfil_run_hooks+0xbc
Feb 28 17:01:29 hanka ip_input() at ip_input+0x2c4
Feb 28 17:01:29 hanka ether_demux() at ether_demux+0x1d9
Feb 28 17:01:29 hanka ether_input() at ether_input+0x19d
Feb 28 17:01:29 hanka ether_demux() at ether_demux+0xe9
Feb 28 17:01:29 hanka ether_input() at ether_input+0x19d
Feb 28 17:01:29 hanka em_rxeof() at em_rxeof+0x1ca
Feb 28 17:01:29 hanka em_poll() at em_poll+0x6f
Feb 28 17:01:29 hanka netisr_poll() at netisr_poll+0x87
Feb 28 17:01:29 hanka swi_net() at swi_net+0xea
Feb 28 17:01:29 hanka ithread_loop() at ithread_loop+0xda
Feb 28 17:01:29 hanka fork_exit() at fork_exit+0x12a
Feb 28 17:01:29 hanka fork_trampoline() at fork_trampoline+0xe
Feb 28 17:01:29 hanka --- trap 0, rip = 0, rsp = 0xffffffff9feead30,
rbp = 0 ---
Feb 28 17:01:29 hanka lock order reversal:
Feb 28 17:01:29 hanka 1st 0xffffff0001237430 em0 (EM Core Lock) @
dev/em/if_em.c:1416
Feb 28 17:01:29 hanka 2nd 0xffffffff80b0f8e8 udp (udp) @
netinet/udp_usrreq.c:385
Feb 28 17:01:29 hanka KDB: stack backtrace:
Feb 28 17:01:29 hanka db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Feb 28 17:01:29 hanka witness_checkorder() at witness_checkorder+0x605
Feb 28 17:01:29 hanka _mtx_lock_flags() at _mtx_lock_flags+0x75
Feb 28 17:01:29 hanka udp_input() at udp_input+0x1a4
Feb 28 17:01:29 hanka ip_input() at ip_input+0xc0
Feb 28 17:01:29 hanka ether_demux() at ether_demux+0x1d9
Feb 28 17:01:29 hanka ether_input() at ether_input+0x19d
Feb 28 17:01:29 hanka ether_demux() at ether_demux+0xe9
Feb 28 17:01:29 hanka ether_input() at ether_input+0x19d
Feb 28 17:01:29 hanka em_rxeof() at em_rxeof+0x1ca
Feb 28 17:01:29 hanka em_poll() at em_poll+0x6f
Feb 28 17:01:29 hanka netisr_poll() at netisr_poll+0x87
Feb 28 17:01:29 hanka swi_net() at swi_net+0xea
Feb 28 17:01:29 hanka ithread_loop() at ithread_loop+0xda
Feb 28 17:01:29 hanka fork_exit() at fork_exit+0x12a
Feb 28 17:01:29 hanka fork_trampoline() at fork_trampoline+0xe
Feb 28 17:01:29 hanka --- trap 0, rip = 0, rsp = 0xffffffff9feead30,
rbp = 0 ---
Feb 28 17:01:29 hanka xinetd[1356]: xinetd Version 2.3.14 started
with libwrap loadavg options compiled in.
Feb 28 17:01:29 hanka xinetd[1356]: Started working: 1 available service
Feb 28 17:01:33 hanka lock order reversal:
Feb 28 17:01:33 hanka 1st 0xffffff00042f5b10 inp (tcpinp) @
netinet/tcp_usrreq.c:781
Feb 28 17:01:33 hanka 2nd 0xffffffff80b0e1e8 PFil hook read/write
mutex (PFil hook read/write mutex) @ net/pfil.c:73
Feb 28 17:01:33 hanka KDB: stack backtrace:
Feb 28 17:01:33 hanka db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Feb 28 17:01:33 hanka witness_checkorder() at witness_checkorder+0x605
Feb 28 17:01:33 hanka _rw_rlock() at _rw_rlock+0x5b
Feb 28 17:01:33 hanka pfil_run_hooks() at pfil_run_hooks+0x44
Feb 28 17:01:33 hanka ip_output() at ip_output+0x38e
Feb 28 17:01:33 hanka tcp_output() at tcp_output+0xacd
Feb 28 17:01:33 hanka tcp_usr_send() at tcp_usr_send+0x272
Feb 28 17:01:33 hanka sosend_generic() at sosend_generic+0x347
Feb 28 17:01:33 hanka soo_write() at soo_write+0x38
Feb 28 17:01:33 hanka dofilewrite() at dofilewrite+0x85
Feb 28 17:01:33 hanka kern_writev() at
Feb 28 17:01:33 hanka kern_writev+0x4c
Feb 28 17:01:33 hanka write() at write+0x54
Feb 28 17:01:33 hanka syscall() at syscall+0x1f6
Feb 28 17:01:33 hanka Xfast_syscall() at Xfast_syscall+0xab
Feb 28 17:01:33 hanka --- syscall (4, FreeBSD ELF64, write), rip =
0x800e1fd3c, rsp = 0x7fffffffea88, rbp = 0x16 ---
Feb 28 17:03:54 hanka lock order reversal:
Feb 28 17:03:54 hanka 1st 0xffffffff80b0f308 tcp (tcp) @
netinet/tcp_input.c:400
Feb 28 17:03:54 hanka 2nd 0xffffffff80b0e1e8 PFil hook read/write
mutex (PFil hook read/write mutex) @ net/pfil.c:73
Feb 28 17:03:54 hanka KDB: stack backtrace:
Feb 28 17:03:54 hanka db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Feb 28 17:03:54 hanka witness_checkorder() at witness_checkorder+0x605
Feb 28 17:03:54 hanka _rw_rlock() at _rw_rlock+0x5b
Feb 28 17:03:54 hanka pfil_run_hooks() at pfil_run_hooks+0x44
Feb 28 17:03:54 hanka ip_output() at ip_output+0x38e
Feb 28 17:03:54 hanka tcp_respond() at tcp_respond+0x2d5
Feb 28 17:03:54 hanka tcp_dropwithreset() at tcp_dropwithreset+0x131
Feb 28 17:03:54 hanka tcp_input() at tcp_input+0x6d0
Feb 28 17:03:54 hanka ip_input() at ip_input+0xc0
Feb 28 17:03:54 hanka ether_demux() at ether_demux+0x1d9
Feb 28 17:03:54 hanka ether_input() at ether_input+0x19d
Feb 28 17:03:54 hanka em_rxeof() at em_rxeof+0x1ca
Feb 28 17:03:54 hanka em_poll() at em_poll+0x6f
Feb 28 17:03:54 hanka netisr_poll() at netisr_poll+0x87
Feb 28 17:03:54 hanka swi_net() at swi_net+0xea
Feb 28 17:03:54 hanka ithread_loop() at ithread_loop+0xda
Feb 28 17:03:54 hanka fork_exit() at fork_exit+0x12a
Feb 28 17:03:54 hanka fork_trampoline() at fork_trampoline+0xe
Feb 28 17:03:54 hanka --- trap 0, rip = 0, rsp = 0xffffffff9feead30,
rbp = 0 ---
Feb 28 17:23:36 hanka lock order reversal:
Feb 28 17:23:36 hanka 1st 0xffffffff810f9ef0 tcp_sc_head
(tcp_sc_head) @ netinet/tcp_syncache.c:477
Feb 28 17:23:36 hanka 2nd 0xffffffff80b0e1e8 PFil hook read/write
mutex (PFil hook read/write mutex) @ net/pfil.c:73
Feb 28 17:23:36 hanka KDB: stack backtrace:
Feb 28 17:23:36 hanka db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Feb 28 17:23:36 hanka witness_checkorder() at witness_checkorder+0x605
Feb 28 17:23:36 hanka _rw_rlock() at _rw_rlock+0x5b
Feb 28 17:23:36 hanka pfil_run_hooks() at pfil_run_hooks+0x44
Feb 28 17:23:36 hanka ip_output() at ip_output+0x38e
Feb 28 17:23:36 hanka syncache_respond() at syncache_respond+0x32a
Feb 28 17:23:36 hanka syncache_add() at syncache_add+0x206
Feb 28 17:23:36 hanka tcp_input() at tcp_input+0xa77
Feb 28 17:23:36 hanka ip_input() at ip_input+0xc0
Feb 28 17:23:36 hanka ether_demux() at ether_demux+0x1d9
Feb 28 17:23:36 hanka ether_input() at ether_input+0x19d
Feb 28 17:23:36 hanka em_rxeof() at em_rxeof+0x1ca
Feb 28 17:23:36 hanka em_poll() at em_poll+0x6f
Feb 28 17:23:36 hanka netisr_poll() at netisr_poll+0x87
Feb 28 17:23:36 hanka swi_net() at swi_net+0xea
Feb 28 17:23:36 hanka ithread_loop() at ithread_loop+0xda
Feb 28 17:23:36 hanka fork_exit() at fork_exit+0x12a
Feb 28 17:23:36 hanka fork_trampoline() at fork_trampoline+0xe
Feb 28 17:23:36 hanka --- trap 0, rip = 0, rsp = 0xffffffff9feead30,
rbp = 0 ---
These look useful and should point to the problem. It is also
probably related to something in your script:
I tried it with pf disabled, only running ipfw without dummynet,
everything is ok.
I tried it witout pf disabled, running ipfw with dummynet queue for
small traffic, everything is ok.
I tried it with/without pf disabled, running ipfw with dummynet queue
for hight traffic >100Mbit/s, the box freezes, everytime, after
enabling rules with the dummynet queue, after some second or few
minutes, maybe depending on the high traffic.
The shaper script:
# SHAPER
IPFW_PIPE_DOWNLOAD=1
IPFW_PIPE_UPLOAD=11
IPFW_QUEUE_DOWNLOAD=1
IPFW_QUEUE_UPLOAD=11
DOWNLOAD_ROOT="450Mbit/s"
UPLOAD_ROOT="450Mbit/s"
SHAPER_BUCKETS="1024"
IPFW_SHAPER_ROOT=1000
$CMD pipe $IPFW_PIPE_DOWNLOAD config bw $DOWNLOAD_ROOT buckets
$SHAPER_BUCKETS
$CMD pipe $IPFW_PIPE_UPLOAD config bw $UPLOAD_ROOT buckets
$SHAPER_BUCKETS
$CMD queue $IPFW_QUEUE_DOWNLOAD config pipe $IPFW_PIPE_DOWNLOAD
buckets $SHAPER_BUCKETS mask dst-ip 0xFFFFFFFF
$CMD queue $IPFW_QUEUE_UPLOAD config pipe $IPFW_PIPE_UPLOAD buckets
$SHAPER_BUCKETS mask src-ip 0xFFFFFFFF
$CMD delete $IPFW_SHAPER_ROOT
$CMD $IPFW_SHAPER_ROOT add queue $IPFW_QUEUE_DOWNLOAD ip from me to
any out // Share download
$CMD $IPFW_SHAPER_ROOT add queue $IPFW_QUEUE_UPLOAD ip from any to me
in // Share upload
Thanks for the information.
Kris
I hope i can help to solve this problem.
Matthew
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"