Re: System doesn't resume with active vbox VM

2013-03-19 Thread David Demelier
If the system reboots then you have a panic. Can you add dumpdev="AUTO" to
your /etc/rc.conf? Then please send us a kernel panic backtrace:

http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html


2013/3/18 Dominic Fandrey 

> My system doesn't resume with an active VirtualBox VM, in this case
> Windows XP/32. I didn't test any other systems.
>
> The system comes back to the console screen, but doesn't get back
> into X. After a couple of seconds (no dump occurs) I get the BIOS
> screen and the system reboots with unclean file systems.
>
> # uname -a
> FreeBSD mobileKamikaze.norad 9.1-STABLE FreeBSD 9.1-STABLE #3 r247136: Fri
> Feb 22 00:52:22 CET 2013 
> root@mobileKamikaze.norad:/usr/obj/HP6510b-9/amd64/usr/src/sys/HP6510b-9
>  amd64
>
> Hardware virtualization is turned on, the additions are installed
> in the VM.
>
> --
> A: Because it fouls the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing on usenet and in e-mail?
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>



-- 
Demelier David
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Core Dump / panic sleeping thread

2013-03-19 Thread Michael Landin Hostbaek
Hi, 

I am running a FreeBSD 9.1-REL system with GENERIC kernel:
FreeBSD x 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Fri Jan  4 12:28:48 CET 2013  
   root@x:/usr/obj/usr/src/sys/GENERIC  amd64


It is crashing a couple of times per week, without any real pattern. There are
no hints in the syslog, and I only have the core dump to work from...

It is a webserver using an NFS-mounted docroot (in case that helps) - here's the
backtrace:


This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
KDB: stack backtrace of thread 100256:
#0 0x808f2d46 at mi_switch+0x186
#1 0x8092bb52 at sleepq_wait+0x42
#2 0x808f34d6 at _sleep+0x376
#3 0x80b4f3ae at vm_object_page_remove+0x2ce
#4 0x80b5ac7d at vnode_pager_setsize+0x17d
#5 0x8082102c at nfscl_loadattrcache+0x2cc
#6 0x80818d37 at nfs_getattr+0x287
#7 0x8098f1c0 at vn_stat+0xb0
#8 0x809869d9 at kern_statat_vnhook+0xf9
#9 0x80986b55 at kern_statat+0x15
#10 0x80986c1a at sys_lstat+0x2a
#11 0x80bd7ae6 at amd64_syscall+0x546
#12 0x80bc3447 at Xfast_syscall+0xf7
panic: sleeping thread
cpuid = 0
KDB: stack backtrace:
#0 0x809208a6 at kdb_backtrace+0x66
#1 0x808ea8be at panic+0x1ce
#2 0x8092ed22 at propagate_priority+0x1d2
#3 0x8092fa4e at turnstile_wait+0x1be
#4 0x808d8d48 at _mtx_lock_sleep+0xd8
#5 0x80820fa4 at nfscl_loadattrcache+0x244
#6 0x8081758c at ncl_readrpc+0xac
#7 0x80824c45 at ncl_getpages+0x485
#8 0x80b5aa0c at vnode_pager_getpages+0x9c
#9 0x80b3fc93 at vm_fault_hold+0x673
#10 0x80b41cc3 at vm_fault+0x73
#11 0x80bd84b4 at trap_pfault+0x124
#12 0x80bd8c6c at trap+0x49c
#13 0x80bc315f at calltrap+0x8
Uptime: 8d0h54m10s
Dumping 2381 out of 24547 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from 
/boot/kernel/geom_mirror.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/geom_stripe.ko...Reading symbols from 
/boot/kernel/geom_stripe.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_stripe.ko
Reading symbols from /boot/kernel/if_em.ko...Reading symbols from 
/boot/kernel/if_em.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_em.ko
Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from 
/boot/kernel/linprocfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linux.ko...Reading symbols from 
/boot/kernel/linux.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linux.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0x808ea3a1 in kern_reboot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:448
#2  0x808ea897 in panic (fmt=0x1 ) at 
/usr/src/sys/kern/kern_shutdown.c:636
#3  0x8092ed22 in propagate_priority (td=Variable "td" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:227
#4  0x8092fa4e in turnstile_wait (ts=Variable "ts" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:743
#5  0x808d8d48 in _mtx_lock_sleep (m=0xfe044a3c8238, 
tid=18446741888664231936, opts=Variable "opts" is not available.
)
at /usr/src/sys/kern/kern_mutex.c:471
#6  0x80820fa4 in nfscl_loadattrcache (vpp=Variable "vpp" is not 
available.
) at /usr/src/sys/fs/nfsclient/nfs_clport.c:379
#7  0x8081758c in ncl_readrpc (vp=0xfe044a6cd780, 
uiop=0xff86962fc650, cred=Variable "cred" is not available.
)
at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:1369
#8  0x80824c45 in ncl_getpages (ap=0xff86962fc6f0) at 
/usr/src/sys/fs/nfsclient/nfs_clbio.c:171
#9  0x80b5aa0c in vnode_pager_getpages (object=0xfe016aa16570, 
m=0xff86962fc770, count=Variable "count" is not available.
)
at vnode_if.h:1154
#10 0x80b3fc93 in vm_fault_hold (map=0xfe007f7e3188, 
vaddr=34366988288, fault_type=1 '\001', fault_flags=Variable "fault_flags" is 
not available.
)
at vm_pager.h:128
#11 0x80b41cc3 in vm_fault (map=0xfe007f7e3188, vaddr=34366988288, 
fault_type=Variable "fault_type" is not available.
)
at /usr/src/sys/vm/vm_fault.c:229
#12 0x80bd84b4 in trap_pfault (frame=0xff86962fcc40, usermode=1) at 
/usr/src/sys/amd64/amd64/trap.c:740
#13 0x80bd8c6c in trap (frame=0xff86962fcc40) at 
/usr/src/sys/amd64/amd64/trap.c:358
#14 0x80bc315f in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:228
#15 0x000802091386 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 



Dump header from device /dev/mirror/

Panic : bad pte

2013-03-19 Thread David Demelier
Hello,

There it is: all my computers running FreeBSD 9.1-RELEASE have panicked. I can
only say there is a problem in 9.1-RELEASE, because I had no panics
before. What worries me is that my production server also panicked a
few days ago; fortunately it does not happen often.

This is a panic that happened on my desktop computer, which has a graphics
card. The crash usually appears when X starts.

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: bad pte
cpuid = 3
KDB: stack backtrace:
Uptime: 2m31s
Dumping 183 out of 1950 MB:..9%..18%..27%..35%..44%..53%..62%..79%..88%..96%

Reading symbols from /boot/modules/nvidia.ko...done.
Loaded symbols for /boot/modules/nvidia.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0x0004 in ?? ()
#2  0x8048c156 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:448
#3  0x8048c619 in panic (fmt=0x1 )
at /usr/src/sys/kern/kern_shutdown.c:636
#4  0x8065f88a in pmap_remove_pages (pmap=0xfe0005a2fa60)
at /usr/src/sys/amd64/amd64/pmap.c:4156
#5  0x8063d26b in vmspace_exit (td=0xfe0005a05470) at
/usr/src/sys/vm/vm_map.c:422
#6  0x8045d725 in exit1 (td=0xfe0005a05470, rv=Variable
"rv" is not available.
) at /usr/src/sys/kern/kern_exit.c:315
#7  0x8045e5ce in sys_sys_exit (td=Variable "td" is not available.
) at /usr/src/sys/kern/kern_exit.c:122
#8  0x8066737f in amd64_syscall (td=0xfe0005a05470,
traced=0) at subr_syscall.c:135
#9  0x80652d97 in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:387
#10 0x000800d51c1c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Of course I may be doing something wrong, and I hope so, but unfortunately I
can't find a solution. Could the nvidia driver be the problem?

Kind regards

--
Demelier David


Re: Core Dump / panic sleeping thread

2013-03-19 Thread Jeremy Chadwick
On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> Hi, 
> 
> I am running a FreeBSD 9.1-REL system with GENERIC kernel:
> FreeBSD x 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Fri Jan  4 12:28:48 CET 
> 2013 root@x:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> 
> It is crashing a couple of times per week, without any real pattern. There 
> are no hints in the syslog, and I only have the core debug to work from...  
> 
> It is a webserver, using a NFS mounted docroot (if it might help) - here's 
> the backtrace:
> 
> 
> [snip]

Re: Panic : bad pte

2013-03-19 Thread Jeremy Chadwick
On Tue, Mar 19, 2013 at 06:34:24PM +0100, David Demelier wrote:
> Hello,
> 
> There it is: all my computers running FreeBSD 9.1-RELEASE have panicked. I can
> only say there is a problem in 9.1-RELEASE, because I had no panics
> before. What worries me is that my production server also panicked a
> few days ago; fortunately it does not happen often.
> 
> This is a panic that happened on my desktop computer, which has a graphics
> card. The crash usually appears when X starts.
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
> 
> Unread portion of the kernel message buffer:
> panic: bad pte
> cpuid = 3
> KDB: stack backtrace:
> Uptime: 2m31s
> Dumping 183 out of 1950 MB:..9%..18%..27%..35%..44%..53%..62%..79%..88%..96%
> 
> Reading symbols from /boot/modules/nvidia.ko...done.
> Loaded symbols for /boot/modules/nvidia.ko
> #0  doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> 224 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) bt
> #0  doadump (textdump=Variable "textdump" is not available.
> ) at pcpu.h:224
> #1  0x0004 in ?? ()
> #2  0x8048c156 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:448
> #3  0x8048c619 in panic (fmt=0x1 )
> at /usr/src/sys/kern/kern_shutdown.c:636
> #4  0x8065f88a in pmap_remove_pages (pmap=0xfe0005a2fa60)
> at /usr/src/sys/amd64/amd64/pmap.c:4156
> #5  0x8063d26b in vmspace_exit (td=0xfe0005a05470) at
> /usr/src/sys/vm/vm_map.c:422
> #6  0x8045d725 in exit1 (td=0xfe0005a05470, rv=Variable
> "rv" is not available.
> ) at /usr/src/sys/kern/kern_exit.c:315
> #7  0x8045e5ce in sys_sys_exit (td=Variable "td" is not available.
> ) at /usr/src/sys/kern/kern_exit.c:122
> #8  0x8066737f in amd64_syscall (td=0xfe0005a05470,
> traced=0) at subr_syscall.c:135
> #9  0x80652d97 in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:387
> #10 0x000800d51c1c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> 
> Of course I may be doing something wrong, and I hope so, but unfortunately I
> can't find a solution. Could the nvidia driver be the problem?

Interesting timing.  Semi-recently (February) src/sys/amd64/amd64/pmap.c
in 9.1-STABLE (not -RELEASE) was modified to increase the information
shown for this specific type of panic.  See revision 247079:

http://svnweb.freebsd.org/base/stable/9/sys/amd64/amd64/pmap.c?view=log

I've CC'd Konstantin Belousov (kib@), who should be able to help step
you through getting information out of the crash dump, to help track
down the root cause.

-- 
| Jeremy Chadwick                                   j...@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: Core Dump / panic sleeping thread

2013-03-19 Thread Michael Landin Hostbaek

On Mar 19, 2013, at 6:35 PM, Jeremy Chadwick wrote:

> On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> The kernel panic is happening in NFS-related code.  Rick Macklem (and/or
> John Baldwin) should be able to help with this; I've CC'd both here.

OK, thanks. 


> 
> You're going to need to provide the following details:
> 
> 1. Contents of /etc/rc.conf

sshd_enable="YES"
ntpdate_enable="YES"
ntpdate_hosts="xx.xx.xx.xx"
fsck_y_enable="YES"
named_enable="YES"
dumpdev="AUTO"
nfs_client_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
ifconfig_em0="inet xx.xx.xx.xx netmask 255.255.255.0 broadcast xx.xx.xx.xx"
defaultrouter="xx.xx.xx.xx"
hostname=""
cloned_interfaces="vlan"
ifconfig_vlan="inet xx.xx.xx.xx netmask 255.240.0.0 broadcast xx.xx.xx.xx vlan  vlandev em0"
apache22_enable="YES"
pureftpd_enable="YES"
revealcloud_enable=YES


> 2. Contents of /etc/sysctl.conf (if modified)

vm.pmap.shpgperproc=250

> 3. Contents of /etc/fstab

# Device                         Mountpoint          FStype     Options  Dump  Pass#
/dev/mirror/gm0s1a               /                   ufs        rw       1     1
/dev/mirror/gm0s1b               none                swap       sw       0     0
/dev/mirror/gm0s1d               /var                ufs        rw       2     2
/dev/mirror/gm0s1e               /logs               ufs        rw       2     2
/dev/mirror/gm0s1f               /extra              ufs        rw       2     2
/dev/mirror/gm0s1g               /usr                ufs        rw       2     2
proc                             /proc               procfs     rw       0     0
xx.xx.xx.xx:/zpool-000xxx/www    /mnt/www            nfs        rw       0     0
xx.xx.xx.xx:/zpool-000xxx/data   /mnt/data           nfs        rw,tcp   0     0
linproc                          /compat/linux/proc  linprocfs  rw       0     0


> 4. ifconfig -a

em0: flags=8843 metric 0 mtu 1500
	options=4219b
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xff00 broadcast xx.xx.xx.xx
	inet6 xx::a5ac%em0 prefixlen 64 scopeid 0x1
	nd6 options=29
	media: Ethernet autoselect (1000baseT )
	status: active
em1: flags=8c02 metric 0 mtu 1500
	options=4219b
	ether 00:25:90:79:a5:ad
	nd6 options=29
	media: Ethernet autoselect
	status: no carrier
lo0: flags=8049 metric 0 mtu 16384
	options=63
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
	inet 127.0.0.1 netmask 0xff00
	nd6 options=21
vlan: flags=8843 metric 0 mtu 1500
	options=103
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xfff0 broadcast xx.xx.xx.xx
	inet6 x:::5ac%vlan prefixlen 64 scopeid 0xc
	nd6 options=29
	media: Ethernet autoselect (1000baseT )
	status: active
	vlan:  parent interface: em0


> 5. OS used by the NFS server, and all configuration details pertaining
> to that system

This is a hosted service, so I do not have access to this - though I believe 
this is a ZFS fs.
Here's more info about the product: http://help.ovh.co.uk/Nas


> 
> You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> for whatever this is in base/stable/9 that are not in -RELEASE, but this
> is speculative on my part.

That is not a problem. I would simply like to confirm the issue, before 
upgrading. 


Thanks, 

/mich




Re: Core Dump / panic sleeping thread

2013-03-19 Thread Andriy Gapon
on 19/03/2013 19:35 Jeremy Chadwick said the following:
> On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
[snip]
>> Unread portion of the kernel message buffer:
>> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
>> KDB: stack backtrace of thread 100256:
>> #0 0x808f2d46 at mi_switch+0x186
>> #1 0x8092bb52 at sleepq_wait+0x42
>> #2 0x808f34d6 at _sleep+0x376
>> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
>> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
>> #5 0x8082102c at nfscl_loadattrcache+0x2cc
>> #6 0x80818d37 at nfs_getattr+0x287
>> #7 0x8098f1c0 at vn_stat+0xb0
>> #8 0x809869d9 at kern_statat_vnhook+0xf9
>> #9 0x80986b55 at kern_statat+0x15
>> #10 0x80986c1a at sys_lstat+0x2a
>> #11 0x80bd7ae6 at amd64_syscall+0x546
>> #12 0x80bc3447 at Xfast_syscall+0xf7
>> panic: sleeping thread
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0x809208a6 at kdb_backtrace+0x66
>> #1 0x808ea8be at panic+0x1ce
>> #2 0x8092ed22 at propagate_priority+0x1d2
>> #3 0x8092fa4e at turnstile_wait+0x1be
>> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
>> #5 0x80820fa4 at nfscl_loadattrcache+0x244
>> #6 0x8081758c at ncl_readrpc+0xac
>> #7 0x80824c45 at ncl_getpages+0x485
>> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
>> #9 0x80b3fc93 at vm_fault_hold+0x673
>> #10 0x80b41cc3 at vm_fault+0x73
>> #11 0x80bd84b4 at trap_pfault+0x124
>> #12 0x80bd8c6c at trap+0x49c
>> #13 0x80bc315f at calltrap+0x8
[snip]

I think that the regular mutex which is acquired via NFSLOCKNODE() in
nfscl_loadattrcache() can not be held across vnode_pager_setsize.
I am not sure though when vap->va_size != np->n_size case is triggered.
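
To make the rule concrete, here is a minimal userland sketch (not kernel
code) of what the first backtrace shows: a thread takes a non-sleepable
lock and then calls something that may sleep. All names (model_thread,
model_may_sleep, model_loadattr_buggy) are hypothetical stand-ins; the
bookkeeping only loosely mirrors what the kernel tracks, and the real
vnode_pager_setsize() sleeps only in some cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread bookkeeping: how many non-sleepable locks
 * (regular mutexes) this thread currently holds. */
struct model_thread {
	int  nonsleepable_held;
	bool slept_with_lock;	/* set once the rule has been broken */
};

static void model_mtx_lock(struct model_thread *td)
{
	td->nonsleepable_held++;
}

static void model_mtx_unlock(struct model_thread *td)
{
	assert(td->nonsleepable_held > 0);
	td->nonsleepable_held--;
}

/* Stand-in for any call that may sleep, e.g. waiting on a busy page. */
static void model_may_sleep(struct model_thread *td)
{
	if (td->nonsleepable_held > 0)
		td->slept_with_lock = true;
}

/* Buggy ordering: the resize runs while the node lock is still held.
 * Returns true if a sleep happened with a non-sleepable lock held. */
static bool model_loadattr_buggy(struct model_thread *td,
    long cached_size, long server_size)
{
	model_mtx_lock(td);
	if (server_size != cached_size)
		model_may_sleep(td);	/* models vnode_pager_setsize() */
	model_mtx_unlock(td);
	return td->slept_with_lock;
}
```

In this model the violation only shows up when the two sizes differ,
which would be consistent with the panic being intermittent.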

> You're going to need to provide the following details:
> 
> 1. Contents of /etc/rc.conf
> 2. Contents of /etc/sysctl.conf (if modified)
> 3. Contents of /etc/fstab
> 4. ifconfig -a
> 5. OS used by the NFS server, and all configuration details pertaining
> to that system
> 
> You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> for whatever this is in base/stable/9 that are not in -RELEASE, but this
> is speculative on my part.
> 
I do not see a need for any of these.

-- 
Andriy Gapon


Re: Core Dump / panic sleeping thread

2013-03-19 Thread Konstantin Belousov
On Tue, Mar 19, 2013 at 07:45:56PM +0200, Andriy Gapon wrote:
> on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> [snip]
> >> Unread portion of the kernel message buffer:
> >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100256:
> >> #0 0x808f2d46 at mi_switch+0x186
> >> #1 0x8092bb52 at sleepq_wait+0x42
> >> #2 0x808f34d6 at _sleep+0x376
> >> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
> >> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
> >> #5 0x8082102c at nfscl_loadattrcache+0x2cc
> >> #6 0x80818d37 at nfs_getattr+0x287
> >> #7 0x8098f1c0 at vn_stat+0xb0
> >> #8 0x809869d9 at kern_statat_vnhook+0xf9
> >> #9 0x80986b55 at kern_statat+0x15
> >> #10 0x80986c1a at sys_lstat+0x2a
> >> #11 0x80bd7ae6 at amd64_syscall+0x546
> >> #12 0x80bc3447 at Xfast_syscall+0xf7
> >> panic: sleeping thread
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> #0 0x809208a6 at kdb_backtrace+0x66
> >> #1 0x808ea8be at panic+0x1ce
> >> #2 0x8092ed22 at propagate_priority+0x1d2
> >> #3 0x8092fa4e at turnstile_wait+0x1be
> >> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
> >> #5 0x80820fa4 at nfscl_loadattrcache+0x244
> >> #6 0x8081758c at ncl_readrpc+0xac
> >> #7 0x80824c45 at ncl_getpages+0x485
> >> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
> >> #9 0x80b3fc93 at vm_fault_hold+0x673
> >> #10 0x80b41cc3 at vm_fault+0x73
> >> #11 0x80bd84b4 at trap_pfault+0x124
> >> #12 0x80bd8c6c at trap+0x49c
> >> #13 0x80bc315f at calltrap+0x8
> [snip]
> 
> I think that the regular mutex which is acquired via NFSLOCKNODE() in
> nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> I am not sure though when vap->va_size != np->n_size case is triggered.

When the file is modified on the server outside of the control of
the client? E.g., by direct access on the server, or from another
client.

The only possible solution is to move the vnode_pager_setsize() outside
the scope of the n_mtx. This is somewhat problematic because the nfsiod
threads never bother to lock the vnode, so the truncation of the vm
cache becomes racy. Still, this is probably the best cure.
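
As a rough userland illustration of moving the call outside the lock
(a sketch under assumptions, not the kernel change itself): the new size
is recorded while the lock is held, the lock is dropped, and only then is
the possibly-sleeping resize called with the recorded value. The names
model_node, model_loadattr and model_pager_setsize are hypothetical, and
a C11 atomic_flag stands in for n_mtx.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the NFS node; field names are illustrative. */
struct model_node {
	atomic_flag mtx;	/* plays the role of n_mtx */
	long size;		/* plays the role of n_size */
	long pager_size;	/* size last handed to the modelled pager */
	bool slept_locked;	/* set if the pager call ran with mtx held */
};

static void model_lock(struct model_node *np)
{
	while (atomic_flag_test_and_set(&np->mtx))
		;	/* spin: such a lock must never be held across a sleep */
}

static void model_unlock(struct model_node *np)
{
	atomic_flag_clear(&np->mtx);
}

/* Stand-in for vnode_pager_setsize(): may sleep, so mtx must not be held. */
static void model_pager_setsize(struct model_node *np, long newsize)
{
	/* Single-threaded check only: probe the flag, restore it if clear. */
	if (atomic_flag_test_and_set(&np->mtx))
		np->slept_locked = true;
	else
		atomic_flag_clear(&np->mtx);
	np->pager_size = newsize;
}

static void model_loadattr(struct model_node *np, long server_size)
{
	bool resize = false;
	long newsize = 0;

	model_lock(np);
	if (server_size != np->size) {
		np->size = server_size;
		newsize = server_size;	/* snapshot taken under the lock */
		resize = true;
	}
	model_unlock(np);

	if (resize)
		model_pager_setsize(np, newsize);	/* lock already dropped */
}
```

The sketch does nothing about the race described above: two concurrent
callers could still issue their resize calls in either order once the
lock has been released.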

Another issue I see there is that vnode_pager_setsize() call is only
performed for the VREG nodes. I believe that it is possible to cache
the pages for the directories as well.

Would you work out the patch?




Re: Core Dump / panic sleeping thread

2013-03-19 Thread Rick Macklem
Andriy Gapon wrote:
> on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek
> > wrote:
> [snip]
> >> Unread portion of the kernel message buffer:
> >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100256:
> >> #0 0x808f2d46 at mi_switch+0x186
> >> #1 0x8092bb52 at sleepq_wait+0x42
> >> #2 0x808f34d6 at _sleep+0x376
> >> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
> >> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
> >> #5 0x8082102c at nfscl_loadattrcache+0x2cc
> >> #6 0x80818d37 at nfs_getattr+0x287
> >> #7 0x8098f1c0 at vn_stat+0xb0
> >> #8 0x809869d9 at kern_statat_vnhook+0xf9
> >> #9 0x80986b55 at kern_statat+0x15
> >> #10 0x80986c1a at sys_lstat+0x2a
> >> #11 0x80bd7ae6 at amd64_syscall+0x546
> >> #12 0x80bc3447 at Xfast_syscall+0xf7
> >> panic: sleeping thread
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> #0 0x809208a6 at kdb_backtrace+0x66
> >> #1 0x808ea8be at panic+0x1ce
> >> #2 0x8092ed22 at propagate_priority+0x1d2
> >> #3 0x8092fa4e at turnstile_wait+0x1be
> >> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
> >> #5 0x80820fa4 at nfscl_loadattrcache+0x244
> >> #6 0x8081758c at ncl_readrpc+0xac
> >> #7 0x80824c45 at ncl_getpages+0x485
> >> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
> >> #9 0x80b3fc93 at vm_fault_hold+0x673
> >> #10 0x80b41cc3 at vm_fault+0x73
> >> #11 0x80bd84b4 at trap_pfault+0x124
> >> #12 0x80bd8c6c at trap+0x49c
> >> #13 0x80bc315f at calltrap+0x8
> [snip]
> 
> I think that the regular mutex which is acquired via NFSLOCKNODE() in
> nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> I am not sure though when vap->va_size != np->n_size case is
> triggered.
> 
Yep, I'd agree to that. The same bug is in the old NFS client and
the new NFS client cribbed the code from there.

I have attached a simple patch that unlocks the mutex for the
vnode_pager_setsize() call. Maybe you could test it?

Thanks for reporting this, rick
ps: Hopefully "patch" can apply this patch (there have been
recent changes to this file, so the line#s could be off).
It should be easy to do manually if not. The change is
in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.


> > You're going to need to provide the following details:
> >
> > 1. Contents of /etc/rc.conf
> > 2. Contents of /etc/sysctl.conf (if modified)
> > 3. Contents of /etc/fstab
> > 4. ifconfig -a
> > 5. OS used by the NFS server, and all configuration details
> > pertaining
> > to that system
> >
> > You may also be asked to upgrade to 9.1-STABLE, as there may be
> > fixes
> > for whatever this is in base/stable/9 that are not in -RELEASE, but
> > this
> > is speculative on my part.
> >
> I do not see a need for any of these.
> 
> --
> Andriy Gapon
--- fs/nfsclient/nfs_clport.c.savit	2013-03-19 18:37:33.0 -0400
+++ fs/nfsclient/nfs_clport.c	2013-03-19 18:44:21.0 -0400
@@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp, 
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
+			NFSUNLOCKNODE(np);
 			vnode_pager_setsize(vp, np->n_size);
+			NFSLOCKNODE(np);
 		} else {
 			np->n_size = vap->va_size;
 		}