I'd say this is more likely a VM bug than an NFS bug. I don't
think anything is being corrupted; I think it may just be a deadlock.
For this sort of problem you should be able to gdb the kernel live on
the system (without core'ing it) and then look at the stack backtrace
for the processes in question. You need a debug version of the kernel
binary that is running (kernel.debug):

    gdb -k kernel.debug /dev/mem
    proc 88194
    back
    proc 90317
    back

Also do your ps axl again and make sure there aren't any other processes
stuck in an odd state. Looking at the code, I don't see a possible deadlock
between those two blocking states (vmpfw and pgtblk).
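
If the ps axl output is long, something like the following rough sketch
could pick out the stuck processes. It assumes the column layout of the two
ps lines quoted below (UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME
COMMAND); the function name disk_wait_procs is made up for illustration.

```python
from collections import Counter

# Column layout assumed from the two ps axl lines quoted in this thread:
# UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TT TIME COMMAND
def disk_wait_procs(ps_lines):
    """Return (pid, wchan, command) for each process whose STAT starts with D."""
    stuck = []
    for line in ps_lines:
        # Split into at most 13 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 12)
        if len(fields) < 13 or not fields[1].isdigit():
            continue  # header line, or a row that does not fit the assumed layout
        pid, wchan, stat = int(fields[1]), fields[8], fields[9]
        if stat.startswith("D"):
            stuck.append((pid, wchan, fields[12].rstrip()))
    return stuck

# The two lines from the original report.
sample = [
    "33639 88194 1 0 -22 0 5856 340 vmpfw D qi- 0:01.34 emacs proxy.",
    " 2371 90317 1 3 -18 0 268 8 pgtblk D p4- 0:00.02 cat /usr/loc",
]
stuck = disk_wait_procs(sample)

# Group by wait channel to see which blocking states are involved.
by_wchan = Counter(wchan for _, wchan, _ in stuck)
```

Fed the full ps axl output, any wait channel in by_wchan other than vmpfw
or pgtblk would be worth a backtrace as well.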
-Matt
Matthew Dillon
<[EMAIL PROTECTED]>
:This is only the second time ever this has happened, but it is still an
:interesting problem... I have a large number of "emacs" processes stuck in
:disk-wait. Here is the ps axl line for one such process:
:
:33639 88194 1 0 -22 0 5856 340 vmpfw D qi- 0:01.34 emacs proxy.
:
:Any attempt to access emacs on the client system would result in a type of
:hang for that process. Here is a 'cat /usr/local/bin/emacs >/dev/null':
:
:2371 90317 1 3 -18 0 268 8 pgtblk D p4- 0:00.02 cat /usr/loc
:
:To "fix" this I went to the NFS server and ran 'cp emacs emacs.new; rm emacs;
:mv emacs.new emacs', in essence forcing a new FH. The old procs still stick
:around.
:
:This leads me to believe the problem is entirely on the local system (ie,
:the kernel isn't asking for pages from the NFS server for that FH)
:
:Any ideas what could be corrupting the local cache (I am assuming that is the
:problem) like this? Nothing of note in the dmesg.
:
:--
:David Cross | email: [EMAIL PROTECTED]