On Wed, 5 Jul 2006, Francisco Reyes wrote:

Can you trigger it using work on just one client against a server, without client<->client interactions? This makes tracking and reproduction a lot easier.

Personally I am experiencing two problems.
1- NFS clients freeze/hang if the server goes away.
We have clients with several mounts so if one of the servers dies then the entire operation of the client is put in jeopardy.

This I can reproduce every single time with a 6.X client.. with both a 5.X and a 6.X server.

"umount -f" hangs too.

The problems you are experiencing are almost certainly not related to rpc.lockd, but rather to bugs in the NFS client.

Let's just look at the normal use hang for now, and revisit umount -f after that.

Multi-client test cases are really tricky!

The second case only happens under heavy load, and restarting nfsd makes it go away. Basically the 'b' column in vmstat goes high and the performance of the machine falls to the floor.
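To put numbers on the "'b' column goes high" symptom, a captured vmstat log can be scanned for the blocked-process count. The following is only a rough sketch: the function name and the sample rows are made up for illustration, and it assumes the usual layout where the column-name row begins with "r b".

```python
def blocked_counts(vmstat_text):
    """Return the 'b' (blocked) value from each data row of vmstat output."""
    b_index = None
    counts = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name row starts with "r b"; vmstat may repeat it.
        if fields[:2] == ["r", "b"]:
            b_index = fields.index("b")
            continue
        # Skip the group-title row and anything that is not a data row.
        if b_index is None or not fields[0].isdigit():
            continue
        if len(fields) > b_index and fields[b_index].isdigit():
            counts.append(int(fields[b_index]))
    return counts


# Hypothetical sample rows, not taken from a real machine.
SAMPLE = """\
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre  flt  re  pi  po  fr  sr ad0 da0   in   sy  cs us sy id
 1 0 0  123456  78900   10   0   0   0  12   0   0   0  230  410 300  2  1 97
 0 14 0 123900  70100   12   0   0   0  15   0   3   1  250  390 310  1  3 96
"""

counts = blocked_counts(SAMPLE)
print("peak blocked:", max(counts))
```

Feeding it the output of something like "vmstat 5" saved to a file gives a series that can be lined up against the time nfsd was restarted.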

Going to try http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html

And reading up on how to debug with DDB. Have another user who volunteered to give me some pointers, so will try that so I am able to actually produce more helpful info.

If you can get into DDB when the hang has occurred, output via serial console for the following commands would be very helpful:

show pcpu
show allpcpu
ps
trace
traceall
show locks
show alllocks
show uma
show malloc
show lockedvnods

Note that 'show locks' and 'show alllocks' will only work if you compile WITNESS in -- WITNESS significantly changes kernel timing, so you may find it closes whatever race you're running into. If you can reproduce the problem with WITNESS and INVARIANTS, that would be very useful. The above output will hopefully tell us the basic state of the system with respect to processes, threads, locking, and so on, and may help us track things down. For the above, you definitely want a serial console as it will be quite a bit of output.
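Since the serial console capture will be long, it may help to split the saved log by command before posting it and to check that nothing from the list was missed. The sketch below assumes the log was saved as plain text and that each command appears on a line starting with the "db>" prompt; the function names and the placeholder log are illustrative only.

```python
# Commands requested above, in the order given.
REQUESTED = [
    "show pcpu", "show allpcpu", "ps", "trace", "traceall",
    "show locks", "show alllocks", "show uma", "show malloc",
    "show lockedvnods",
]


def split_ddb_log(log_text, prompt="db>"):
    """Map each DDB command to the output lines that followed it."""
    sections = {}
    current = None
    for line in log_text.splitlines():
        if line.startswith(prompt):
            command = line[len(prompt):].strip()
            # A bare prompt with no command ends the current section.
            current = command or None
            if current is not None:
                sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


def missing_commands(sections):
    """List requested commands that do not appear in the captured log."""
    return [cmd for cmd in REQUESTED if cmd not in sections]


# Placeholder capture; real DDB output would replace the bracketed lines.
SAMPLE_LOG = (
    "db> show pcpu\n<pcpu output line 1>\n<pcpu output line 2>\n"
    "db> ps\n<ps output line>\n"
    "db> \n"
)

sections = split_ddb_log(SAMPLE_LOG)
print("still to capture:", ", ".join(missing_commands(sections)))
```

Text printed before the first prompt (for example the panic or break message) is ignored by this sketch, so keep the raw log as well.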

Also, can you send the output of the 'mount' command from the un-hung state? I notice a lot of threads stuck in 'ufs'.

Finally, while doing the above, please disable background file system checking by placing the following in /etc/rc.conf:

  background_fsck="NO"

Also boot to single-user mode and run a full fsck -p before going multi-user, in order to make sure the file system is in a good state before beginning.

Robert N M Watson
Computer Laboratory
University of Cambridge