On Mon, Jun 28, 2010 at 12:30:30AM -0400, Rick Macklem wrote:
> 
> I can't explain the corruption, beyond the fact that "soft,intr" can
> cause all sorts of grief. If mounts without "soft,intr" still show
> corruption problems, try disabling delegations (either kill off the
> nfscbd daemons on the client or set vfs.newnfs.issue_delegations=0
> on the server). It is disabled by default because it is the "greenest"
> part of the subsystem.

I tried without soft,intr and "make buildworld" failed with what looks like
file corruption again.  I'm trying without delegations now.

> Make sure you don't have multiple entries for the same uid, such as "root"
> and "toor" both for uid 0 in your /etc/passwd. (ie. get rid of one of 
> them, if you have both)

Hmm, that's a strange requirement, since FreeBSD by default comes with
both.  That should probably be documented in the nfsv4 man page.
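
For anyone who wants to check a client or server for this, here is a
minimal sketch in Python.  The function name find_duplicate_uids is my own
invention (it is not part of any FreeBSD tool), and it only assumes the
usual colon-separated passwd layout with the uid in the third field:

```python
from collections import defaultdict

def find_duplicate_uids(passwd_text):
    """Return {uid: [names...]} for every uid listed more than once.

    Hypothetical helper: expects passwd(5)-style text, i.e. colon-separated
    lines with the login name in field 1 and the uid in field 3.
    """
    names_by_uid = defaultdict(list)
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment header FreeBSD puts in /etc/passwd.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # malformed entry, ignore it
        name, uid = fields[0], fields[2]
        names_by_uid[uid].append(name)
    # Keep only the uids that map to more than one login name.
    return {uid: names for uid, names in names_by_uid.items() if len(names) > 1}
```

Feeding it the contents of /etc/passwd on a stock install should report
uid 0 with both "root" and "toor", which is the case described above.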

> When you specify "nfs" for an NFSv3 mount, you get the regular client.
> When you specify "newnfs" for an NFSv3 mount, you get the experimental
> client. When you specify "nfsv4" you always get the experimental NFS
> client, and it doesn't matter which FStype you've specified.

Ok.  So my comparison was with the regular and experimental clients.

> If you are using UFS/FFS on the server, this should work and I don't know
> why the empty directories under /vol on the client confused it. If your
> server is using ZFS, everything from / including /vol need to be exported.

Nope, UFS2 only (on both clients and server).

> >     kernel: nfsv4 client/server protocol prob err=10020
> 
> This error indicates that there wasn't a valid FH for the server. I
> suspect that the mount failed. (It does a loop of Lookups from "/" in
> the kernel during the mount and it somehow got confused part way through.)

If the mount failed, why would it allow me to "ls /vol/a" and see both "b"
and "c" directories as well as other files/directories on /vol/?

> I don't know why these empty dirs would confuse it. I'll try a test
> here, but I suspect the real problem was that the mount failed and
> then happened to succeed after you deleted the empty dirs.

It doesn't seem likely.  I spent an hour mounting and unmounting and each
mount looked successful in that there were files and directories besides
the two I was trying to descend into.

> It still smells like some sort of transport/net interface/... issue
> is at the bottom of this. (see response to your next post)

It's possible.  I just had another NFSv4 client (with the same server) lock
up:

load: 0.00  cmd: ls 17410 [nfsv4lck] 641.87r 0.00u 0.00s 0% 1512k

and:

load: 0.00  cmd: make 87546 [wait] 37095.09r 0.01u 0.01s 0% 844k

That make has been hung for hours, and the ls(1) was executed during that
lockup.  I wish there were a way to unhang these processes and unmount
the NFS mount without panicking the kernel, but alas even this fails:

# umount -f /sw
load: 0.00  cmd: umount 17479 [nfsclumnt] 1.27r 0.00u 0.04s 0% 788k

A "shutdown -p now" resulted in a panic with the speaker beeping
constantly and no console output.

It's possible the NICs are all suspect, but all of this worked fine a
couple of days ago when I was only using NFSv3.

-- Rick C. Petty
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable