On 03/22/10 10:52, Steve Polyack wrote:
On 3/19/2010 11:27 PM, Rick Macklem wrote:
On Fri, 19 Mar 2010, Steve Polyack wrote:
[good stuff snipped]
This makes sense. According to wireshark, the server is indeed
transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
instead; it sounds more correct than marking it a general IO error.
Also, the NFS server is serving its share off of a ZFS filesystem,
if it makes any difference. I suppose ZFS could be talking to the
NFS server threads with some mismatched language, but I doubt it.
Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
ESTALE when the file no longer exists, the NFS server passes along
whatever error VFS_FHTOVP() did return.
So, either VFS_FHTOVP() succeeds after the file has been deleted, which
would be a problem that needs to be fixed within ZFS
OR
ZFS returns an error other than ESTALE when the file doesn't exist.
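To make the second case concrete, here is a minimal sketch of how an errno from the filesystem could end up as the status seen in the wireshark trace. The status values are the ones defined for NFSv3 in RFC 1813; the helper errno_to_nfsstat3() is hypothetical and is not the FreeBSD server's mapping code. It assumes, as the trace suggests, that any error without a specific reply is sent as NFS3ERR_IO.

```c
#include <errno.h>

/* NFSv3 status codes from RFC 1813 (only the subset needed here). */
enum nfsstat3 {
	NFS3_OK       = 0,
	NFS3ERR_NOENT = 2,
	NFS3ERR_IO    = 5,
	NFS3ERR_STALE = 70
};

/*
 * Hypothetical helper, NOT the FreeBSD implementation: errors with an
 * obvious protocol status are translated, everything else is assumed
 * to collapse into the generic NFS3ERR_IO.
 */
static enum nfsstat3
errno_to_nfsstat3(int error)
{
	switch (error) {
	case 0:
		return (NFS3_OK);
	case ENOENT:
		return (NFS3ERR_NOENT);
	case ESTALE:
		return (NFS3ERR_STALE);
	default:
		/* e.g. an EINVAL from the filesystem ends up here */
		return (NFS3ERR_IO);
	}
}
```

Under these assumptions, errno_to_nfsstat3(EINVAL) yields NFS3ERR_IO while errno_to_nfsstat3(ESTALE) yields NFS3ERR_STALE, which is why the error returned by VFS_FHTOVP() decides what the client sees.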
Try the following patch on the server (which just makes any error
returned by VFS_FHTOVP() into ESTALE) and see if that helps.
--- nfsserver/nfs_srvsubs.c.sav 2010-03-19 22:06:43.000000000 -0400
+++ nfsserver/nfs_srvsubs.c 2010-03-19 22:07:22.000000000 -0400
@@ -1127,6 +1127,8 @@
 		}
 	}
 	error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
+	if (error != 0)
+		error = ESTALE;
 	vfs_unbusy(mp);
 	if (error)
 		goto out;
Please let me know if the patch helps, rick
The patch seems to fix the bad behavior. Running with the patch, I
see the following output from my patch (return code of nfs_doio from
within nfsiod):
nfssvc_iod: iod 0 nfs_doio returned errno: 70
Furthermore, when inspecting the transaction with Wireshark, after
deleting the file on the NFS server it looks like there is only a
single error. This time it is a reply to a V3 Lookup call that
contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
The client also does not repeatedly try to complete the failed request.
Any suggestions on the next step here? Based on what you said it
looks like ZFS is falsely reporting an IO error to VFS instead of
ESTALE / NOENT. I tried looking around zfs_fhtovp() and only saw
returns of EINVAL, but I'm not even sure I'm looking in the right place.
Further on down the rabbit hole... here's the piece in zfs_fhtovp()
where it's kicking out EINVAL instead of ESTALE - the following patch
corrects the behavior, but of course also suggests further digging
within the zfs_zget() function to ensure that _it_ is returning the
correct thing and whether or not it needs to be handled there or within
zfs_fhtovp().
--- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c 2010-03-22 11:41:21.000000000 -0400
+++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c 2010-03-22 16:25:21.000000000 -0400
@@ -1246,7 +1246,7 @@
 	dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
 	if (err = zfs_zget(zfsvfs, object, &zp)) {
 		ZFS_EXIT(zfsvfs);
-		return (err);
+		return (ESTALE);
 	}
 	zp_gen = zp->z_phys->zp_gen & gen_mask;
 	if (zp_gen == 0)
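For anyone following along, the idea behind the change can be tried outside the kernel with a toy model. Everything below (toy_obj, toy_fid, toy_zget(), toy_fhtovp()) is made up for illustration and is not ZFS code. The sketch assumes the object lookup fails with EINVAL for a deleted file, as observed above, and translates that failure at the file handle layer. The generation check is included only to show a second way a handle can go stale; what the real code returns in that case is not touched by the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; none of these are the real ZFS types or functions. */
struct toy_obj {
	uint64_t id;
	uint32_t gen;	/* generation, changes when an id is reused */
	int      live;	/* 0 once the file has been deleted */
};

struct toy_fid {
	uint64_t object;
	uint32_t gen;
};

/* Plays the role of zfs_zget(): EINVAL when the object is gone. */
static int
toy_zget(struct toy_obj *tbl, size_t n, uint64_t object,
    struct toy_obj **out)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].id == object && tbl[i].live) {
			*out = &tbl[i];
			return (0);
		}
	}
	return (EINVAL);
}

/* Plays the role of zfs_fhtovp(): a failed lookup means a stale handle. */
static int
toy_fhtovp(struct toy_obj *tbl, size_t n, const struct toy_fid *fid,
    struct toy_obj **out)
{
	if (toy_zget(tbl, n, fid->object, out) != 0)
		return (ESTALE);	/* instead of passing EINVAL through */
	if ((*out)->gen != fid->gen)
		return (ESTALE);	/* assumption: id reused by a newer file */
	return (0);
}
```

Doing the translation in the handle layer keeps the lookup routine's own error codes unchanged for its other callers, which is one argument for fixing this in zfs_fhtovp() rather than in zfs_zget().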