Hello,

I've lately been trying to develop a solution for a problem with open() that manifests itself as an ESTALE error in the following situation:
1. NFS server: echo "1111" > file01
2. NFS client: cat file01
3. NFS server: echo "2222" > file02 && mv file02 file01
4. NFS client: cat file01 (either old file01 contents or ESTALE)

My study shows that the problem actually appears to be in VOP_ACCESS(), which is called from vn_open(). If nfs_access() decides to "go to the wire" in step 4, it uses a cached file handle, which is indeed stale. Thus open() eventually fails with ESTALE too (the ESTALE comes from the underlying nfs_request()).

I understand all the fundamental NFS-related integrity problems, but not this one :) That is, I see no reason for open() to fail to open a file for reading or writing when the system knows the problem is its own. Why not just do another lookup and try to obtain a valid file handle?

I was playing with different parts of the kernel while "fixing" this for myself. However, I believe the simplest patch would be against vfs_syscalls.c:open() (I've also made a working patch against vn_open(), though).

Could anyone please be so kind as to comment on this issue?

TIA

--- kern/vfs_syscalls.c.orig	Thu Jun 19 13:22:50 2003
+++ kern/vfs_syscalls.c	Thu Jun 19 13:29:11 2003
@@ -1008,6 +1008,7 @@
 	int type, indx, error;
 	struct flock lf;
 	struct nameidata nd;
+	int stale = 0;
 
 	oflags = SCARG(uap, flags);
 	if ((oflags & O_ACCMODE) == O_ACCMODE)
@@ -1025,8 +1026,15 @@
 	 * the descriptor while we are blocked in vn_open()
 	 */
 	fhold(fp);
+again:
 	error = vn_open(&nd, flags, cmode);
 	if (error) {
+		/*
+		 * if the underlying filesystem returns ESTALE
+		 * we must have used a cached file handle.
+		 */
+		if (error == ESTALE && stale++ == 0)
+			goto again;
 		/*
 		 * release our own reference
 		 */
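
For illustration only, the retry-once control flow used in the patch can also be sketched in userland C. This is not part of the patch: retry_open_once(), open_fn, plain_open() and the two fake openers are hypothetical names introduced here, and the fakes merely simulate a filesystem that returns ESTALE. Whether a second open() issued from userland really forces a fresh lookup depends on the NFS client, so the sketch shows the control flow and nothing more.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical opener type, so the retry logic can be exercised with fakes. */
typedef int (*open_fn)(const char *path, int flags, mode_t mode);

/* Thin wrapper: open(2) is variadic and cannot be used as an open_fn directly. */
static int
plain_open(const char *path, int flags, mode_t mode)
{
	return open(path, flags, mode);
}

/*
 * Call do_open(); if it fails with ESTALE, retry exactly once.
 * This mirrors the "stale++ == 0" guard in the patch: a second
 * ESTALE is returned to the caller instead of looping forever.
 */
static int
retry_open_once(open_fn do_open, const char *path, int flags, mode_t mode)
{
	int stale = 0;
	int fd;

again:
	fd = do_open(path, flags, mode);
	if (fd == -1 && errno == ESTALE && stale++ == 0)
		goto again;
	return fd;
}

/* Counts calls made to either fake opener below (simulation only). */
static int fake_calls;

/* Fake: the first call fails with ESTALE, later calls "succeed" with fd 3. */
static int
fake_stale_once(const char *path, int flags, mode_t mode)
{
	(void)path; (void)flags; (void)mode;
	if (fake_calls++ == 0) {
		errno = ESTALE;
		return -1;
	}
	return 3;	/* pretend descriptor, never a real one */
}

/* Fake: every call fails with ESTALE. */
static int
fake_always_stale(const char *path, int flags, mode_t mode)
{
	(void)path; (void)flags; (void)mode;
	fake_calls++;
	errno = ESTALE;
	return -1;
}
```

With a real file, retry_open_once(plain_open, "file01", O_RDONLY, 0) behaves like open() except that a single ESTALE is retried. With fake_always_stale the opener is called twice and the ESTALE is then passed back to the caller.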