Rick, 

Thanks for the comments.  I'm running a small "home lab" environment, so the 
ESXi client is the only one I'm concerned with right now.  I'll keep using the 
ReclaimComplete patch as is.  Definitely had problems with the NFS server 
rebooting before I applied the other commits, but that all seems to work fine 
now.  

If it helps, I'm not seeing any "OpenDownGrade" calls in a quick experiment 
mounting and browsing a test share (attached).  

Thanks again,
Daniel


On Sun, Jul 8, 2018, at 7:10 PM, Rick Macklem wrote:
> Daniel Engel wrote:
> [stuff snipped]
> >I traced the commits that Rick has made since that thread and merged them 
> >from 'head' into 'stable':
> >
> >    'svnlite checkout http://svn.freebsd.org/base/release/11.1.0/'
> >    'svnlite merge -c 332790 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333508 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333579 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333580 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333592 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333645 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 333766 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 334396 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 334492 http://svn.freebsd.org/base/head'
> >    'svnlite merge -c 327674 http://svn.freebsd.org/base/head'
> Yes, you have all the commits to head related to the 4.1 server that might
> affect the ESXi client, plus a bunch that should be harmless, but I don't
> think affect the ESXi client mounts. (Most of these will get MFC'd to
> stable/11, but I haven't gotten around to it yet.)
> 
> The ones that might be in 6.7 (they were in 6.5) that may bite you are:
> - The client does an OpenDownGrade with all OPEN_SHARE_ACCESS and
>    OPEN_SHARE_DENY bits set for something it calls a "drive lock".
>   (Adding bits is supposed to be done via an Open/ClaimNull and not
>    OpenDowngrade.) I'd really like to know if this still happens for 6.7?
> - Something about "directory modified too often" when doing deletion of a
>   bunch of files. (I have no idea what this one means, but apparently it was
>   seen for other NFSv4.1 servers.)
> - Some warnings about "wrong reason for not issuing a delegation". I have a
>   fix for this one in PR#226650, but they are just warnings and don't seem to
>   matter much.
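
To make the OpenDowngrade point above concrete, here is a rough sketch in C of the rule being described: a downgrade may only drop share bits, never add them. It is a simplified model and not the FreeBSD server code. The model_ names are made up for illustration, the bit values are the OPEN4_SHARE_* values from RFC 5661, and the 4.1 "want delegation" flags are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Share bit values as defined for OPEN4_SHARE_* in RFC 5661. */
#define MODEL_SHARE_ACCESS_READ   0x1u
#define MODEL_SHARE_ACCESS_WRITE  0x2u
#define MODEL_SHARE_ACCESS_BOTH   0x3u
#define MODEL_SHARE_DENY_NONE     0x0u
#define MODEL_SHARE_DENY_BOTH     0x3u

/*
 * Hypothetical helper: returns true when (new_access, new_deny) only
 * removes bits from what the open currently holds. Simplification: the
 * RFC also requires the result to match the union of some subset of the
 * opens in effect, which this model does not track.
 */
bool
model_downgrade_ok(uint32_t cur_access, uint32_t cur_deny,
    uint32_t new_access, uint32_t new_deny)
{
	/* A downgrade must keep at least one access bit. */
	if ((new_access & MODEL_SHARE_ACCESS_BOTH) == 0)
		return (false);
	/* Any bit not already held would be an upgrade. */
	if ((new_access & ~cur_access) != 0)
		return (false);
	if ((new_deny & ~cur_deny) != 0)
		return (false);
	return (true);
}
```

Under this model, a file opened with READ access and no deny bits cannot be "downgraded" to all access and deny bits set, which is the request the "drive lock" case makes.
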
> 
> The rest of the really nasty stuff happens after a server reboot. The
> recovery code seemed to be badly broken in the 6.5 client. (All sorts of fun
> stuff like the client looping doing ExchangeID operations forever. VM
> crashes...)
> 
> >That completely fixed the connection instability, but the NFS share was 
> >still mounting read-only with a RECLAIM_COMPLETE error.  So, I manually 
> >applied the first patch from the previous thread and everything started 
> >working:
> >
> >    --- fs/nfsserver/nfs_nfsdserv.c.savrecl     2018-02-10 20:34:31.166445000 -0500
> >    +++ fs/nfsserver/nfs_nfsdserv.c     2018-02-10 20:36:07.947490000 -0500
> >    @@ -4226,10 +4226,9 @@ nfsrvd_reclaimcomplete(struct nfsrv_desc
> >            goto nfsmout;
> >        }
> >        NFSM_DISSECT(tl, uint32_t *, NFSX_UNSIGNED);
> >    +   nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
> >        if (*tl == newnfs_true)
> >    -           nd->nd_repstat = NFSERR_NOTSUPP;
> >    -   else
> >    -           nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
> >    +           nd->nd_repstat = 0;
> I think this patch is ok to use, since no other extant client does a
> ReclaimComplete with "one_fs == true". It does kinda violate the RFC.
> The problem is that FreeBSD exports a hierarchy of file systems and telling
> the server that one of them has been reclaimed is useless. (This hack just
> assumes the client meant to say "one_fs == false".)
> There was also a case (I think it was after a server reboot) where the
> client would do one of these after doing a ReclaimComplete with
> "one_fs == false" and that is definitely bogus (the server would reply
> NFS4ERR_COMPLETE_ALREADY without the above hack) since the
> "one_fs == false" operation means all file systems have been reclaimed.
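
For anyone following along, the effect of the patch can be sketched as a small standalone model in C. This is not the kernel code: struct model_client and the model_ functions are hypothetical stand-ins, and the assumption that nfsrv_checkreclaimcomplete() sets a per-client "done" flag and reports a repeat is a simplified reading of the real function. The status values are the NFS4ERR_* numbers from RFC 5661.

```c
#include <stdbool.h>

/* NFSv4.1 status codes (values from RFC 5661). */
#define MODEL_NFS4_OK                   0
#define MODEL_NFS4ERR_NOTSUPP           10004
#define MODEL_NFS4ERR_COMPLETE_ALREADY  10054

/* Hypothetical stand-in for the per-client state the server keeps. */
struct model_client {
	bool reclaim_done;
};

/* Assumed behavior of nfsrv_checkreclaimcomplete(): mark or report repeat. */
int
model_check_reclaim_complete(struct model_client *clp)
{
	if (clp->reclaim_done)
		return (MODEL_NFS4ERR_COMPLETE_ALREADY);
	clp->reclaim_done = true;
	return (MODEL_NFS4_OK);
}

/* Before the patch: one_fs == true is refused and no state changes. */
int
model_reclaim_unpatched(struct model_client *clp, bool one_fs)
{
	if (one_fs)
		return (MODEL_NFS4ERR_NOTSUPP);
	return (model_check_reclaim_complete(clp));
}

/* After the patch: always run the check, then force success for one_fs. */
int
model_reclaim_patched(struct model_client *clp, bool one_fs)
{
	int status;

	status = model_check_reclaim_complete(clp);
	if (one_fs)
		status = MODEL_NFS4_OK;
	return (status);
}
```

In this model the patched path treats "one_fs == true" as if the client had said "one_fs == false", and it also swallows the repeat error when a "one_fs == true" call follows an earlier "one_fs == false" call.
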
> 
> Anyhow, once I get some packet traces from Andreas for 6.7, I'll try and
> figure out how to handle at least some of the outstanding issues.
> 
> Good luck with it, rick
> 

Attachment: erebor-install-20180708-nfsd.pcap
Description: Binary data
