At 12:24 PM -0800 12/16/01, Matthew Dillon wrote:

>    program runs fine in an overnight test.  We still have a known issue
>    with out-of-order operations from nfsiod's that apparently may come
>    up after a week or so of testing.  I asked Jordan to try to track down
>    the NeXT guy who fixed that one in the old NFS stack.

This bug showed up recently here with fsx testing.  I seem to have fixed it
last week in MacOS X.  The diffs were widespread but the idea was simple
enough so a few code snippets should suffice:

nfs_request gets a new argument (u_int64_t *xidp) and fills it in here:

        m = nfsm_rpchead(cred, nmp->nm_flag, procnum, auth_type, auth_len,
             auth_str, verf_len, verf_str, mrest, mrest_len, &mheadend, &xid);
        if (xidp)
                *xidp = xid + ((u_int64_t)nfs_xidwrap << 32);

nfsm_rpchead bumps nfs_xidwrap when avoiding a zero xid.

Callers of nfs_request take the returned xid and pass it via the macros to
nfs_loadattrcache, from which the following code is snipped:

        if (*xidp < np->n_xid) {
                /*
                 * We have already updated attributes with a response from
                 * a later request.  The attributes we have here are probably
                 * stale so we drop them (just return).  However, our
                 * out-of-order receipt could be correct - if the requests were
                 * processed out of order at the server.  Given the uncertainty
                 * we invalidate our cached attributes.  *xidp is zeroed here
                 * to indicate the attributes were dropped - only getattr
                 * cares - it needs to retry the rpc.
                 */
                np->n_attrstamp = 0;
                *xidp = 0;
                return (0);
        }

Further down in nfs_loadattrcache:

        np->n_xid = *xidp;

Note xids are kept in a 64 bit form so relative comparison won't fail in
the unlikely case that xids wrap around zero.

Here's the change in nfs_getattr.  I don't expect to ever see the panic.

        avoidfloods = 0;
tryagain:
        nfsstats.rpccnt[NFSPROC_GETATTR]++;
        nfsm_reqhead(vp, NFSPROC_GETATTR, NFSX_FH(v3));
        nfsm_fhtom(vp, v3);
        nfsm_request(vp, NFSPROC_GETATTR, ap->a_p, ap->a_cred, &xid);
        if (!error) {
                nfsm_loadattr(vp, ap->a_vap, &xid);
                if (!xid) { /* out-of-order rpc - attributes were dropped */
                        m_freem(mrep);
                        if (avoidfloods++ < 100)
                                goto tryagain;
                        /*
                         * avoidfloods>1 is bizarre.  at 100 pull the plug
                         */
                        panic("nfs_getattr: getattr flood\n");
                }






--
Conrad Minshall, [EMAIL PROTECTED], 408 974-2749
Apple Computer, Mac OS X Core Operating Systems



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message

Reply via email to