On Tue, 29 Apr 2014, Philip Guenther wrote:
> On Tue, Apr 29, 2014 at 8:17 AM, Donovan Watteau <tso...@gmail.com> wrote:
> > I have various mountpoints from a NetApp NFS server which I use on
> > OpenBSD/amd64 5.5.
> >
> > $ grep nfs /etc/fstab
> >    server:/vol/foobar /vol/foobar nfs noauto,rw,nodev,nosuid,noatime,noexec,nfsv3,tcp,soft,intr,noac,-x=300,-t=1000,acregmin=3,acregmax=5,-r=65536,-w=65536 0 0
> >    (and some other mountpoints with the same options)
> 
> That's a lot of knob turning.  What documentation or testing led to
> you adding the tcp, noac, ac*, -x, -t, -r, and -w options?

Indeed, I don't like turning knobs either, but this problem still
appears with a much simpler fstab (see below).

My documentation is mount_nfs(8), and "Managing NFS and NIS"
(recommended by books.html).

Basically:
* tcp: better suited to our use case, with a noticeable speed improvement
  and better reliability for the files we need to go through NFS.
* noac: a leftover, but removing it doesn't fix the problem.
* ac* (acregmin/acregmax): required for our use case.
* -x/-t: we needed a faster timeout/retry rate, but it may be too high.
* -r/-w: gave a noticeable speed improvement for the typical size of
  the files going through NFS.

But anyway, this much simpler configuration on a clean installation
still exposes the problem:

$ grep nfs /etc/fstab
    server:/vol/foobar /vol/foobar nfs noauto,rw,tcp 0 0
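To compare configurations, it helps to total the "receive error 54"
kernel messages, including those folded into syslog's "last message
repeated N times" lines. A rough sketch follows; the function name
count_nfs_resets is made up for illustration, and /var/log/messages
as the log location is an assumption about the local syslog setup.

```shell
# Hypothetical helper: count "receive error 54" NFS messages read from stdin.
# Lines of the form "last message repeated N times" add N to the total, but
# only when they directly follow a matching NFS error line.
count_nfs_resets() {
    awk '
        /receive error 54 from nfs server/ { total++; prev = 1; next }
        /last message repeated [0-9]+ times/ {
            if (prev) {
                # The repeat count is the field right after "repeated".
                for (i = 1; i <= NF; i++)
                    if ($i == "repeated") total += $(i + 1)
            }
            next
        }
        { prev = 0 }
        END { print total + 0 }
    '
}
```

Usage, assuming the log path above:

    $ count_nfs_resets < /var/log/messages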

> > However, when I do a simple "ls /vol/foobar" after an hour without
> > anything else using this mountpoint, this appears in the logs:
> >
> >     Apr 29 13:53:46 puffy /bsd: receive error 54 from nfs server server:/vol/foobar
> >     Apr 29 13:53:48 puffy last message repeated 833 times
> >
> > $ grep 54 /usr/include/sys/errno.h
> > #define ECONNRESET      54              /* Connection reset by peer */
> 
> Is there an idle timeout on the server or flaky network (NAT?) between
> this client and the server?

There is no NAT, but there's a dedicated VLAN running on top of a LACP
trunk(4).

As for the server: I'm not aware of any idle timeout being set there,
but it's running Data ONTAP, whose documentation doesn't match
OpenBSD's...

> > ls(1) gets slowed down a bit, but works.  The next ls(1) invocations
> > work fine, unless I stop using the mountpoint for about an hour.
> > This also happens when security(8) is called during the night
> > (when /vol/foobar isn't used for hours).
> >
> > Is it harmless or is there a real problem?  Debian on the same machine
> > doesn't have this, but maybe OpenBSD is just a bit verbose about it?
> 
> TCP connection is being dropped for some reason and then it takes a
> moment to be reopened when you try to use it again.

Yes.  Is there something left to configure on the OpenBSD side to
prevent that (the problem doesn't show up on Debian running on the
same machine), or should I look for a problem on the NFS server or
Cisco side?
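One way to narrow this down might be to capture TCP resets on the NFS
connection while the mount sits idle and see which address they come
from. The sketch below only builds the capture filter; the function
name nfs_rst_filter is hypothetical, port 2049 is assumed to be the
NFS port in use, and trunk0 in the usage line is a guess at the
interface name.

```shell
# Hypothetical helper: print a pcap filter matching TCP segments with the
# RST bit set that are exchanged with the given NFS server.
# Byte 13 of the TCP header holds the flags; 0x04 is RST.
# Port 2049 is assumed to be the NFS port in use.
nfs_rst_filter() {
    printf 'host %s and tcp port 2049 and tcp[13] & 4 != 0' "$1"
}
```

Usage, as root on the client:

    # tcpdump -n -i trunk0 "$(nfs_rst_filter server)"

The source address of a captured RST hints at which side tears the
connection down, although a device in the path could also send resets
on behalf of either end, so this is a starting point rather than proof.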

Thank you.
