On Fri, 19 Mar 2010, John Baldwin wrote:

On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as an
NFS server to provide user home directories which get mounted across a
few machines (all 6.3-RELEASE).  For the past few weeks we have been
running into problems where one particular client will go into an
infinite loop where it is repeatedly trying to write data which causes
the NFS server to return "reply ok 40 write ERROR: Input/output error
PRE: POST:".  This retry loop can cause between 20mbps and 500mbps of

I'm afraid I don't quite understand what you mean by "causes the NFS
server to return 'reply ok 40 write ERROR...'". Is this something
logged by syslog (I can't find a printf like this in the kernel
sources) or is this something that tcpdump is giving you or ???

Why I ask is that it seems to say that the server is returning EIO
(or maybe 40 == EMSGSIZE).

The server should return ESTALE (NFSERR_STALE) after a file has
been deleted. If it is returning EIO, then that will cause the
client to keep trying to write the dirty block to the server.
(EIO is interpreted by the client as a "transient error".)
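To make the difference concrete, here is a rough sketch of the decision described above. The status values are the NFSv3 ones from RFC 1813; the names classify_write_reply(), settles() and the write_action enum are made up for illustration and are not taken from the FreeBSD client sources. Treating every other status as retryable is a simplification for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* NFSv3 status values from RFC 1813 (only the ones discussed here). */
enum nfs3_status {
    NFS3_OK       = 0,
    NFS3ERR_IO    = 5,   /* corresponds to EIO */
    NFS3ERR_STALE = 70   /* corresponds to ESTALE: file handle is gone */
};

/* Hypothetical outcome for a dirty block after one WRITE reply. */
enum write_action {
    WRITE_DONE,       /* write succeeded, block is clean */
    WRITE_RETRY,      /* error looks transient: keep block dirty, try again */
    WRITE_INVALIDATE  /* error is fatal: drop block, return error to the app */
};

enum write_action classify_write_reply(int status)
{
    switch (status) {
    case NFS3_OK:
        return WRITE_DONE;
    case NFS3ERR_STALE:
        return WRITE_INVALIDATE;
    case NFS3ERR_IO:
    default:
        /* Simplification: anything else is lumped in with EIO here. */
        return WRITE_RETRY;
    }
}

/*
 * Replay a sequence of server replies for one dirty block.
 * Returns true if the block was settled (written or invalidated);
 * *attempts is set to the number of WRITE attempts that were made.
 */
bool settles(const int *replies, size_t n, size_t *attempts)
{
    assert(replies != NULL && attempts != NULL);

    for (size_t i = 0; i < n; i++) {
        if (classify_write_reply(replies[i]) != WRITE_RETRY) {
            *attempts = i + 1;
            return true;
        }
    }
    *attempts = n;
    return false;   /* still dirty: the client would keep retrying */
}
```

With a server that answers NFS3ERR_IO every time, settles() never returns true, which is the retry loop in miniature. A single NFS3ERR_STALE reply ends it.
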

[good stuff snipped]

I have a feeling that using NFS in such a manner may simply be prone to
such problems, but what confuses me is why the NFS client system is
infinitely retrying the write operation and causing itself so much grief.

Yes, your feeling is correct.  This sort of race is inherent to NFS if you do
not use some sort of locking protocol to resolve the race.  The infinite
retries sound like a client-side issue.  Have you been able to try a newer OS
version on a client to see if it still causes the same behavior?

As John notes, having one client delete a file while another is trying
to write it is not a good thing.

However, the server should return ESTALE after the file is deleted and
that tells the client that the write can never succeed, so it marks the
buffer cache block invalid and returns the error to the app. (The app.
may not see it, if it doesn't check for error returns upon close as well
as write, but that's another story...)
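For what it's worth, here is a minimal sketch of what "checking on close as well as write" could look like in an application. The helper name write_file_checked() is hypothetical, and calling fsync() before close() is my own choice for the sketch, so that any deferred write error has a chance to show up while the descriptor is still open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and report the first
 * error seen. Returns 0 on success or a positive errno value on failure.
 */
int write_file_checked(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && (buf != NULL || len == 0));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return errno;

    const char *p = buf;
    size_t left = len;
    int err = 0;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, just try again */
            err = errno;
            break;
        }
        if (n == 0) {           /* no progress: give up instead of spinning */
            err = EIO;
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    /* write() may only have dirtied the cache; fsync() pushes it out now. */
    if (err == 0 && fsync(fd) < 0)
        err = errno;

    /* close() can also report a deferred write error, so check it too. */
    if (close(fd) < 0 && err == 0)
        err = errno;

    return err;
}
```

An app that only checks the return value of write() can miss an error like ESTALE that is reported later, at fsync() or close() time.
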

If you could look at a packet trace via wireshark when the problem
occurs, it would be nice to see what the server is returning. (If it
isn't ESTALE and the file no longer exists on the server, then that's
a server problem.) If it is returning ESTALE, then the client is busted.
(At a glance, the client code looks like it would handle ESTALE as a
fatal error for the buffer cache, but that doesn't mean it isn't broken,
just that it doesn't appear wrong. Also, it looks like mmap'd writes
won't recognize a fatal write error and will just keep trying to write
the dirty page back to the server. Take this with a big grain of salt,
since I just took a quick look at the sources. FreeBSD6->8 appear to
be pretty much the same as far as this goes, in the client.)

Please let us know if you can see the server's error reply code.

Good luck with it, rick
ps: If the server isn't returning ESTALE, you could try switching to
    the experimental nfs server and see if it exhibits the same behaviour.
    ("-e" option on both mountd and nfsd, assuming the server is
     FreeBSD8.)
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
