On 07/01/10 15:23, Garrett Cooper wrote:
On Thu, Jul 1, 2010 at 11:51 AM, alan bryan<alan.br...@yahoo.com>  wrote:

--- On Thu, 7/1/10, Garrett Cooper<yanef...@gmail.com>  wrote:

From: Garrett Cooper<yanef...@gmail.com>
Subject: Re: NFS 75 second stall
To: "alan bryan"<alan.br...@yahoo.com>
Cc: freebsd-stable@freebsd.org
Date: Thursday, July 1, 2010, 11:13 AM
On Thu, Jul 1, 2010 at 11:01 AM, alan bryan<alan.br...@yahoo.com> wrote:
Setup:

server - FreeBSD 8-stable from today.  2 UFS dirs exported via NFS.
client - FreeBSD 8.0-Release.  Running a test php script that copies around various files to/from 2 separate NFS mounts.
Situation:

script is started (forked to do 20 simultaneous runs) and 20 1GB files are copied to the NFS dir which works fine.  When it then switches to reading those files back and simultaneously writing to the other NFS mount I see a hang of 75 seconds.  If I do an "ls -l" on the NFS mount it hangs too.  After 75 seconds the client has reported:

nfs server 192.168.10.133:/usr/local/export1: not responding
nfs server 192.168.10.133:/usr/local/export1: is alive again
nfs server 192.168.10.133:/usr/local/export1: not responding
nfs server 192.168.10.133:/usr/local/export1: is alive again

and then things start working again.  The server was originally FreeBSD 8.0-Release also but was upgraded to the latest stable to see if this issue could be avoided.
# nfsstat -s -W -w 1
  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
       0      0      0    222    257      0      0      0
       0      0      0    178    135      0      0      0
       0      0      0     85    127      0      0      0
       0      0      0      0      0      0      0      0
       0      0      0      0      0      0      0      0
       0      0      0      0      0      0      0      0
       0      0      0      0      0      0      0      0
       0      0      0      0      0      0      0      0
... for 75 rows of all zeros

       0      0      0    272    266      0      0      0
       0      0      0    167    165      0      0      0
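
The length of a stall like the one above can be measured from saved nfsstat output instead of counting rows by eye. The following is only a sketch: longest_zero_run is a hypothetical helper name, not something from the thread, and it assumes each sample sits on a single unwrapped line, as nfsstat prints it on a wide enough terminal.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): print the longest run of
# consecutive samples in which every counter is zero.
# Reads `nfsstat -s -W -w 1` style output on stdin, one sample per line.
longest_zero_run() {
    awk '
        # Skip blank lines and header lines (first field is not a number).
        NF == 0 || $1 !~ /^[0-9]+$/ { next }
        {
            allzero = 1
            for (i = 1; i <= NF; i++)
                if ($i != 0) { allzero = 0; break }
            if (allzero) { run++; if (run > max) max = run }
            else run = 0
        }
        # "max + 0" prints 0 when no all-zero sample was seen.
        END { print max + 0 }
    '
}
```

One way to use it, assuming the output was captured with something like "nfsstat -s -W -w 1 | tee nfsstat.log" and interrupted after the test, is "longest_zero_run < nfsstat.log". With a 1 second interval the number printed is roughly the stall length in seconds.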
I also tried runs with 15 simultaneous processes and 25.  15 processes gave only about a 5 second stall but 25 gave again the same 75 second stall.

Further, I tested with 2 mounts to the same server but from ZFS filesystems with the exact same stall/timeout periods.  So, it doesn't appear to matter what the underlying filesystem is - it's something in NFS or networking code.

Any ideas on what's going on here?  What's causing the complete stall period of zero NFS activity?  Any flaws with my testing methods?

Thanks for any and all help/ideas.
What network driver are you using? Have you tried tcpdumping the packets?
-Garrett

I'm using igb currently but have also used em.  I have not tried tcpdumping the packets yet on this test.  Any suggestions on things to look out for?  (I'm not that familiar with that whole process.)

Which brings up another point - I'm using TCP connections for NFS, not UDP.
Is the net.inet.tcp.tso sysctl enabled or not? What about rxcsum and txcsum?
Thanks,
-Garrett
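
For anyone following along, the settings asked about above can be read on the client without changing anything. The global TSO switch is shown by "sysctl net.inet.tcp.tso", and the per-interface offloads appear in the "options=" line of ifconfig output. The helper below is only a sketch: offload_flags is a hypothetical name, and igb0 in the usage note is an assumed interface name, not one given in the thread.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): read `ifconfig <iface>`
# output on stdin and report whether RXCSUM, TXCSUM and TSO4 appear in
# the interface's enabled "options=" list.
offload_flags() {
    awk '
        /^[ \t]*options=/ {
            # Keep only the comma-separated names between "<" and ">".
            s = $0
            sub(/^[^<]*</, "", s)
            sub(/>.*$/, "", s)
            n = split(s, names, ",")
            for (i = 1; i <= n; i++) enabled[names[i]] = 1
        }
        END {
            m = split("RXCSUM TXCSUM TSO4", want, " ")
            for (i = 1; i <= m; i++)
                printf "%s=%s\n", want[i], ((want[i] in enabled) ? "on" : "off")
        }
    '
}
```

Usage would be along the lines of "ifconfig igb0 | offload_flags". To test whether an offload is involved, ifconfig accepts -tso, -rxcsum and -txcsum to switch them off on an interface for the duration of a run.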

We're occasionally seeing these same types of stalls (+ repeated "is not responding" "is alive again" messages in quick succession). We're seeing it only on our 8.1-RELEASE systems against a variety of NFS servers (6.3-RELEASE, 7.2-RELEASE, and 8-STABLE from before the release of 8.1). We also see it happen with a variety of client hardware and network adapters (em, bce, bge); the only common denominator is 8.1-RELEASE on the clients.
