On 07/09/2009, at 6:24 AM, Richard Elling wrote:
On Sep 6, 2009, at 7:53 AM, Ross Walker wrote:
On Sun, Sep 6, 2009 at 9:15 AM, James Lever<j...@jamver.id.au> wrote:
I’m experiencing occasional slow responsiveness on an OpenSolaris b118
system, typically noticed when running an ‘ls’ (no extra flags, so no
directory service lookups). There is a delay of between 2 and 30 seconds,
but no correlation has been noticed between load on the server and the
slow return. This problem has only been noticed via NFS (v3; we are
migrating to NFSv4 once the O_EXCL/mtime bug fix has been integrated,
anticipated for snv_124). The problem has been observed both locally on
the primary filesystem, in a locally automounted reference (/home/foo),
and remotely via NFS.
I'm confused. If "This problem has only been noticed via NFS (v3)" then
how is it "observed locally"?
Sorry, I was meaning to say it had not been noticed using CIFS or iSCSI.
It has been observed in client:/home/user (NFSv3 automount from
server:/home/user, redirected to server:/zpool/home/user) and also in
server:/home/user (local automount) and server:/zpool/home/user
(origin).
iostat(1m) is the program for troubleshooting performance issues
related to latency. It will show the latency of nfs mounts as well as
other devices.
What specifically should I be looking for here (using ‘iostat -xen -T d’)?
I’m guessing I’ll require a high level of granularity (1s intervals) to
see the issue if it is a single disk or similar.
stat(2) doesn't write, so you can stop worrying about the slog.
My concern here was that I may have been trying to write (via other
concurrent processes) at the same time as there was a memory fault
from the ARC to L2ARC.
Rule out the network by looking at retransmissions and ioerrors
with netstat(1m) on both the client and server.
No errors or collisions from either server or clients observed.
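For reference, checks along these lines (a sketch; the exact grep pattern is an assumption, not from the thread) would surface the counters Richard mentions:

```shell
# Per-interface input/output error and collision counters
# (run on both the client and the server):
netstat -i

# TCP protocol statistics; compare the retransmission counters
# (e.g. tcpRetransSegs) against total output segments (tcpOutSegs):
netstat -s -P tcp | grep -i retrans
```

A retransmission rate that is a tiny fraction of tcpOutSegs, together with zero Ierrs/Oerrs, is a reasonable basis for ruling the network out.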
That behavior sounds a lot like a process has a memory leak and is
filling the VM. On Linux there is an OOM killer for these, but on
OpenSolaris, you're the OOM killer.
See rcapd(1m), rcapadm(1m), and rcapstat(1m), along with the
"Physical Memory Control Using the Resource Capping Daemon" chapter
in System Administration Guide: Solaris Containers-Resource
Management and Solaris Zones.
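As a rough sketch of how those tools fit together (the project name and the 4GB cap below are illustrative assumptions, not values from the thread):

```shell
# Enable the resource capping daemon so caps are enforced:
rcapadm -E

# Set a resident-set-size cap on a project, e.g. 4 GB for a
# hypothetical 'user.james' project:
projmod -s -K 'rcap.max-rss=4GB' user.james

# Watch cap enforcement and paging activity at 5-second intervals;
# sustained paging against the cap points at the leaking process:
rcapstat 5
```

This won't stop a leaker outright, but it keeps a runaway process from pushing the rest of the system (including the ARC) into memory pressure while you track it down.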
Thanks Richard, I’ll have a look at that today and see where I get.
cheers,
James
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss