I've been hunting around for a way to safely detect hung and stale NFS
mounts, and can't find anything that works in all cases.  I'm using
cfengine versions 3.0.4 and 3.0.5.

Here's what I *can* do:

* detect unmounted filesystems, and mount them
* detect *stale* filesystems, umount and remount them

Both of these can be done via storage: actions, or more manual
commands: actions.

What I can't do is safely detect the case of a truly hung NFS mount.
By "hung," I refer to an NFS mount where the server disappeared, and the
clients still have shares mounted.  This is distinguished from a simply
"stale" mount where the server may have rebooted, but has come back
online and can service NFS requests again.

What I want to avoid is a fork-bomb type problem where cf-agent
continually starts copies of small programs to check for a hung mount
(like 'df' or 'mount').

I'm running this on Linux boxes, so I can (and do) check /proc/mounts as
a non-blocking way to see if something is mounted.  However, this won't
tell me if the mount is stale; to do that, I can check the return code
of 'df' (it returns 1 on a stale mount).
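
To make the non-blocking half concrete, here is a rough sketch of the
/proc/mounts check in Python.  The function name nfs_mounts is made up
for illustration, and it assumes NFS entries carry a filesystem type of
"nfs" or "nfs4" in the third field.  It only reads a text file, so it
should never touch the NFS server.

```python
def nfs_mounts(mounts_text):
    """Return the mount points of NFS filesystems found in
    /proc/mounts-style text.

    Each line has the form: device mountpoint fstype options dump pass.
    Assumption: NFS mounts show up with fstype "nfs" or "nfs4".
    Mount points containing spaces appear escaped (e.g. \\040) and are
    returned as-is, without unescaping.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines; keep only NFS entries.
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points
```

On a Linux box this would be called as
nfs_mounts(open("/proc/mounts").read()).  It says whether something is
mounted, but nothing about whether the mount is stale or hung.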

Unfortunately, in the case of a *hung* mount, 'df' never returns...
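
The obvious workaround is to wrap the check in a timeout, sketched
below in Python.  This is only an illustration under assumptions, not a
tested solution: the name probe and the 5-second default are made up,
and treating any non-zero exit as "stale" rests on the 'df' behaviour
described above.  The parent deliberately does not wait for the child
after killing it, because a child blocked on a hung mount may not exit
until the server comes back, and waiting would hang the parent too.

```python
import subprocess


def probe(cmd, timeout=5.0):
    """Run cmd and classify the result as 'ok', 'stale', or 'hung'.

    'hung' means the command did not finish within `timeout` seconds.
    Assumption: a non-zero exit code indicates a stale mount.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck in uninterruptible I/O may
        # ignore the signal until the NFS server responds again.
        proc.kill()
        return "hung"
    return "ok" if rc == 0 else "stale"
```

Usage would look like probe(["df", "/mnt/data"]), where /mnt/data is a
placeholder mount point.  Note that this does not solve the fork-bomb
worry on its own: every run against a hung mount can leave one more
stuck child behind, so something would still have to stop new probes
from starting while an old one is outstanding.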

Anyone have any clever ideas or suggestions?

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)
_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine
