Re: [Linux-HA] problem with nfs and exportfs failover

William Seligman Mon, 16 Apr 2012 09:50:21 -0700

On 4/14/12 5:55 AM, emmanuel segura wrote:
> Maybe the problem is the primitive nfsserver lsb:nfs-kernel-server; I
> think this primitive was stopped before exportfs-admin
> ocf:heartbeat:exportfs.
> 
> And if I remember correctly, lsb:nfs-kernel-server and the exportfs agent
> do the same thing:
> 
> the first uses the OS scripts and the second the cluster agents.


Now that Emmanuel has reminded me, I'll offer two more tips based on advice he's
given me in the past:

- You can deal with the issue he raises directly by adding constraints to
your setup, something like:

colocation fs-homes-nfsserver inf: group-homes clone-nfsserver
order nfsserver-before-homes inf: clone-nfsserver group-homes

That will make sure that all the group-homes resources (including
exportfs-admin) will not be run unless an instance of nfsserver is already
running on that node.

- There's a more fundamental question: Why are you placing the start/stop of
your NFS server on both nodes under pacemaker control? Why not have the NFS
server start at system startup on each node?

The only reason I see for putting NFS under Pacemaker control is if there are
entries in your /etc/exports file (or the Debian equivalent) that won't work
unless other Pacemaker-controlled resources are running, such as DRBD. If that's
the case, you're better off controlling them with Pacemaker exportfs resources,
the same as you're doing with exportfs-admin, instead of /etc/exports entries.
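
If you go that route, a minimal sketch of one such resource might look like
the following. The resource name, directory, fsid, and clientspec are
placeholders for illustration, not values taken from the posted configuration;
the parameter names are those of the ocf:heartbeat:exportfs agent, and
rmtab_backup assumes your version of the agent supports it.

```
primitive exportfs-homes ocf:heartbeat:exportfs \
        params directory="/exports/homes" clientspec="x.x.x.0/24" \
               options="rw" fsid="2" rmtab_backup="none" \
        op monitor interval="30s"
```

The resource would then go into the group after the filesystem it exports, and
the corresponding line would come out of /etc/exports.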

> On 14 April 2012 at 01:50, William Seligman <
> [email protected]> wrote:
> 
>> On 4/13/12 7:18 PM, William Seligman wrote:
>>> On 4/13/12 6:42 PM, Seth Galitzer wrote:
>>>> In attempting to build a nice clean config, I'm now in a state where
>>>> exportfs never starts.  It always times out and errors.
>>>>
>>>> crm config show is pasted here: http://pastebin.com/cKFFL0Xf
>>>> syslog after an attempted restart here: http://pastebin.com/CHdF21M4
>>>>
>>>> Only IPs have been edited.
>>>
>>> It's clear that your exportfs resource is timing out for the admin
>>> resource.
>>>
>>> I'm no expert, but here are some "stupid exportfs tricks" to try:
>>>
>>> - Check your /etc/exports file (or whatever the equivalent is in Debian;
>>> "man exportfs" will tell you) on both nodes. Make sure you're not already
>>> exporting the directory when the NFS server starts.
>>>
>>> - Take out the exportfs-admin resource. Then try doing things manually:
>>>
>>> # exportfs x.x.x.0/24:/exports/admin
>>>
>>> Assuming that works, then look at the output of just
>>>
>>> # exportfs
>>>
>>> The clientspec reported by exportfs has to match the clientspec you put
>>> into the resource exactly. If exportfs is canonicalizing or reporting the
>>> clientspec differently, the exportfs monitor won't work. If this is the
>>> case, change the clientspec parameter in exportfs-admin to match.
>>> 
>>> If the output of exportfs has any results that span more than one line,
>>> then you've got the problem that the patch I referred you to (quoted
>>> below) is supposed to fix. You'll have to apply the patch to your
>>> exportfs resource.
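
The two exportfs checks above can be sketched as a small shell helper. This is
only an illustration, not what the resource agent itself does: the function
names are made up, and it assumes that when exportfs wraps an entry, the path
sits on one line and the clientspec on an indented line after it.

```shell
#!/bin/sh
# Join wrapped `exportfs` output so each export is one "path clientspec" line.
# Assumption: a continuation line starts with whitespace and holds only the
# clientspec for the path printed on the line before it.
normalize_exports() {
    awk '
        /^[ \t]/ { print path, $1; next }   # continuation: clientspec only
        NF >= 2  { print $1, $2; next }     # path and clientspec on one line
        NF == 1  { path = $1 }              # path alone: clientspec follows
    '
}

# Succeed only if the directory/clientspec pair appears exactly as given.
# $1 = directory, $2 = clientspec as configured in the resource
export_matches() {
    normalize_exports | grep -Fqx -- "$1 $2"
}
```

Typical use would be something like
`exportfs | export_matches /exports/admin x.x.x.0/24 && echo match`, with the
clientspec written exactly as it appears in the resource definition.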
>>
>> Wait a second; I completely forgot about this thread that I started:
>>
>> <http://www.gossamer-threads.com/lists/linuxha/users/78585>
>>
>> The solution turned out to be to remove the .rmtab files from the 
>> directories I was exporting, deleting & touching /var/lib/nfs/rmtab (you'll
>> have to look up the Debian location), and adding rmtab_backup="none" to all
>> my exportfs resources.
>>
>> Hopefully there's a solution for you in there somewhere!
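
The file cleanup described above could be scripted roughly as follows. This is
a sketch under assumptions: the function name is made up, the rmtab path is
passed in because it differs between distributions, and it is probably safest
to run while the exports are stopped. Setting rmtab_backup="none" is a
separate change to each exportfs resource in the cluster configuration.

```shell
#!/bin/sh
# Remove per-export .rmtab backups and reset the system rmtab file.
# $1 = system rmtab path (e.g. /var/lib/nfs/rmtab on the systems in this thread)
# remaining arguments = exported directories to clean
reset_rmtab() {
    rmtab=$1
    shift
    for dir in "$@"; do
        rm -f -- "$dir/.rmtab"          # drop the agent's backup copy
    done
    rm -f -- "$rmtab" && touch -- "$rmtab"   # delete, then recreate empty
}
```

For example: `reset_rmtab /var/lib/nfs/rmtab /exports/admin`.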
>>
>>>> On 04/13/2012 01:51 PM, William Seligman wrote:
>>>>> On 4/13/12 12:38 PM, Seth Galitzer wrote:
>>>>>> I'm working through this howto doc:
>>>>>> http://www.linbit.com/fileadmin/tech-guides/ha-nfs.pdf
>>>>>> and am stuck at section 4.4.  When I put the primary node in standby, it
>>>>>> seems that NFS never releases the export, so it can't shut down, and
>>>>>> thus can't get started on the secondary node.  Everything up to that
>>>>>> point in the doc works fine and fails over correctly.  But once I add
>>>>>> the exportfs resource, it fails.  I'm running this on debian wheezy with
>>>>>> the included standard packages, not custom.
>>>>>>
>>>>>> Any suggestions?  I'd be happy to post configs and logs if requested.
>>>>>
>>>>> Yes, please post the output of "crm configure show", the output of
>>>>> "exportfs" while the resource is running properly, and the relevant
>>>>> sections of your log file. I suggest using pastebin.com, to keep
>>>>> mailboxes from filling up with walls of text.
>>>>>
>>>>> In case you haven't seen this thread already, you might want to take a 
>>>>> look:
>>>>>
>>>>> <http://www.gossamer-threads.com/lists/linuxha/dev/77166>
>>>>>
>>>>> And the resulting commit:
>>>>> <https://github.com/ClusterLabs/resource-agents/commit/5b0bf96e77ed3c4e179c8b4c6a5ffd4709f8fdae>
>>>>>
>>>>> (Links courtesy of Lars Ellenberg.)
>>>>>
>>>>> The problem and patch discussed in those links don't quite match
>>>>> what you describe. I mention it because I had to patch my exportfs
>>>>> resource (in /usr/lib/ocf/resource.d/heartbeat/exportfs on my RHEL
>>>>> systems) to get it to work properly in my setup.

-- 
Bill Seligman             | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://[email protected]
PO Box 137                |
Irvington NY 10533 USA    | http://www.nevis.columbia.edu/~seligman/

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
