Just a quick update. I set the wait_for_leasetime_on_stop parameter on the
exportfs resource to false; now it no longer sleeps for 92 seconds and the
switchover is instantaneous. Now I just need to figure out how to disable
NFSv4 on the server side and I should be home free.
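For the archives, the change amounts to something like this (a sketch only;
the resource name exportfs-admin and the clientspec/directory values are
assumptions based on my earlier pastebin, so substitute your own):

```shell
# Sketch: define the exportfs primitive with the NFSv4 lease wait
# disabled, so the agent does not sleep for the lease time on stop.
# Resource name and params here are placeholders from my config.
crm configure primitive exportfs-admin ocf:heartbeat:exportfs \
    params clientspec="x.x.x.0/24" directory="/exports/admin" fsid="1" \
        wait_for_leasetime_on_stop="false" \
    op monitor interval="30s"
```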
Thanks.
Seth

On 04/16/2012 12:42 PM, Seth Galitzer wrote:
> I've been poking at this more over the weekend and this morning. And
> while your tip about rmtab was useful, it still didn't resolve the
> problem. I also made sure that my exports were only being
> handled/defined by pacemaker and not by /etc/exports. Though for the
> cloned nfsserver resource to work, it seems you need an /etc/exports
> file to exist on the server, even if it's empty.
>
> It seems the clue as to what's going on is in this line from the log:
>
> coronado exportfs[20325]: INFO: Sleeping 92 seconds to accommodate for
> NFSv4 lease expiry
>
> If I bump up the timeout for the exportfs resource to 95 sec, then after
> the very long timeout, it switches over correctly. So while this is a
> working solution to the problem, a 95 sec timeout is a little long for
> my personal comfort on a live and active fileserver. Any idea what is
> instigating this timeout? Is it exportfs (looks that way from the log
> entry), nfsd, or pacemaker? If pacemaker, then where can I reduce or
> remove this?
>
> I've been looking at disabling NFSv4 entirely on this server, as I don't
> really need it, but haven't found a solution that works yet. Tried the
> suggestion in this thread, but it seems to be for mounts, not nfsd, and
> still doesn't help:
> http://lists.debian.org/debian-user/2011/11/msg01585.html
>
> Though I have found that v4 is being loaded on one host but not the
> other. So if I can find what's different, I may be able to make that
> work.
>
> coronado:~# rpcinfo -u localhost nfs
> program 100003 version 2 ready and waiting
> program 100003 version 3 ready and waiting
> program 100003 version 4 ready and waiting
>
> cascadia:~# rpcinfo -u localhost nfs
> program 100003 version 2 ready and waiting
> program 100003 version 3 ready and waiting
>
> Any further suggestions are welcome. I'll keep poking until I find a
> solution.
>
> Thanks.
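On the question of disabling NFSv4 server-side on Debian, one approach that
is commonly suggested (untested here, and the variable name depends on the
release, so check which variables /etc/init.d/nfs-kernel-server actually
reads before relying on it):

```shell
# Sketch only: ask rpc.nfsd not to advertise NFS version 4. Newer
# Debian/Ubuntu init scripts read RPCNFSDOPTS from the defaults file;
# older ones may not, so verify in /etc/init.d/nfs-kernel-server first.
echo 'RPCNFSDOPTS="--no-nfs-version 4"' >> /etc/default/nfs-kernel-server
/etc/init.d/nfs-kernel-server restart

# Confirm version 4 no longer appears:
rpcinfo -u localhost nfs
```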
> Seth
>
> On 04/16/2012 11:49 AM, William Seligman wrote:
>> On 4/14/12 5:55 AM, emmanuel segura wrote:
>>> Maybe the problem is the primitive nfsserver lsb:nfs-kernel-server; I
>>> think this primitive was stopped before exportfs-admin
>>> ocf:heartbeat:exportfs
>>>
>>> And if I remember correctly, lsb:nfs-kernel-server and the exportfs
>>> agent do the same thing
>>>
>>> the first uses the OS scripts and the second the cluster agent
>>
>> Now that Emmanuel has reminded me, I'll offer two more tips based on
>> advice he's given me in the past:
>>
>> - You can deal with the issue he raises directly by putting additional
>> constraints in your setup, something like:
>>
>> colocation fs-homes-nfsserver inf: group-homes clone-nfsserver
>> order nfsserver-before-homes inf: clone-nfsserver group-homes
>>
>> That will make sure that all the group-homes resources (including
>> exportfs-admin) will not be run unless an instance of nfsserver is
>> already running on that node.
>>
>> - There's a more fundamental question: Why are you placing the
>> start/stop of your NFS server on both nodes under Pacemaker control?
>> Why not have the NFS server start at system startup on each node?
>>
>> The only reason I see for putting NFS under Pacemaker control is if
>> there are entries in your /etc/exports file (or the Debian equivalent)
>> that won't work unless other Pacemaker-controlled resources are
>> running, such as DRBD. If that's the case, you're better off
>> controlling them with Pacemaker exportfs resources, the same as you're
>> doing with exportfs-admin, instead of /etc/exports entries.
>>
>>> On 14 April 2012 01:50, William Seligman
>>> <[email protected]> wrote:
>>>
>>>> On 4/13/12 7:18 PM, William Seligman wrote:
>>>>> On 4/13/12 6:42 PM, Seth Galitzer wrote:
>>>>>> In attempting to build a nice clean config, I'm now in a state
>>>>>> where exportfs never starts. It always times out and errors.
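Those two constraint lines can be added from the crm shell in one shot; a
sketch (the resource names group-homes and clone-nfsserver come from the
example above and may differ in your CIB):

```shell
# Sketch: colocate the export group with the nfsserver clone, and make
# the clone start first, so exportfs-admin never runs on a node whose
# NFS server isn't up. Adjust resource names to match your config.
crm configure colocation fs-homes-nfsserver inf: group-homes clone-nfsserver
crm configure order nfsserver-before-homes inf: clone-nfsserver group-homes
```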
>>>>>>
>>>>>> crm config show is pasted here: http://pastebin.com/cKFFL0Xf
>>>>>> syslog after an attempted restart here: http://pastebin.com/CHdF21M4
>>>>>>
>>>>>> Only IPs have been edited.
>>>>>
>>>>> It's clear that your exportfs resource is timing out for the admin
>>>>> resource.
>>>>>
>>>>> I'm no expert, but here are some "stupid exportfs tricks" to try:
>>>>>
>>>>> - Check your /etc/exports file (or whatever the equivalent is in
>>>>> Debian; "man exportfs" will tell you) on both nodes. Make sure
>>>>> you're not already exporting the directory when the NFS server
>>>>> starts.
>>>>>
>>>>> - Take out the exportfs-admin resource. Then try doing things
>>>>> manually:
>>>>>
>>>>> # exportfs x.x.x.0/24:/exports/admin
>>>>>
>>>>> Assuming that works, then look at the output of just
>>>>>
>>>>> # exportfs
>>>>>
>>>>> The clientspec reported by exportfs has to match the clientspec you
>>>>> put into the resource exactly. If exportfs is canonicalizing or
>>>>> reporting the clientspec differently, the exportfs monitor won't
>>>>> work. If this is the case, change the clientspec parameter in
>>>>> exportfs-admin to match.
>>>>>
>>>>> If the output of exportfs has any results that span more than one
>>>>> line, then you've got the problem that the patch I referred you to
>>>>> (quoted below) is supposed to fix. You'll have to apply the patch
>>>>> to your exportfs resource.
>>>>
>>>> Wait a second; I completely forgot about this thread that I started:
>>>>
>>>> <http://www.gossamer-threads.com/lists/linuxha/users/78585>
>>>>
>>>> The solution turned out to be to remove the .rmtab files from the
>>>> directories I was exporting, deleting & touching /var/lib/nfs/rmtab
>>>> (you'll have to look up the Debian location), and adding
>>>> rmtab_backup="none" to all my exportfs resources.
>>>>
>>>> Hopefully there's a solution for you in there somewhere!
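The rmtab workaround above, sketched for Debian (the /var/lib/nfs/rmtab
path and the .rmtab backup file name are the usual defaults, and /exports
is assumed as the export root; verify all three on your systems):

```shell
# Sketch: clear stale client records so exportfs can stop cleanly.
# .rmtab is the exportfs agent's default rmtab_backup file name.
find /exports -maxdepth 2 -name '.rmtab' -delete
rm -f /var/lib/nfs/rmtab && touch /var/lib/nfs/rmtab

# Then stop the agent from recreating the backups, per resource, e.g.:
crm resource param exportfs-admin set rmtab_backup none
```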
>>>>
>>>>>> On 04/13/2012 01:51 PM, William Seligman wrote:
>>>>>>> On 4/13/12 12:38 PM, Seth Galitzer wrote:
>>>>>>>> I'm working through this howto doc:
>>>>>>>> http://www.linbit.com/fileadmin/tech-guides/ha-nfs.pdf
>>>>>>>> and am stuck at section 4.4. When I put the primary node in
>>>>>>>> standby, it seems that NFS never releases the export, so it
>>>>>>>> can't shut down, and thus can't get started on the secondary
>>>>>>>> node. Everything up to that point in the doc works fine and
>>>>>>>> fails over correctly. But once I add the exportfs resource, it
>>>>>>>> fails. I'm running this on Debian wheezy with the included
>>>>>>>> standard packages, not custom.
>>>>>>>>
>>>>>>>> Any suggestions? I'd be happy to post configs and logs if
>>>>>>>> requested.
>>>>>>>
>>>>>>> Yes, please post the output of "crm configure show", the output
>>>>>>> of "exportfs" while the resource is running properly, and the
>>>>>>> relevant sections of your log file. I suggest using pastebin.com,
>>>>>>> to keep mailboxes from filling up with walls of text.
>>>>>>>
>>>>>>> In case you haven't seen this thread already, you might want to
>>>>>>> take a look:
>>>>>>>
>>>>>>> <http://www.gossamer-threads.com/lists/linuxha/dev/77166>
>>>>>>>
>>>>>>> And the resulting commit:
>>>>>>> <https://github.com/ClusterLabs/resource-agents/commit/5b0bf96e77ed3c4e179c8b4c6a5ffd4709f8fdae>
>>>>>>>
>>>>>>> (Links courtesy of Lars Ellenberg.)
>>>>>>>
>>>>>>> The problem and patch discussed in those links don't quite match
>>>>>>> what you describe. I mention it because I had to patch my
>>>>>>> exportfs resource (in /usr/lib/ocf/resource.d/heartbeat/exportfs
>>>>>>> on my RHEL systems) to get it to work properly in my setup.
>>
>>
>> _______________________________________________
>> Linux-HA mailing list
>> [email protected]
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>> See also: http://linux-ha.org/ReportingProblems
>
> --
> Seth Galitzer
> Systems Coordinator
> Computing and Information Sciences
> Kansas State University
> http://www.cis.ksu.edu/~sgsax
> [email protected]
> 785-532-7790
