On Tue, Oct 30, 2012 at 6:15 PM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
> 29.10.2012 19:51, Bernardo Cabezas Serra wrote:
>> Hello,
>>
>> disclaimer: I posted this issue to the linux-ha list too a couple of
>> days ago. I'm sorry if this is not the correct list, and thanks if you
>> can give me a hint about which cluster stack I should use for ocfs2 by now.
>>
>> I'm trying to compile the whole stack for corosync + pacemaker + dlm + ocfs2
>> (with dlm_controld.pcmk), without the cman stack. I'm following the "From
>> source" Pacemaker guide.
>>
>> After some days trying to compile the correct combination of
>> sources/versions, I have had no success, and I'm not sure whether this
>> is possible at the moment.
>>
>> The first problem was that cluster removed support for dlm_controld with
>> the pacemaker stack. The last version with support was 3.0.17.
>> But this was done some years ago, and as far as I have been able to
>> understand, things are still broken.
>>
>> The most relevant info I found about this issue is in these threads from
>> Andrew Beekhof and Vladislav Bogdanov, which suggest compiling
>> dlm_controld from Cluster, applying some patches. They report it worked
>> (with some remaining issues):
>>
>> http://oss.clusterlabs.org/pipermail/pacemaker/2009-October/003064.html
>> http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg09959.html
>>
>> But the most recent post about this is a year old, and it seems that things
>> are still broken.
>> I haven't been able to compile it (lots of errors), so I'm
>> asking if this is the right way, because it seems that nobody else is
>> willing to use this...
>
> I still run that on two clusters with post-1.1.7 pacemaker, but not with
> 1.1.8. Just looked, it is fe859a7 (Apr 11, two weeks after 1.1.7) with
> two dozen cherry-picked patches. I should migrate them to
> corosync2/pacemaker-1.1.8 shortly.
>
> What errors exactly do you see? The Pacemaker APIs used there received some
> changes between 1.1.7 and 1.1.8.
> I have one more patch which I tried
> with pacemaker master Aug 22 (close enough to 1.1.8, but some APIs
> changed again after that point). That version did not work for me with
> corosync14 because of a bug fixed after that, and I decided to move to
> corosync2 right after that failure to be more upstream-compatible. I
> can't say if it will help you, but you may want to try. Should I post it?
>
> The main issue with the "pcmk" version of all those daemons is that the fs
> control daemons and dlm_controld require the ability to request fencing.
> Originally it was done by calling some high-level pacemaker APIs
> (crm_terminate_member_no_mainloop()) and that did not work in some
> circumstances.
> Andrew developed a brand-new stonith-ng subsystem which is used for
> fencing in the version of dlm_controld you talk about (with my patches
> on top of Andrew's patches).

crm_terminate_member_no_mainloop() should still work though (specifically because I knew the old controld daemons used it). You just need the (new) compatibility header and the result will be /very/ reliable - there is no crmd/pengine involvement anymore, you go straight to the fencing daemon.

> I suspect that ocfs_controld.pcmk (like gfs_controld.pcmk in 3.0.17)
> still uses that old way. If that is true, then it can't work reliably. I
> tried to port gfs_controld to use stonith-ng with corosync1/openais
> (included in the patch I talk about above), but I did not test it at all
> (although I just ported it to corosync2/dlm4 and it works in a testing
> setup, see my answer to David).
>
> Vladislav
>
>>
>> On the cluster page, they state that the DLM code has now been separated
>> from cluster:
>> https://fedorahosted.org/cluster/wiki/HomePage
>>
>> But this dlm project (which seems to have pcmk support) depends on
>> corosync 2.0, so it can't run with the latest pacemaker (1.1.8). (can it?)
>> http://git.fedorahosted.org/git/dlm.git
>>
>> Before spending more time on this, I wanted to ask for the right way
>> to do things.
>> So the questions are:
>>
>> (1) Is it by now possible to get an ocfs2 corosync + pacemaker cluster,
>> without cman, and dlm_controld with the pcmk stack? (if yes, which
>> repos/versions?)
>> (2) What is the future roadmap for this? Will future corosync 2.0
>> clusters have the dlm issues addressed?
>>
>> Also, I have read (also in Andrew's post) that an OCFS2 cluster could have
>> problems on top of corosync 2.0, as OCFS2 hasn't been ported (GFS2 was
>> ported).
>> http://www.gossamer-threads.com/lists/linuxha/pacemaker/78538
>> so:
>> (3) Is GFS2 a better future option in terms of support, for linux-ha
>> clusters?
>>
>>
>> More details about pcmk dlm_controld:
>> I found that Suse has always maintained a cman-free cluster stack,
>> so I have tried to find dlm in its packages.
>> Found:
>> http://rpmfind.net//linux/RPM/opensuse/factory/x86_64/libdlm-3.00.01-24.5.x86_64.html
>>
>> But I have also had lots of compilation problems, trying several
>> pacemaker versions, including the suse-patched ones. I haven't been able to
>> successfully compile a dlm_controld.
>>
>> Thanks and Regards,
>> Bernardo
>> --
>> APSL
>> *Bernardo Cabezas Serra*
>> *Responsable Sistemas*
>> Ada Byron, edificio NTIC 2ºA
>> 07121 ParcBit
>> Mail: bcabe...@apsl.net
>> Skype: bernat.cabezas
>> Tel: 971439771
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org