05.11.2012 08:40, Andrew Beekhof wrote:
> On Fri, Nov 2, 2012 at 6:22 PM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>> 02.11.2012 02:05, Andrew Beekhof wrote:
>>> On Thu, Nov 1, 2012 at 5:09 PM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
>>>> 01.11.2012 02:47, Andrew Beekhof wrote:
>>>> ...
>>>>>>
>>>>>> One remark about that - it requires that gfs2 communicate with dlm in
>>>>>> kernel space, so gfs_controld is no longer required. I think Fedora 17
>>>>>> is the first version with that feature. And it is definitely not
>>>>>> available for EL6 (CentOS 6, which I use).
>>>>>>
>>>>>> But I have had preliminary success running GFS2 with corosync2 and
>>>>>> pacemaker 1.1.8 on EL6. dlm4 runs just fine as is (although it misses
>>>>>> some features on EL6 because of the kernel). And it still includes the
>>>>>> (undocumented) option enable_fscontrol, so user-space communication
>>>>>> with fs control daemons is supported. Even if that feature is removed
>>>>>> upstream, it can easily be brought back - just several lines of code.
>>>>>> And I ported gfs_controld from cman to corosync2 (the patch is still
>>>>>> very dirty, made with scissors and needle - just a proof of concept
>>>>>> that it can work at all). Some features are unsupported (e.g. nodir)
>>>>>> and will not be implemented by me.
>>>>>
>>>>> I'm impressed. What was the motivation though? You really really
>>>>> don't like CMAN? :-)
>>>>
>>>> Why should I like software which is going to die? ;)
>>>>
>>>> I believe that the way things are done currently (the third case from
>>>> your list) fully reflects my "perfectionist" needs. I had many problems
>>>> with cman+pacemaker in the past. The most critical is that pacemaker
>>>> and dlm_controld react differently when a node reappears very soon
>>>> after it was lost (because pacemaker uses totem(?) directly for
>>>> membership, but dlm uses CPG).
>>>
>>> We both get it from the CPG and quorum APIs for option 3.
>>
>> Yes, but not for 1 nor for 2.
>
> Not quite. We used to ignore it for option 2, but not anymore.
> Option 2 uses CPG for messaging.
>
>> I saw the described behavior with both of them, but not with 3.
>> That's why I decided to go with 3, which I think is conceptually right.
>>
>>>
>>>> Pacemaker accepts that, but controld freezes lockspaces, waiting for
>>>> fencing. But fencing is never done, because nobody handles the
>>>> "node lost" CPG event.
>>>
>>> WTF. Pacemaker should absolutely do this. Bug report?
>>
>> Sorry for being unclear.
>> I saw that with both 1 and 2 (where pacemaker did not use CPG), until I
>> "fixed" fencing at the dlm layer for 1. I modified it to request fencing
>> when a "node down" event occurs, and then did not see freezes anymore.
>> From what I understand, the "node down" CPG event occurs when corosync
>> forms a transitional membership (at least pacemaker logged lines about
>> that at the same time as the dlm freeze). And if a stable membership
>> occurs (milli-)seconds after the transitional one, pacemaker (as of
>> probably 1.1.6) did not fence the re-appeared node. I can understand
>> that - pacemaker can absolutely live with it. But dlm cannot.
>
> Right. Any sort of membership hiccup is fatal as far as the dlm is concerned.
> But even with options 1 and 2, it should still make a fencing request.
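(An aside before I answer, since enable_fscontrol came up above: on my EL6
builds it is just an undocumented boolean in dlm.conf. A minimal sketch of
what I set - assuming your dlm_controld still parses the option and uses
the usual key=value dlm.conf syntax, so verify against your build:

    # /etc/dlm/dlm.conf
    # Allow user-space fs control daemons (the ported gfs_controld) to
    # coordinate with dlm_controld, instead of relying on the in-kernel
    # gfs2/dlm integration that the EL6 kernel lacks.
    enable_fscontrol=1

That is all the ported gfs_controld needs from dlm_controld's side, as far
as I can tell.)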
I'm afraid not. At least not with 3.0.17 or 3.1.7. The sources are clear
about that - a CPG "node down" event does not result in dlm_controld
requesting fencing. And that was a major problem for me with options 1
and 2. A one-line patch solved it, though (sketched below). But I decided
that cman is a no-go for me anymore, because critical paths like proper
fencing should be tested thoroughly, and if they are not, then running it
feels like sitting on a bomb.

> Without fence_pcmk in cluster.conf that request might have gotten
> lost, but with 1.1.8 I would expect the node to be shot - regardless
> of whether the rest of Pacemaker thought it was ok.
> That's why going direct to stonithd was an important change.

Aha. I last tried cman before fence_pcmk was written (and before the
fencing call dlm_controld.pcmk uses was modified to go straight to
stonithd). I recall I was polishing option 1 at the time (after throwing
cman away), and the first revision of that change did not work because it
used an async libstonithd call to fence a node. That's why I used direct
calls to stonith in my version of dlm_controld.pcmk. All of that resulted
in a fully working stack, and I decided to go with option 3 only after
hearing from you that you no longer test pacemaker with corosync1
yourselves. That was the second major problem with option 1 - before all
those changes there was a possibility for a fencing request to be dropped
silently. And I actually hit that. I do not know whether it fully works
with stock 3.0.17 dlm_controld.pcmk (I suspect not, because of issue 1),
but with my builds it is stable.

Anyway, I seem to be happy with option 3 on EL6; it introduces a clean and
straightforward model of the cluster stack and it works perfectly, so I do
not see any reason to go back to option 1 or 2.

>
>> And it is its task to do proper fencing in case it cannot work, not
>> pacemaker's. But that piece was missing there. The same is (probably, I
>> may be damn wrong here) true for cman - I did a quick search for a CPG
>> "node down" handler in its sources but didn't find one. I suspect it was
>> handled by some deprecated daemon (e.g. groupd) in the past, but as of
>> 3.1.7 I did not observe handling for that.
>>
>> As I go with option 3, I should not see that anymore, even theoretically.
>>
>> So no bug report for what I won't use anymore :)
>>>
>>>> dlm does start fencing for "process lost", but
>>>> not for "node lost".
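To make the "one-line patch" concrete, here is roughly its shape - a
sketch against the stock libcpg confchg callback, not the actual
dlm_controld diff. kick_node_from_cluster() is the name of the fencing
entry point in dlm.git's daemon; whether the pcmk variant calls it the
same is an assumption on my part:

    #include <stddef.h>
    #include <corosync/cpg.h>

    /* Assumed fencing entry point (named after the one in dlm.git);
     * it asks the configured fencing layer to shoot the node. */
    extern int kick_node_from_cluster(int nodeid);

    /* Called by libcpg on every membership change.  The stock daemon
     * requests fencing only for CPG_REASON_PROCDOWN ("process lost");
     * the fix is to treat CPG_REASON_NODEDOWN ("node lost") the same. */
    static void confchg_cb(cpg_handle_t handle,
                           const struct cpg_name *group_name,
                           const struct cpg_address *member_list,
                           size_t member_list_entries,
                           const struct cpg_address *left_list,
                           size_t left_list_entries,
                           const struct cpg_address *joined_list,
                           size_t joined_list_entries)
    {
        size_t i;

        for (i = 0; i < left_list_entries; i++) {
            switch (left_list[i].reason) {
            case CPG_REASON_PROCDOWN:  /* daemon died, node still up */
            case CPG_REASON_NODEDOWN:  /* node lost - the missing case */
                kick_node_from_cluster(left_list[i].nodeid);
                break;
            default:                   /* clean leave: nothing to do */
                break;
            }
        }
    }

The other half, per the discussion above, is making sure that request
cannot be dropped silently - hence the direct, synchronous call into
stonithd rather than the async libstonithd path.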
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org