Re: [Pacemaker] Migration atomicity

2012-03-14 Thread Vladislav Bogdanov
15.03.2012 01:49, Andreas Kurz wrote: > On 03/14/2012 08:40 AM, Vladislav Bogdanov wrote: >> Hi, >> >> I'm observing a little bit unintuitive behavior of migration logic when >> transition is aborted (due to CIB change) in the middle of the resource >> migration. >> >> That is: >> 1. nodea: migrate

Re: [Pacemaker] Migration atomicity

2012-03-14 Thread Lars Marowsky-Bree
On 2012-03-14T23:49:26, Andreas Kurz wrote: > > Is the current behavior intended? > You mean that a migration is rolled-back due to a transition abort -- > depending on its progress? I think that is the defined (and intended) > behavior since quite a long time ... maybe Andrew likes to comment on

Re: [Pacemaker] Migration atomicity

2012-03-14 Thread Andreas Kurz
On 03/14/2012 08:40 AM, Vladislav Bogdanov wrote: > Hi, > > I'm observing a little bit unintuitive behavior of migration logic when > transition is aborted (due to CIB change) in the middle of the resource > migration. > > That is: > 1. nodea: migrate_to nodeb > 2. transition abort > 3. nodeb: st

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread Andreas Kurz
On 03/14/2012 01:05 PM, David Coulson wrote: > Are you running 'real' RHEL6? > > If so, cman + clvmd + gfs2 is the way to go. You can run pacemaker on > top of all of that (without openais) to manage your resources if you > don't want to use rgmanager. > > I've never tried to run clvmd out of pac

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Andreas Kurz
On 03/14/2012 05:55 PM, Phillip Frost wrote: > On Mar 14, 2012, at 12:33 PM, Florian Haas wrote: > >>> However, sometimes pacemakerd will not stop cleanly. >> >> OK. Whether this is related to your original problem or not is a complete >> open question, jftr. >> >>> I thought it might happen when sto

Re: [Pacemaker] Nodes unable to connect / find each other

2012-03-14 Thread Andreas Kurz
On 03/14/2012 08:22 PM, mark - pacemaker list wrote: > Hi, > > On Wed, Mar 14, 2012 at 1:43 PM, Regendoerp, Achim > <achim.regendo...@galacoral.com> > wrote: > > Hi, > > Below is a cut out from the tcpdump run on both boxes. The tcpdump > is the same on both

Re: [Pacemaker] unbound resource agent

2012-03-14 Thread Arnold Krille
On Wednesday 14 March 2012 17:52:21 Dejan Muhamedagic wrote: > On Wed, Mar 14, 2012 at 02:48:11PM +0100, Benjamin Kiessling wrote: > > Hi, > > > > On 2012.03.14 14:24:10 +0100, Dejan Muhamedagic wrote: > > > > dnsCache_start_0 (node=router1, call=56, rc=-2, status=Timed Out): > > > > unknown exec

Re: [Pacemaker] unbound resource agent

2012-03-14 Thread Benjamin Kiessling
Hiho, On 2012.03.14 17:52:21 +0100, Dejan Muhamedagic wrote: > This one exited with a generic error. Didn't notice that. The RA > should've logged the reason. Will fix that. > Negative exit codes are special and cannot be produced by a > script. Good to know. > Well, let's just say that it may

Re: [Pacemaker] Nodes unable to connect / find each other

2012-03-14 Thread mark - pacemaker list
Hi, On Wed, Mar 14, 2012 at 1:43 PM, Regendoerp, Achim <achim.regendo...@galacoral.com> wrote: > Hi, > > Below is a cut out from the tcpdump run on both boxes. The tcpdump is the > same on both boxes. > > The traffic only appears if I set the bindnetaddr in > /etc/corosync/cor

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Phillip Frost
On Mar 14, 2012, at 12:33 PM, Florian Haas wrote: >> However, sometimes pacemakerd will not stop cleanly. > > OK. Whether this is related to your original problem or not is a complete > open question, jftr. > >> I thought it might happen when stopping pacemaker on the current DC, but >> after succ

Re: [Pacemaker] unbound resource agent

2012-03-14 Thread Dejan Muhamedagic
On Wed, Mar 14, 2012 at 02:48:11PM +0100, Benjamin Kiessling wrote: > Hi, > > On 2012.03.14 14:24:10 +0100, Dejan Muhamedagic wrote: > > > dnsCache_start_0 (node=router1, call=56, rc=-2, status=Timed Out): > > > unknown exec error > > > dnsCache_monitor_1000 (node=router2, call=24, rc=1, status=c

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Florian Haas
On Wed, Mar 14, 2012 at 4:58 PM, Phillip Frost wrote: >> Can you confirm that you're running the ~bpo60+2 (note trailing "2") >> build, that you're actually running an lrmd binary from that version >> (meaning: that you properly killed your lrmd prior to installing that >> package), _and_ that "lr

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Phillip Frost
On Mar 14, 2012, at 9:45 AM, Florian Haas wrote: >>> The current cluster-glue package in squeeze-backports, >>> cluster-glue_1.0.9+hg2665-1~bpo60+2, has upstart disabled. >>> Double-check that you're running that version. If you do, and the >>> issue persists, please let us know. >> >> Indeed, tha

Re: [Pacemaker] unbound resource agent

2012-03-14 Thread Benjamin Kiessling
Hi, On 2012.03.14 14:24:10 +0100, Dejan Muhamedagic wrote: > > dnsCache_start_0 (node=router1, call=56, rc=-2, status=Timed Out): unknown > > exec error > > dnsCache_monitor_1000 (node=router2, call=24, rc=1, status=complete): > > unknown error > > dnsCache_start_0 (node=router2, call=81, rc=-2,

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Florian Haas
On Wed, Mar 14, 2012 at 2:37 PM, Phillip Frost wrote: > On Mar 14, 2012, at 9:25 AM, Florian Haas wrote: > >>> Do you have upstart at all? In that case, the debian package >>> shouldn't have the upstart enabled when building cluster-glue. >> >> The current cluster-glue package in squeeze-backports

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Phillip Frost
On Mar 14, 2012, at 9:25 AM, Florian Haas wrote: >> Do you have upstart at all? In that case, the debian package >> shouldn't have the upstart enabled when building cluster-glue. > > The current cluster-glue package in squeeze-backports, > cluster-glue_1.0.9+hg2665-1~bpo60+2, has upstart disabled

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Florian Haas
On Wed, Mar 14, 2012 at 2:16 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Mar 13, 2012 at 05:59:35PM -0400, Phillip Frost wrote: >> On Mar 13, 2012, at 2:21 PM, Jake Smith wrote: >> >> >> From: "Phillip Frost" >> >> Subject: [Pacemaker] getting started - crm hangs when adding resources,   >> >

Re: [Pacemaker] unbound resource agent

2012-03-14 Thread Dejan Muhamedagic
Hi, On Wed, Mar 14, 2012 at 01:37:39PM +0100, Benjamin Kiessling wrote: > Hi, > > I've written a resource agent for the unbound DNS server, based on the > named resource agent. It is available at [0]. Unfortunately there seems > to be a bug I can't figure out. The failcounters seem to increase ov

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread emmanuel segura
I'm running SUSE 11 SP1. On 14 March 2012 13:05, David Coulson wrote: > Are you running 'real' RHEL6? > > If so, cman + clvmd + gfs2 is the way to go. You can run pacemaker on top > of all of that (without openais) to manage your resources if you don't want > to use rgmanager. > > I'v

Re: [Pacemaker] getting started - crm hangs when adding resources, even "crm ra classes" hangs

2012-03-14 Thread Dejan Muhamedagic
Hi, On Tue, Mar 13, 2012 at 05:59:35PM -0400, Phillip Frost wrote: > On Mar 13, 2012, at 2:21 PM, Jake Smith wrote: > > >> From: "Phillip Frost" > >> Subject: [Pacemaker] getting started - crm hangs when adding resources, > >> even "crm ra classes" hangs > >> > >> more interestingly, even "

[Pacemaker] unbound resource agent

2012-03-14 Thread Benjamin Kiessling
Hi, I've written a resource agent for the unbound DNS server, based on the named resource agent. It is available at [0]. Unfortunately there seems to be a bug I can't figure out. The failcounters seem to increase over the course of several days until pacemaker refuses to start the resource anywher

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread David Coulson
Are you running 'real' RHEL6? If so, cman + clvmd + gfs2 is the way to go. You can run pacemaker on top of all of that (without openais) to manage your resources if you don't want to use rgmanager. I've never tried to run clvmd out of pacemaker, but there is an init.d script for it in RHEL6,

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread Mark Frasa
Hmm: ls -l /usr/lib/ocf/resource.d/lvm2/clvmd ls: cannot access /usr/lib/ocf/resource.d/lvm2/clvmd: No such file or directory [root@node0 ~]# yum whatprovides /usr/lib/ocf/resource.d/lvm2/clvmd Loaded plugins: product-id, rhnplugin, subscription-manager Updating certificate-based repositories. No

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread emmanuel segura
Verify if you have the resource: ls -l /usr/lib/ocf/resource.d/lvm2/clvmd On 14 March 2012 12:15, Mark Frasa wrote: > Hello, > > We are trying to build a 3-node cluster with shared-storage (FC-SAN). On > this cluster we need clustered-lvm2 to provide storage to our VMs. > Now we have

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread Mark Frasa
We are using corosync+pacemaker. On Wed, Mar 14, 2012 at 12:25 PM, emmanuel segura wrote: > Are you using cman+pacemaker or corosync+pacemaker? > > On 14 March 2012 12:15, Mark Frasa wrote: >> >> Hello, >> >> We are trying to build a 3-node cluster with shared-storage (FC-SAN). On >> t

Re: [Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread emmanuel segura
Are you using cman+pacemaker or corosync+pacemaker? On 14 March 2012 12:15, Mark Frasa wrote: > Hello, > > We are trying to build a 3-node cluster with shared-storage (FC-SAN). On > this cluster we need clustered-lvm2 to provide storage to our VMs. > Now we have seen a few tutorials

[Pacemaker] pacemaker + rhel6 and clvmd

2012-03-14 Thread Mark Frasa
Hello, We are trying to build a 3-node cluster with shared-storage (FC-SAN). On this cluster we need clustered-lvm2 to provide storage to our VMs. Now we have seen a few tutorials on how to set up clvmd and all of them are trying to include ocfs2 (which I guess we don't have on RHEL6): configure DL

[Pacemaker] [PATCH] pingd checks pidfile on start

2012-03-14 Thread Takatoshi MATSUO
Hi, I use pacemaker 1.0.11 and the pingd RA. Occasionally, pingd's first monitor fails after start. It seems that the main cause is that the pingd daemon returns 0 before creating the pidfile and the RA doesn't check for the pidfile on start. test script - while true; do killall pi

[Pacemaker] Migration atomicity

2012-03-14 Thread Vladislav Bogdanov
Hi, I'm observing a little bit unintuitive behavior of migration logic when transition is aborted (due to CIB change) in the middle of the resource migration. That is: 1. nodea: migrate_to nodeb 2. transition abort 3. nodeb: stop 4. nodea: migrate_to nodec 5. nodec: migrate_from nodea (note: no s