Re: [Pacemaker] IPaddr2 not failing-over

2010-09-05 Thread Andrew Beekhof
On Thu, Sep 2, 2010 at 11:44 PM, Ron Kerry wrote: > So the lrm is obviously issuing the cancel ... but why? Almost certainly the PE told it to. The logs should include some lines matching the regex: pengine:.*bz2 If you attach the files referenced by those lines we'll be able to figure out wh
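For anyone following the advice in this message, here is a minimal sketch of pulling those lines out of a log and listing the files they point to. Only the regex `pengine:.*bz2` comes from the message; the function name, the assumption that each file appears as an absolute path ending in `.bz2`, and the sample log lines are illustrative guesses, not real Pacemaker output.

```python
import re

# Pattern quoted in the message: a pengine log line that mentions a .bz2 file.
PENGINE_LINE = re.compile(r"pengine:.*bz2")

# Assumption (not from the message): the referenced file is written as an
# absolute path ending in ".bz2".
BZ2_PATH = re.compile(r"(/\S+?\.bz2)\b")


def referenced_files(log_lines):
    """Return the unique .bz2 paths named on matching pengine lines, in order."""
    found = []
    for line in log_lines:
        if not PENGINE_LINE.search(line):
            continue  # not a pengine line, or it mentions no .bz2 file
        for path in BZ2_PATH.findall(line):
            if path not in found:
                found.append(path)
    return found


# Hypothetical log lines, made up purely to exercise the function.
SAMPLE_LOG = [
    "Sep  2 23:40:01 node1 pengine: [4321]: info: input stored in: /tmp/example/pe-input-12.bz2",
    "Sep  2 23:40:01 node1 pengine: [4321]: info: no file mentioned here",
    "Sep  2 23:40:02 node1 crmd: [4322]: info: unrelated /tmp/example/other.bz2",
    "Sep  2 23:41:07 node1 pengine: [4321]: info: input stored in: /tmp/example/pe-input-12.bz2",
]

if __name__ == "__main__":
    for path in referenced_files(SAMPLE_LOG):
        print(path)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LOG` (for example `referenced_files(open(path_to_log))`, where the log location depends on the local syslog setup), then attach the files it lists to the reply.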

Re: [Pacemaker] [PING] ping, pingd and CIB updates, pick your poison :)

2010-09-05 Thread Andrew Beekhof
On Thu, Sep 2, 2010 at 10:50 AM, Thomas Guthmann wrote: > Hi Andrew, > > First thanks for remembering my issue and looking into it :) > >> Jul 30 11:37:50 [..] > > Yes but... See the time line pasted below. (at 11:37, it starts to do > something) > >>> 11:20AM : cluster is up and running >>> 11:25

Re: [Pacemaker] SBD Stonith Configuration

2010-09-05 Thread Andrew Beekhof
On Wed, Sep 1, 2010 at 3:05 PM, Rainer wrote: > edit: > > Also found this in the Logs: > > stonith-ng: [18851]: ERROR: crm_abort: remote_op_done: Triggered assert at > remote.c:134 : op->request != NULL > > > Time for the Support Request i think... Yes. Novell has a fix for this. It may even be

Re: [Pacemaker] cib fails to start until host is rebooted

2010-09-05 Thread Andrew Beekhof
On Thu, Sep 2, 2010 at 2:18 PM, Michael Smith wrote: > On Thu, 2 Sep 2010, Andrew Beekhof wrote: > >> On Mon, Aug 30, 2010 at 10:04 PM, Michael Smith wrote: >> > Hi, >> > >> > I have a pacemaker/corosync setup on a bunch of fully patched SLES11 SP1 >> > systems. On one of the systems, if I /etc/i

Re: [Pacemaker] Node doesn't rejoin automatically after reboot

2010-09-05 Thread Andrew Beekhof
On Mon, Sep 6, 2010 at 7:57 AM, Tom Tux wrote: > No, I don't have such failed-messages. In my case, the "Connection to > our AIS plugin" was established. > > The /dev/shm is also not full. Is corosync running? > Kind regards, > Tom > > 2010/9/3 Michael Smith : >> Tom Tux wrote: >> >>> If I disjo

Re: [Pacemaker] Node doesn't rejoin automatically after reboot

2010-09-05 Thread Tom Tux
No, I don't have such failed-messages. In my case, the "Connection to our AIS plugin" was established. The /dev/shm is also not full. Kind regards, Tom 2010/9/3 Michael Smith : > Tom Tux wrote: > >> If I disjoin one clusternode (node01) for maintenance-purposes >> (/etc/init.d/openais stop) and r