I set no-quorum-policy to ignore and removed the constraint you mentioned.
It then managed to fail over once to the slave node, but I still see these
failed actions:

Failed actions:

     p_exportfs_root:0_monitor_30000 (node=testclu01, call=12, rc=7,
     status=complete): not running

     p_exportfs_root:1_monitor_30000 (node=testclu02, call=12, rc=7,
     status=complete): not running

I then stopped the new master node to see if it would fail over to the other
node, with no success. It remains a slave.
I also noticed that the constraint drbd-fence-by-handler-nfs-ms_drbd_nfs
was back in the crm configuration. It looks like the CIB performs a replace:
Mar 14 15:06:18 [1786] tdtestclu02       crmd:     info:
abort_transition_graph:        te_update_diff:126 - Triggered transition
abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.781.1) : Non-status
change
Mar 14 15:06:18 [1786] tdtestclu02       crmd:   notice:
do_state_transition:   State transition S_IDLE -> S_POLICY_ENGINE [
input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Mar 14 15:06:18 [1781] tdtestclu02        cib:     info:
cib_replace_notify:    Replaced: 0.780.39 -> 0.781.1 from tdtestclu01

So I'm not sure how to remove that constraint permanently; it stays gone
only as long as I don't stop Pacemaker.
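In case it helps, this is roughly what I run to delete it from the live CIB
(a sketch using the constraint name from my configuration; run as root on one
node while Pacemaker is up — it only sticks if nothing re-creates the
constraint afterwards):

```
# Remove the fencing constraint from the running configuration
crm configure delete drbd-fence-by-handler-nfs-ms_drbd_nfs

# Verify it is gone (should print nothing)
crm configure show | grep drbd-fence
```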

But it used to work with both no-quorum-policy=freeze and that constraint
in place.
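For what it's worth, constraints named drbd-fence-by-handler-* are normally
created by DRBD's crm-fence-peer.sh fence-peer handler when a resource is
degraded, and removed again by crm-unfence-peer.sh after resync completes —
which would explain why it keeps coming back. A sketch of the relevant stanza
as it commonly appears in global_common.conf (I'm assuming this roughly
matches the attached file; paths are the stock DRBD 8.x handler locations):

```
common {
  disk {
    fencing resource-only;
  }
  handlers {
    # Adds the drbd-fence-by-handler-* constraint when the peer is outdated
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes the constraint again once resync has finished
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

If after-resync-target is missing there, the constraint would never be
cleaned up automatically.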

Kind regards
/Fredrik



On Thu, Mar 14, 2013 at 2:49 PM, Andreas Kurz <[email protected]> wrote:

> On 2013-03-14 13:30, Fredrik Hudner wrote:
> > Hi all,
> >
> > I have a problem after I removed a node with the force command from my
> crm
> > config.
> >
> > Originally I had 2 nodes running HA cluster (corosync 1.4.1-7.el6,
> > pacemaker 1.1.7-6.el6)
> >
> >
> >
> > Then I wanted to add a third node acting as quorum node, but was not able
> > to get it to work (probably because I don’t understand how to set it up).
> >
> > So I removed the 3rd node, but had to use the force command as crm
> > complained when I tried to remove it.
> >
> >
> >
> > Now when I start up Pacemaker the resources don't look like they come
> up
> > correctly
> >
> >
> >
> > Online: [ testclu01 testclu02 ]
> >
> >
> >
> > Master/Slave Set: ms_drbd_nfs [p_drbd_nfs]
> >
> >      Masters: [ testclu01 ]
> >
> >      Slaves: [ testclu02 ]
> >
> > Clone Set: cl_lsb_nfsserver [p_lsb_nfsserver]
> >
> >      Started: [ tdtestclu01 tdtestclu02 ]
> >
> > Resource Group: g_nfs
> >
> >      p_lvm_nfs  (ocf::heartbeat:LVM):   Started testclu01
> >
> >      p_fs_shared        (ocf::heartbeat:Filesystem):    Started testclu01
> >
> >      p_fs_shared2       (ocf::heartbeat:Filesystem):    Started testclu01
> >
> >      p_ip_nfs   (ocf::heartbeat:IPaddr2):       Started testclu01
> >
> > Clone Set: cl_exportfs_root [p_exportfs_root]
> >
> >      Started: [ testclu01 testclu02 ]
> >
> >
> >
> > Failed actions:
> >
> >     p_exportfs_root:0_monitor_30000 (node=testclu01, call=12, rc=7,
> > status=complete): not running
> >
> >     p_exportfs_root:1_monitor_30000 (node=testclu02, call=12, rc=7,
> > status=complete): not running
> >
> >
> >
> > The filesystems mount correctly on the master at this stage and can be
> > written to.
> >
> > When I stop the services on the master node for it to fail over, it
> doesn't
> > work. It loses cluster-IP connectivity.
>
> fix your "no-quorum-policy", you want to "ignore" the quorum in a
> two-node cluster to allow failover ... and if your drbd device is
> already in sync, remove that drbd-fence-by-handler-nfs-ms_drbd_nfs
> constraint.
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
> >
> >
> >
> > Corosync.log from master after I stopped pacemaker on master node :  see
> > attached file
> >
> >
> >
> > Additional files (attached): crm-configure show
> >
> >                              Corosync.conf
> >
> >                              Global_common.conf
> >
> >
> >
> >
> >
> > I’m not sure how to proceed to get it back into a sane state now.
> >
> > So if anyone could help me it would be much appreciated
> >
> >
> >
> > Kind regards
> >
> > /Fredrik
> >
> >
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> >
>
>
>
