Hi, yes, I tried that: I powered off the active node and ran pvscan on the passive node, and indeed it didn't work; pvscan hangs and never returns to the shell. So the problem is with DLM?
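For reference, here is roughly what I plan to run on the passive node next, to confirm whether dlm really is stuck waiting for fencing to complete. This is just a minimal sketch using the standard dlm/sbd/pacemaker tools; the sbd device path is a placeholder for whatever is configured in /etc/sysconfig/sbd, and s2 stands for whichever node was powered off:

    # is the dlm lockspace still blocked in recovery, waiting for a fence?
    dlm_tool ls
    dlm_tool status

    # did pacemaker attempt the fence at all, and did it complete?
    crm_mon -1
    grep -i stonith /var/log/messages | tail -n 20

    # can sbd still reach the shared slot device, and does a manual fence work?
    sbd -d /dev/disk/by-id/<your-sbd-device> list
    stonith_admin --reboot s2

If pvscan only returns after a manual fence succeeds, that would point at the sbd/stonith path rather than the LVM agent itself.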
On Mon, Dec 29, 2014 at 5:51 PM, emmanuel segura <[email protected]> wrote:
> Power off the active node and, after a second, try to use an LVM command,
> for example pvscan. If this command doesn't respond, it is because dlm
> relies on cluster fencing; if the cluster fencing doesn't work, dlm stays
> in a blocked state.
>
> 2014-12-29 10:43 GMT+01:00 Marlon Guao <[email protected]>:
>> Perhaps we need to focus on this message. As mentioned, the cluster is
>> working fine under normal circumstances. My only concern is that the LVM
>> resource agent doesn't try to re-activate the VG on the passive node when
>> the active node goes down ungracefully (powered off). Hence, it could not
>> mount the filesystems, etc.
>>
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation sbd_monitor_0: not running (node=s1, call=5, rc=7, cib-update=35, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 13: monitor dlm:0_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 5: monitor dlm:1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation dlm_monitor_0: not running (node=s1, call=10, rc=7, cib-update=36, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 14: monitor clvm:0_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 6: monitor clvm:1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation clvm_monitor_0: not running (node=s1, call=15, rc=7, cib-update=37, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 15: monitor cluIP_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 7: monitor cluIP_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation cluIP_monitor_0: not running (node=s1, call=19, rc=7, cib-update=38, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 8: monitor vg1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 LVM(vg1)[1583]: WARNING: LVM Volume cluvg1 is not available (stopped)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=39, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 17: monitor fs1_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 Filesystem(fs1)[1600]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation fs1_monitor_0: not running (node=s1, call=27, rc=7, cib-update=40, confirmed=true)
>>
>> On Mon, Dec 29, 2014 at 5:38 PM, emmanuel segura <[email protected]> wrote:
>>> Dec 27 15:38:00 s1 cib[1514]: error: crm_xml_err: XML Error: Permission deniedPermission deniedI/O warning : failed to load external entity "/var/lib/pacemaker/cib/cib.xml"
>>> Dec 27 15:38:00 s1 cib[1514]: error: write_cib_contents: Cannot link /var/lib/pacemaker/cib/cib.xml to /var/lib/pacemaker/cib/cib-0.raw: Operation not permitted (1)
>>>
>>> 2014-12-29 10:33 GMT+01:00 emmanuel segura <[email protected]>:
>>>> Hi,
>>>>
>>>> You have a problem with the cluster stonithd: "error: crm_abort:
>>>> crm_glib_handler: Forked child 6186 to record non-fatal assert at
>>>> logging.c:73"
>>>>
>>>> Post your cluster version (packages); maybe someone can tell you
>>>> whether this is a known bug or a new one.
>>>>
>>>> 2014-12-29 10:29 GMT+01:00 Marlon Guao <[email protected]>:
>>>>> Ok, sorry for that.. please use this instead:
>>>>>
>>>>> http://pastebin.centos.org/14771/
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Mon, Dec 29, 2014 at 5:25 PM, emmanuel segura <[email protected]> wrote:
>>>>>> Sorry, but your paste is empty.
>>>>>>
>>>>>> 2014-12-29 10:19 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Uploaded it here: http://susepaste.org/45413433
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Mon, Dec 29, 2014 at 5:09 PM, Marlon Guao <[email protected]> wrote:
>>>>>>>> Ok, I attached the log file of one of the nodes.
>>>>>>>>
>>>>>>>> On Mon, Dec 29, 2014 at 4:42 PM, emmanuel segura <[email protected]> wrote:
>>>>>>>>> Please use pastebin and show your whole logs.
>>>>>>>>>
>>>>>>>>> 2014-12-29 9:06 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>>>>> By the way, just to note that for normal testing (manual failover,
>>>>>>>>>> rebooting the active node) the cluster is working fine. I only
>>>>>>>>>> encounter this error if I try to poweroff/shutoff the active node.
>>>>>>>>>>
>>>>>>>>>> On Mon, Dec 29, 2014 at 4:05 PM, Marlon Guao <[email protected]> wrote:
>>>>>>>>>>> Hi.
>>>>>>>>>>>
>>>>>>>>>>> Dec 29 13:47:16 s1 LVM(vg1)[1601]: WARNING: LVM Volume cluvg1 is not available (stopped)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=40, confirmed=true)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
>>>>>>>>>>> Dec 29 13:47:16 s1 Filesystem(fs1)[1618]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist
>>>>>>>>>>>
>>>>>>>>>>> The LVM agent checks whether the volume group is already available
>>>>>>>>>>> and raises the warning above if it is not. But I don't see that it
>>>>>>>>>>> tries to activate it before raising the warning. Perhaps it assumes
>>>>>>>>>>> the VG is already activated, so I'm not sure who should be
>>>>>>>>>>> activating it (should it be LVM?).
>>>>>>>>>>>
>>>>>>>>>>> if [ $rc -ne 0 ]; then
>>>>>>>>>>>         ocf_log $loglevel "LVM Volume $1 is not available (stopped)"
>>>>>>>>>>>         rc=$OCF_NOT_RUNNING
>>>>>>>>>>> else
>>>>>>>>>>>         case $(get_vg_mode) in
>>>>>>>>>>>         1) # exclusive with tagging.
>>>>>>>>>>>                 # If vg is running, make sure the correct tag is present.
>>>>>>>>>>>                 # Otherwise we can not guarantee exclusive activation.
>>>>>>>>>>>                 if ! check_tags; then
>>>>>>>>>>>                         ocf_exit_reason "WARNING: $OCF_RESKEY_volgrpname is active without the cluster tag, \"$OUR_TAG\""
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Dec 29, 2014 at 3:36 PM, emmanuel segura <[email protected]> wrote:
>>>>>>>>>>>> logs?
>>>>>>>>>>>>
>>>>>>>>>>>> 2014-12-29 6:54 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I just want to ask about the LVM resource agent on pacemaker/corosync.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I set up a 2-node cluster (openSUSE 13.2 -- my config below). The
>>>>>>>>>>>>> cluster works as expected: a manual failover (via crm resource move)
>>>>>>>>>>>>> and an automatic failover (by rebooting the active node, for
>>>>>>>>>>>>> instance) both work. But if I just "shut off" the active node (it's
>>>>>>>>>>>>> a VM, so I can do a poweroff), the resources won't fail over to the
>>>>>>>>>>>>> passive node. When I investigated, it is due to an LVM resource not
>>>>>>>>>>>>> starting (specifically, the VG). I found out that the LVM resource
>>>>>>>>>>>>> won't try to activate the volume group on the passive node. Is this
>>>>>>>>>>>>> expected behaviour?
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I really expect is that, in the event the active node is shut
>>>>>>>>>>>>> off (by a power outage, for instance), all resources fail over
>>>>>>>>>>>>> automatically to the passive node, and LVM re-activates the VG.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's my config:
>>>>>>>>>>>>>
>>>>>>>>>>>>> node 1: s1
>>>>>>>>>>>>> node 2: s2
>>>>>>>>>>>>> primitive cluIP IPaddr2 \
>>>>>>>>>>>>>         params ip=192.168.13.200 cidr_netmask=32 \
>>>>>>>>>>>>>         op monitor interval=30s
>>>>>>>>>>>>> primitive clvm ocf:lvm2:clvmd \
>>>>>>>>>>>>>         params daemon_timeout=30 \
>>>>>>>>>>>>>         op monitor timeout=90 interval=30
>>>>>>>>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>>>>>>>>         op monitor interval=60s timeout=90s on-fail=ignore \
>>>>>>>>>>>>>         op start interval=0 timeout=90
>>>>>>>>>>>>> primitive fs1 Filesystem \
>>>>>>>>>>>>>         params device="/dev/mapper/cluvg1-clulv1" directory="/data" fstype=btrfs
>>>>>>>>>>>>> primitive mariadb mysql \
>>>>>>>>>>>>>         params config="/etc/my.cnf"
>>>>>>>>>>>>> primitive sbd stonith:external/sbd \
>>>>>>>>>>>>>         op monitor interval=15s timeout=60s
>>>>>>>>>>>>> primitive vg1 LVM \
>>>>>>>>>>>>>         params volgrpname=cluvg1 exclusive=yes \
>>>>>>>>>>>>>         op start timeout=10s interval=0 \
>>>>>>>>>>>>>         op stop interval=0 timeout=10 \
>>>>>>>>>>>>>         op monitor interval=10 timeout=30 on-fail=restart depth=0
>>>>>>>>>>>>> group base-group dlm clvm
>>>>>>>>>>>>> group rgroup cluIP vg1 fs1 mariadb \
>>>>>>>>>>>>>         meta target-role=Started
>>>>>>>>>>>>> clone base-clone base-group \
>>>>>>>>>>>>>         meta interleave=true target-role=Started
>>>>>>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>>>>>>         dc-version=1.1.12-1.1.12.git20140904.266d5c2 \
>>>>>>>>>>>>>         cluster-infrastructure=corosync \
>>>>>>>>>>>>>         no-quorum-policy=ignore \
>>>>>>>>>>>>>         last-lrm-refresh=1419514875 \
>>>>>>>>>>>>>         cluster-name=xxx \
>>>>>>>>>>>>>         stonith-enabled=true
>>>>>>>>>>>>> rsc_defaults rsc-options: \
>>>>>>>>>>>>>         resource-stickiness=100

--
>>> import this
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
