Hi, yes, I tried that: I powered off the active node and ran pvscan on the passive node, and indeed it didn't work; pvscan hangs and never returns to the shell. So the problem is with DLM?
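For reference, here is roughly what I plan to run on the passive node next, to confirm whether dlm really is stuck waiting for fencing to complete. This is just a minimal sketch using the standard dlm/sbd/pacemaker tools; the sbd device path is a placeholder for whatever is configured in /etc/sysconfig/sbd, and s2 stands for whichever node was powered off:

    # is the dlm lockspace still blocked in recovery, waiting for a fence?
    dlm_tool ls
    dlm_tool status

    # did pacemaker attempt the fence at all, and did it complete?
    crm_mon -1
    grep -i stonith /var/log/messages | tail -n 20

    # can sbd still reach the shared slot device, and does a manual fence work?
    sbd -d /dev/disk/by-id/<your-sbd-device> list
    stonith_admin --reboot s2

If pvscan only returns after a manual fence succeeds, that would point at the sbd/stonith path rather than the LVM agent itself.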
On Mon, Dec 29, 2014 at 5:51 PM, emmanuel segura <[email protected]> wrote:
> Power off the active node and, after a second, try to use an LVM command,
> for example pvscan. If this command doesn't respond, it is because dlm
> relies on cluster fencing; if the cluster fencing doesn't work, dlm stays
> in a blocked state.
>
> 2014-12-29 10:43 GMT+01:00 Marlon Guao <[email protected]>:
>> Perhaps we need to focus on this message. As mentioned, the cluster is
>> working fine under normal circumstances. My only concern is that the LVM
>> resource agent doesn't try to re-activate the VG on the passive node when
>> the active node goes down ungracefully (powered off). Hence, it could not
>> mount the filesystems, etc.
>>
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation sbd_monitor_0: not running (node=s1, call=5, rc=7, cib-update=35, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 13: monitor dlm:0_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 5: monitor dlm:1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation dlm_monitor_0: not running (node=s1, call=10, rc=7, cib-update=36, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 14: monitor clvm:0_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 6: monitor clvm:1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation clvm_monitor_0: not running (node=s1, call=15, rc=7, cib-update=37, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 15: monitor cluIP_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 7: monitor cluIP_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation cluIP_monitor_0: not running (node=s1, call=19, rc=7, cib-update=38, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 8: monitor vg1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 LVM(vg1)[1583]: WARNING: LVM Volume cluvg1 is not available (stopped)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=39, confirmed=true)
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 17: monitor fs1_monitor_0 on s2
>> Dec 29 17:12:26 s1 crmd[1495]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
>> Dec 29 17:12:26 s1 Filesystem(fs1)[1600]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist
>> Dec 29 17:12:26 s1 crmd[1495]: notice: process_lrm_event: Operation fs1_monitor_0: not running (node=s1, call=27, rc=7, cib-update=40, confirmed=true)
>>
>> On Mon, Dec 29, 2014 at 5:38 PM, emmanuel segura <[email protected]> wrote:
>>> Dec 27 15:38:00 s1 cib[1514]: error: crm_xml_err: XML Error: Permission deniedPermission deniedI/O warning : failed to load external entity "/var/lib/pacemaker/cib/cib.xml"
>>> Dec 27 15:38:00 s1 cib[1514]: error: write_cib_contents: Cannot link /var/lib/pacemaker/cib/cib.xml to /var/lib/pacemaker/cib/cib-0.raw: Operation not permitted (1)
>>>
>>> 2014-12-29 10:33 GMT+01:00 emmanuel segura <[email protected]>:
>>>> Hi,
>>>>
>>>> You have a problem with the cluster stonithd: "error: crm_abort:
>>>> crm_glib_handler: Forked child 6186 to record non-fatal assert at
>>>> logging.c:73"
>>>>
>>>> Post your cluster version (packages); maybe someone can tell you
>>>> whether this is a known bug or a new one.
>>>>
>>>> 2014-12-29 10:29 GMT+01:00 Marlon Guao <[email protected]>:
>>>>> Ok, sorry for that.. please use this instead:
>>>>>
>>>>> http://pastebin.centos.org/14771/
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Mon, Dec 29, 2014 at 5:25 PM, emmanuel segura <[email protected]> wrote:
>>>>>> Sorry, but your paste is empty.
>>>>>>
>>>>>> 2014-12-29 10:19 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Uploaded it here: http://susepaste.org/45413433
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Mon, Dec 29, 2014 at 5:09 PM, Marlon Guao <[email protected]> wrote:
>>>>>>>> Ok, I attached the log file of one of the nodes.
>>>>>>>>
>>>>>>>> On Mon, Dec 29, 2014 at 4:42 PM, emmanuel segura <[email protected]> wrote:
>>>>>>>>> Please use pastebin and show your whole logs.
>>>>>>>>>
>>>>>>>>> 2014-12-29 9:06 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>>>>> By the way, just to note that for normal testing (manual failover,
>>>>>>>>>> rebooting the active node) the cluster is working fine. I only
>>>>>>>>>> encounter this error if I try to poweroff/shutoff the active node.
>>>>>>>>>>
>>>>>>>>>> On Mon, Dec 29, 2014 at 4:05 PM, Marlon Guao <[email protected]> wrote:
>>>>>>>>>>> Hi.
>>>>>>>>>>>
>>>>>>>>>>> Dec 29 13:47:16 s1 LVM(vg1)[1601]: WARNING: LVM Volume cluvg1 is not available (stopped)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: process_lrm_event: Operation vg1_monitor_0: not running (node=s1, call=23, rc=7, cib-update=40, confirmed=true)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 9: monitor fs1_monitor_0 on s1 (local)
>>>>>>>>>>> Dec 29 13:47:16 s1 crmd[1515]: notice: te_rsc_command: Initiating action 16: monitor vg1_monitor_0 on s2
>>>>>>>>>>> Dec 29 13:47:16 s1 Filesystem(fs1)[1618]: WARNING: Couldn't find device [/dev/mapper/cluvg1-clulv1]. Expected /dev/??? to exist
>>>>>>>>>>>
>>>>>>>>>>> The LVM agent checks whether the volume group is already available
>>>>>>>>>>> and raises the warning above if it is not. But I don't see that it
>>>>>>>>>>> tries to activate it before raising the warning. Perhaps it assumes
>>>>>>>>>>> the VG is already activated, so I'm not sure who should be
>>>>>>>>>>> activating it (should it be LVM?).
>>>>>>>>>>>
>>>>>>>>>>> if [ $rc -ne 0 ]; then
>>>>>>>>>>>         ocf_log $loglevel "LVM Volume $1 is not available (stopped)"
>>>>>>>>>>>         rc=$OCF_NOT_RUNNING
>>>>>>>>>>> else
>>>>>>>>>>>         case $(get_vg_mode) in
>>>>>>>>>>>         1) # exclusive with tagging.
>>>>>>>>>>>                 # If vg is running, make sure the correct tag is present.
>>>>>>>>>>>                 # Otherwise we can not guarantee exclusive activation.
>>>>>>>>>>>                 if ! check_tags; then
>>>>>>>>>>>                         ocf_exit_reason "WARNING: $OCF_RESKEY_volgrpname is active without the cluster tag, \"$OUR_TAG\""
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Dec 29, 2014 at 3:36 PM, emmanuel segura <[email protected]> wrote:
>>>>>>>>>>>> logs?
>>>>>>>>>>>>
>>>>>>>>>>>> 2014-12-29 6:54 GMT+01:00 Marlon Guao <[email protected]>:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I just want to ask about the LVM resource agent on pacemaker/corosync.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I set up a 2-node cluster (openSUSE 13.2 -- my config below). The
>>>>>>>>>>>>> cluster works as expected: a manual failover (via crm resource move)
>>>>>>>>>>>>> and an automatic failover (by rebooting the active node, for
>>>>>>>>>>>>> instance) both work. But if I just "shut off" the active node (it's
>>>>>>>>>>>>> a VM, so I can do a poweroff), the resources won't fail over to the
>>>>>>>>>>>>> passive node. When I investigated, it is due to an LVM resource not
>>>>>>>>>>>>> starting (specifically, the VG). I found out that the LVM resource
>>>>>>>>>>>>> won't try to activate the volume group on the passive node. Is this
>>>>>>>>>>>>> expected behaviour?
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I really expect is that, in the event the active node is shut
>>>>>>>>>>>>> off (by a power outage, for instance), all resources fail over
>>>>>>>>>>>>> automatically to the passive node, and LVM re-activates the VG.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's my config:
>>>>>>>>>>>>>
>>>>>>>>>>>>> node 1: s1
>>>>>>>>>>>>> node 2: s2
>>>>>>>>>>>>> primitive cluIP IPaddr2 \
>>>>>>>>>>>>>         params ip=192.168.13.200 cidr_netmask=32 \
>>>>>>>>>>>>>         op monitor interval=30s
>>>>>>>>>>>>> primitive clvm ocf:lvm2:clvmd \
>>>>>>>>>>>>>         params daemon_timeout=30 \
>>>>>>>>>>>>>         op monitor timeout=90 interval=30
>>>>>>>>>>>>> primitive dlm ocf:pacemaker:controld \
>>>>>>>>>>>>>         op monitor interval=60s timeout=90s on-fail=ignore \
>>>>>>>>>>>>>         op start interval=0 timeout=90
>>>>>>>>>>>>> primitive fs1 Filesystem \
>>>>>>>>>>>>>         params device="/dev/mapper/cluvg1-clulv1" directory="/data" fstype=btrfs
>>>>>>>>>>>>> primitive mariadb mysql \
>>>>>>>>>>>>>         params config="/etc/my.cnf"
>>>>>>>>>>>>> primitive sbd stonith:external/sbd \
>>>>>>>>>>>>>         op monitor interval=15s timeout=60s
>>>>>>>>>>>>> primitive vg1 LVM \
>>>>>>>>>>>>>         params volgrpname=cluvg1 exclusive=yes \
>>>>>>>>>>>>>         op start timeout=10s interval=0 \
>>>>>>>>>>>>>         op stop interval=0 timeout=10 \
>>>>>>>>>>>>>         op monitor interval=10 timeout=30 on-fail=restart depth=0
>>>>>>>>>>>>> group base-group dlm clvm
>>>>>>>>>>>>> group rgroup cluIP vg1 fs1 mariadb \
>>>>>>>>>>>>>         meta target-role=Started
>>>>>>>>>>>>> clone base-clone base-group \
>>>>>>>>>>>>>         meta interleave=true target-role=Started
>>>>>>>>>>>>> property cib-bootstrap-options: \
>>>>>>>>>>>>>         dc-version=1.1.12-1.1.12.git20140904.266d5c2 \
>>>>>>>>>>>>>         cluster-infrastructure=corosync \
>>>>>>>>>>>>>         no-quorum-policy=ignore \
>>>>>>>>>>>>>         last-lrm-refresh=1419514875 \
>>>>>>>>>>>>>         cluster-name=xxx \
>>>>>>>>>>>>>         stonith-enabled=true
>>>>>>>>>>>>> rsc_defaults rsc-options: \
>>>>>>>>>>>>>         resource-stickiness=100

--
>>> import this
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
