On 27/08/2013, at 8:54 PM, Thomas Schulte <[email protected]> wrote:

> Hi list,
> 
> I'm experiencing a strange problem and I can't figure out what's wrong. I'm 
> running openSUSE 12.3 on a 2-node cluster with pacemaker-1.1.9-55.2.x86_64.
> 
> Every 15 minutes the following messages are logged. If stonith is enabled, my 
> hosts get fenced afterwards.
> The messages appeared on both nodes until I temporarily activated stonith.
> After some fencing, now only the second node "s00202" keeps logging these 
> (both nodes online again):
> 
> Aug 27 10:21:13 s00202 crmd[3390]:   notice: do_state_transition: State 
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED 
> origin=crm_timer_popped ]
> Aug 27 10:21:13 s00202 pengine[3389]:   notice: unpack_config: On loss of CCM 
> Quorum: Ignore
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_vmail and ms_drbd_vmail are both allocated but to different nodes: 
> s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 
> vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 
> vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_mysql and ms_drbd_mysql are both allocated but to different nodes: 
> s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_redis and ms_drbd_redis are both allocated but to different nodes: 
> s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_bind and ms_drbd_bind are both allocated but to different nodes: 
> s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_squid and ms_drbd_squid are both allocated but to different nodes: 
> s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 
> vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: last message repeated 3 times
> Aug 27 10:21:13 s00202 crmd[3390]:   notice: do_te_invoke: Processing graph 
> 61 (ref=pe_calc-dc-1377591673-1345) derived from 
> /var/lib/pacemaker/pengine/pe-input-53.bz2
> Aug 27 10:21:13 s00202 crmd[3390]:   notice: run_graph: Transition 61 
> (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
> Source=/var/lib/pacemaker/pengine/pe-input-53.bz2): Complete
> Aug 27 10:21:13 s00202 crmd[3390]:   notice: do_state_transition: State 
> transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
> cause=C_FSA_INTERNAL origin=notify_crmd ]
> Aug 27 10:21:13 s00202 pengine[3389]:   notice: process_pe_message: 
> Calculated Transition 61: /var/lib/pacemaker/pengine/pe-input-53.bz2

Could you attach this file (/var/lib/pacemaker/pengine/pe-input-53.bz2) please?
I'll be able to see if the current version behaves any better.
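For reference, that is the policy-engine input the log names for transition 61. A quick sketch of how to replay it offline (assuming crm_simulate from the pacemaker package is available on the node; this is read-only and does not touch the running cluster):

```shell
# Replay the saved transition input and show allocation scores,
# reproducing the filter_colocation_constraint errors outside the cluster.
crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-53.bz2
```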

> 
> 
> I believe that the errors about filter_colocation_constraint all have the same 
> cause, so I'm going to reduce the problem to the following:
> 
> Aug 27 10:21:13 s00202 pengine[3389]:    error: filter_colocation_constraint: 
> pri_fs_vmail and ms_drbd_vmail are both allocated but to different nodes: 
> s00202 vs. n/a
> 
> 
> This is my configuration (relevant extract):
> 
> # crm configure show
> 
> node s00201
> node s00202
> primitive pri_drbd_vmail ocf:linbit:drbd \
> operations $id="pri_drbd_vmail-operations" \
> op monitor interval="20" role="Slave" timeout="20" \
> op monitor interval="10" role="Master" timeout="20" \
> params drbd_resource="vmail"
> primitive pri_fs_vmail ocf:heartbeat:Filesystem \
> params device="/dev/drbd5" fstype="ext4" directory="/var/vmail" \
> op monitor interval="30"
> primitive pri_ip_vmail ocf:heartbeat:IPaddr2 \
> operations $id="pri_ip_vmail-operations" \
> op monitor interval="10s" timeout="20s" \
> params ip="10.0.1.105" nic="br0"
> primitive pri_svc_postfix ocf:heartbeat:postfix \
> operations $id="pri_svc_postfix-operations" \
> op monitor interval="60s" timeout="20s" \
> params config_dir="/etc/postfix_vmail"
> group grp_vmail pri_fs_vmail pri_ip_vmail pri_svc_postfix \
> meta target-role="Started"
> ms ms_drbd_vmail pri_drbd_vmail \
> meta notify="true" target-role="Started" master-max="1" is-managed="true"
> colocation col_grp_vmail_ON_drbd_vmail inf: grp_vmail:Started 
> ms_drbd_vmail:Master
> order ord_ms_drbd_vmail_BEFORE_grp_vmail inf: ms_drbd_vmail:promote 
> grp_vmail:start
> property $id="cib-bootstrap-options" \
> no-quorum-policy="ignore" \
> placement-strategy="balanced" \
> dc-version="1.1.9-2db99f1" \
> cluster-infrastructure="classic openais (with plugin)" \
> expected-quorum-votes="2" \
> last-lrm-refresh="1377585341" \
> stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="100" \
> migration-threshold="3"
> op_defaults $id="op-options" \
> timeout="600" \
> record-pending="false"
> 
> 
> # crm_resource -L
> 
> Master/Slave Set: ms_drbd_vmail [pri_drbd_vmail]
> Masters: [ s00202 ]
> Slaves: [ s00201 ]
> Resource Group: grp_vmail
> pri_fs_vmail       (ocf::heartbeat:Filesystem):    Started
> pri_ip_vmail       (ocf::heartbeat:IPaddr2):       Started
> pri_svc_postfix    (ocf::heartbeat:postfix):       Started
> 
> 
> # ptest -Ls | egrep "pri_fs_vmail|ms_drbd_vmail|grp_vmail"
> 
> group_color: grp_vmail allocation score on s00201: 0
> group_color: grp_vmail allocation score on s00202: 0
> group_color: pri_fs_vmail allocation score on s00201: 0
> group_color: pri_fs_vmail allocation score on s00202: 100
> clone_color: ms_drbd_vmail allocation score on s00201: 0
> clone_color: ms_drbd_vmail allocation score on s00202: 600
> native_color: pri_fs_vmail allocation score on s00201: -INFINITY
> native_color: pri_fs_vmail allocation score on s00202: 10700
> 
> 
> Besides these error messages, both hosts are working fine (stonith disabled).
> Starting/stopping or migrating resources is possible without problems, too.
> 
> I have no idea what's wrong and Google didn't help me in this case.
> Maybe someone here is able to help me out?
> 
> 
> Thank you.
> 
> Regards,
> Thomas
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
