On 07/23/2012 07:06 AM, David Barchas wrote:
> Hello.
>
> I have been working on this for 3 days now, and must be so stressed out
> that I am being blinded to what is probably an obvious cause of this. In
> a word, HELP.
>
> I am trying specifically to utilize ocf:heartbeat:IPaddr2, but this
> issue seems to occur with any of the ocf:heartbeat agents. I will just
> focus on IPaddr2 for purposes of figuring this out, but it happens
> exactly the same with any of the default agents. However, I can
> successfully use ocf:linbit:drbd for example. it seems to be limited to
> the RAs that are installed along with coro/pace in the resource-agents
> package.

What are the exact package versions you have installed?

pacemaker*
resource-agents
cluster-glue*

>
> I am using CentOS 6.3, fully updated (though this happens in 6.2 with no
> updates as well). Install pacemaker/coro from default repo. I have
> stripped everything down to figure this out in vmware and just install
> centos, update it, install pace/coro (no drbd for this discussion),
> configure coro, and then start it. pacemaker starts up fine (or at least
> I think its fine). I can set quorum ignore for example from crm. (crm
> configure property no-quorum-policy="ignore")
>
> here is the process list
> root 1447 0.3 0.6 556080 6636 ? Ssl 21:09 0:00 corosync
> 499 1453 0.0 0.5 88720 5556 ? S 21:09 0:00 \_ /usr/libexec/pacemaker/cib
> root 1454 0.0 0.3 86968 3488 ? S 21:09 0:00 \_ /usr/libexec/pacemaker/stonithd
> root 1455 0.0 0.2 76188 2492 ? S 21:09 0:00 \_ /usr/lib64/heartbeat/lrmd
> 499 1456 0.0 0.3 91160 3432 ? S 21:09 0:00 \_ /usr/libexec/pacemaker/attrd
> 499 1457 0.0 0.3 87440 3824 ? S 21:09 0:00 \_ /usr/libexec/pacemaker/pengine
> 499 1458 0.0 0.3 91312 3884 ? S 21:09 0:00 \_ /usr/libexec/pacemaker/crmd

so you are using plugin version 0 to start Pacemaker .... That would
explain why /etc/init.d/pacemaker is unable to start ... it is already
started by Corosync.

>
> 499 is hacluster btw.
>
> ***BUT***
>
> When I run as root the following:
> # crm ra meta ocf:heartbeat:IPaddr2
>
> I get this response:
> lrmadmin[1484]: 2012/07/22_13:28:23 ERROR:
> lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply
> message of rmetadata with function get_ret_from_msg.
> ERROR: ocf:heartbeat:IPaddr2: could not parse meta-data:
>
> And this is in /var/log/messages:
> Jul 22 16:35:14 MST lrmd: [48093]: ERROR: get_resource_meta: pclose
> failed: Resource temporarily unavailable
> Jul 22 16:35:14 MST lrmd: [48093]: WARN: on_msg_get_metadata: empty
> metadata for ocf::heartbeat::IPaddr2.
> Jul 22 16:35:14 MST lrmd: [48093]: WARN: G_SIG_dispatch: Dispatch
> function for SIGCHLD was delayed 200 ms (> 100 ms) before being called
> (GSource: 0x187df10)
> Jul 22 16:35:14 MST lrmd: [48093]: info: G_SIG_dispatch: started at
> 429616889 should have started at 429616869
> Jul 22 16:35:14 MST lrmadmin: [48254]: ERROR:
> lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply
> message of rmetadata with function get_ret_from_msg.
>
> I am using crm ra meta as a way to test, but crm will not accept my
> trying to add the resource as a primitive either.
>
> In my research, I have found that often it's permissions. So just to
> rule that out i set my entire system to 777 permissions. no joy.
>
> Another suggestion i find often has been to set OCF_ROOT (export
> OCF_ROOT=/usr/lib/ocf) and then do
> /usr/lib/ocf/resource.d/heartbeat/IPaddr2 meta-data.
> That produces the desired output. But does not work before i export.
> And CRM still does not accept my meta request
>
> Another suggestion i find is to make sure that shellfuncs exists in the
> agents folder. the soft links exist
> lrwxrwxrwx. 1 root root 32 Jul 22 04:08 .ocf-binaries -> ../../lib/heartbeat/ocf-binaries
> lrwxrwxrwx. 1 root root 35 Jul 22 04:08 .ocf-directories -> ../../lib/heartbeat/ocf-directories
> lrwxrwxrwx. 1 root root 35 Jul 22 04:08 .ocf-returncodes -> ../../lib/heartbeat/ocf-returncodes
> lrwxrwxrwx. 1 root root 34 Jul 22 04:08 .ocf-shellfuncs -> ../../lib/heartbeat/ocf-shellfuncs
>
> And just to make sure I did un-hidden soft links as well with no joy.

Strange, those errors are typically related to wrong paths for the
initialization of the environment and helper functions:

# Initialization:
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

The DRBD agent has an extra fallback check, which may be the reason that
it still works ...

# Resource-agents have moved their ocf-shellfuncs file around.
# There are supposed to be symlinks or wrapper files in the old location,
# pointing to the new one, but people seem to get it wrong all the time.
# Try several locations.
if test -n "${OCF_FUNCTIONS_DIR}" ; then
    if test -e "${OCF_FUNCTIONS_DIR}/ocf-shellfuncs" ; then
        . "${OCF_FUNCTIONS_DIR}/ocf-shellfuncs"
    elif test -e "${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs" ; then
        . "${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs"
    fi
else
    if test -e "${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs" ; then
        . "${OCF_ROOT}/lib/heartbeat/ocf-shellfuncs"
    elif test -e "${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs"; then
        . "${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs"
    fi
fi

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

>
> I have used assorted "how to's" to troubleshoot and make sure Im not
> missing something simple.
> http://www.server-world.info/en/note?os=CentOS_6&p=pacemaker&f=1
> http://snozberry.org/blog/2012/05/02/corosync-slash-pacemaker-on-centos-6/
>
> one other strange (but might be normal) behavior is that I cannot
> manually start pacemaker via "service pacemaker start"
> it fails, but I get no information in the logs. But I get the feeling
> this is normal behavior now?
> # service pacemaker start
> Starting Pacemaker Cluster Manager: [FAILED]
> log shows 1 entry: Jul 22 22:00:50 MST pacemakerd[1511]: info:
> crm_log_init_worker: Changed active directory to
> /var/lib/heartbeat/cores/root
>
>
> I have run through it about 30 times at this point.
> I have tried cent 6.2 not updated. cent 6.3 fully updated. on a physical
> server (just in case my VM is doing something weird) and in VMs.
>
> Frankly I am so baffled by this, and have been working so intensely on
> it, that I am hoping that I am just missing something subtle because of
> freaking out.
> This should be very straightforward. No magic, but obviously "something"
> is amiss.
> But what's really weird is that I cannot find a single post online of
> anyone having issues with the standard RAs like this.
>
> I can try anything suggested, except change from centos 6. This is all
> being done in a pair of virtuals.
>
> Any help or suggestions at all will be greatly appreciated.
> I am a bit desperate now.
> Thanks.
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
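
The fallback lookup quoted from the DRBD agent earlier in this message suggests a quick way to see which of those locations actually resolves on a node. What follows is only a minimal sketch, assuming the default OCF root of /usr/lib/ocf mentioned in the thread; find_ocf_shellfuncs is a hypothetical helper name and is not shipped with resource-agents or cluster-glue.

```shell
# Hypothetical helper (an illustration, not part of any package): print the
# first readable ocf-shellfuncs location under the given OCF root, probing
# the same two paths as the DRBD agent's fallback branch.
find_ocf_shellfuncs() {
    # Assumed default root; pass another directory as $1 to override it.
    root="${1:-/usr/lib/ocf}"
    for candidate in \
        "${root}/lib/heartbeat/ocf-shellfuncs" \
        "${root}/resource.d/heartbeat/.ocf-shellfuncs"
    do
        # test -r follows symlinks, so a dangling link does not count.
        if [ -r "${candidate}" ]; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    # Neither location is readable.
    return 1
}
```

Calling find_ocf_shellfuncs "${OCF_ROOT:-/usr/lib/ocf}" prints the first match. A non-zero exit status means neither location resolved, which would point at the kind of path problem described above; a match would suggest looking elsewhere, for example at the environment lrmd passes to the agent.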