[Pacemaker] why does pacemaker migrate a vm by stopping and starting instead of migrating action?

2012-12-18 Thread bin chen
Hi all, My cluster runs pacemaker 1.1.7 + corosync 2.0. I have written a resource agent to manage the virtual machine. The RA supports start, stop, migrate_from, migrate_to, and monitor. But when I try to migrate a running cluster VM (resource name dcbh6f1c-GtNhnB-8597) from the host (h66) to a

Re: [Pacemaker] reloading crm changes

2012-12-18 Thread Andrew Beekhof
On Wed, Dec 19, 2012 at 3:16 AM, Paul Shannon - NOAA Federal wrote: > Ah, I see it now shows the resources I configured. I guess crm_mon with no > arguments does not show the stopped resources. Now I need to figure out why > one of my groups will not start. I notice some cib errors in the error l

Re: [Pacemaker] ocf:heartbeat:apache fails to start

2012-12-18 Thread Andrew Beekhof
On Wed, Dec 19, 2012 at 6:35 AM, Paul Shannon - NOAA Federal wrote: > I have what I thought was a pretty simple startup. Just a 2-node cluster > with 3 drbd filesystems, a virtual-ip and apache. However I cannot get my > apache resource to start. My guess is that the status url isn't working/co

Re: [Pacemaker] wrong device in stonith_admin -l

2012-12-18 Thread Andrew Beekhof
On Wed, Dec 19, 2012 at 4:38 AM, wrote: > laurent+pacema...@u-picardie.fr writes: > >> David Vossel writes: >> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]: notice: dynamic_list_search_cb: Disabling port list queries for stonith-xen-eddu (1): failed: 255 >>> >>> We discov

[Pacemaker] ocf:heartbeat:apache fails to start

2012-12-18 Thread Paul Shannon - NOAA Federal
I have what I thought was a pretty simple startup. Just a 2-node cluster with 3 drbd filesystems, a virtual-ip and apache. However I cannot get my apache resource to start. This is from corosync.log. Dec 18 10:20:36 [1806] ajk-s-jnusrv1.jnu.nwsar.gov crmd: info: process_lrm_event:

[Pacemaker] crm shell

2012-12-18 Thread Jay Janssen
Having learned pretty much everything I know about pacemaker (which isn't a lot) using the crm shell, I am dismayed to find it isn't included in pacemaker 1.1.8. Since when is it a good development practice to deprecate (and not only deprecate, but completely abandon and stop supporting altog

Re: [Pacemaker] wrong device in stonith_admin -l

2012-12-18 Thread laurent+pacemaker
laurent+pacema...@u-picardie.fr writes: > David Vossel writes: > >>> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]: notice: >>> dynamic_list_search_cb: Disabling port list queries for >>> stonith-xen-eddu (1): failed: 255 >> >> We discover what hosts an agent can fence by running this comm

Re: [Pacemaker] reloading crm changes

2012-12-18 Thread Paul Shannon - NOAA Federal
Ah, I see it now shows the resources I configured. I guess crm_mon with no arguments does not show the stopped resources. Now I need to figure out why one of my groups will not start. I notice some cib errors in the error log: validate_cib_digest: Digest comparision failed error: retrieveCib: Che

Re: [Pacemaker] wrong device in stonith_admin -l

2012-12-18 Thread laurent+pacemaker
David Vossel writes: >> Dec 12 01:12:37 elasticsearch-06 stonith-ng[18181]: notice: >> dynamic_list_search_cb: Disabling port list queries for >> stonith-xen-eddu (1): failed: 255 > > We discover what hosts an agent can fence by running this command internally > in stonith. > > # agent -o list

Re: [Pacemaker] Installing pacemaker on aws ec2 server

2012-12-18 Thread Yossi Nachum
It's an Amazon AMI; it is based on Red Hat. On Dec 18, 2012 3:58 AM, "Andrew Beekhof" wrote: > On Mon, Dec 17, 2012 at 7:02 PM, Yossi Nachum wrote: > > I fixed this error using the LIBS environment variable > > I ran: export LIBS=/lib64/libtinfo.so.5 > > then ./configure again and then make completed success

[Pacemaker] timed out / exec error

2012-12-18 Thread James Harper
For the following failure: Failed actions: p_lvm_iscsi:0_monitor_1 (node=bitvs6, call=57, rc=-2, status=Timed Out): unknown exec error Is this the RA itself returning a "Timed Out" error, or is it the cluster software determining that the RA is taking too long and so killing it and dec