Re: [Pacemaker] stonith_admin does not work as expected

2013-11-13 Thread andreas graeper
i stopped/started the resource and now stonith_admin can see it again. pcs resource stop fence_1 pcs resource start fence_1 but how can it get lost ? thanks andreas 2013/11/13, andreas graeper : > hi, > pacemaker version is 1.1.7 > > the fence-agent (i thought was one of t

Re: [Pacemaker] stonith_admin does not work as expected

2013-11-13 Thread andreas graeper
works (I assume its a custom agent?) > > On 12 Nov 2013, at 1:21 am, andreas graeper > wrote: > >> hi, >> two nodes. >> n1 (slave) fence_2:stonith:fence_ifmib >> n2 (master) fence_1:stonith:fence_ifmib >> >> n1 was fenced cause suddenly not reac

[Pacemaker] stonith_admin does not work as expected

2013-11-11 Thread andreas graeper
hi, two nodes. n1 (slave) fence_2:stonith:fence_ifmib n2 (master) fence_1:stonith:fence_ifmib n1 was fenced cause suddenly not reachable. (reason still unknown) n2 > stonith_admin -L -> 'fence_1' n2 > stonith_admin -U fence_1 timed out n2 > stonith_admin -L -> 'no devices found' crm

Re: [Pacemaker] after `corosync stop` on master, drbd:master moves to peer, but all other resource stopped

2013-07-08 Thread andreas graeper
cant remember if at all and where i put that logs next time .. thanks 2013/7/9 Andrew Beekhof > > On 19/06/2013, at 3:03 AM, andreas graeper > wrote: > > > hi, > > i stopped n1 (drbd:master + all managed resources) > > the n2 became drbd:master but all resources

Re: [Pacemaker] crond on both nodes (active/passive) but some jobs on active only

2013-07-05 Thread andreas graeper
thanks a lot ! 2013/7/5 Lars Ellenberg > On Fri, Jul 05, 2013 at 04:52:35PM +0200, andreas graeper wrote: > > when i wrote a script handled by ocf:heartbeat:anything i.e. that is > > signalling the cron-daemon to reload crontabs > > when crontab file is enabled by symlink:

Re: [Pacemaker] crond on both nodes (active/passive) but some jobs on active only

2013-07-05 Thread andreas graeper
then R2 this implicitly means R1:start , R2:start and R2:stop, R1:stop ? thanks in advance andreas 2013/7/5 andreas graeper > hi, > two nodes active/passive and fetchmail as cronjob shall run on active only. > > i use ocf:heartbeat:symlink to move / rename > > /etc/cron.d/j

[Pacemaker] crond on both nodes (active/passive) but some jobs on active only

2013-07-05 Thread andreas graeper
hi, two nodes active/passive and fetchmail as cronjob shall run on active only. i use ocf:heartbeat:symlink to move / rename /etc/cron.d/jobs <> /etc/cron.d/jobs.disable i read somewhere that crond ignores files with a dot. but new experience: crond needs to be restarted or signalled. how this is done be

Re: [Pacemaker] changing cluster-ip

2013-07-04 Thread andreas graeper
pcs resource update p_clusterip ip=a.b.c.d OR crm resource edit p_clusterip ( vi .. :wq ) should work ?! 2013/7/4 Andrew Beekhof > > On 04/07/2013, at 8:37 PM, Leon Fauster > wrote: > > > On 04.07.2013 at 12:02, andreas graeper > wrote: > >> > >>

Re: [Pacemaker] changing cluster-ip

2013-07-04 Thread andreas graeper
/7/4 Leon Fauster > On 04.07.2013 at 12:02, andreas graeper wrote: > > > > i tried to change the IPaddr2 parameter ip > > > > 1) crm resource edit > > > > 2) pcs resource update = > > > > in both cases the cib is modified (`crm configure s

[Pacemaker] changing cluster-ip

2013-07-04 Thread andreas graeper
hi i tried to change the IPaddr2 parameter ip 1) crm resource edit (vi .. ) 2) pcs resource update = in both cases the cib is modified (`crm configure show` shows) but old ip only is pingable and all depending resources are stopped crm resource stop p_clusterip # does not stop that resourc

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread andreas graeper
* > > On Wednesday, 26 June 2013, 16:36:43, andreas graeper wrote: > > > hi and thanks. > > > a primitive can be moved to another node. how can i move (change roles) > of > > > drbd:master to the other node ? > > > > Switch off the other node.

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread andreas graeper
did not start. but the problem was not drbd, cause i could manually set it to primary. thanks in advance andreas 2013/6/25 Digimer > On 06/25/2013 07:29 AM, andreas graeper wrote: > > hi, > > maybe again and again the same question, please excuse. > > > > two nodes (

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread andreas graeper
-name? is not lisel1 but this node (n2) is 'lisel1' i have deleted that location-constraint and corosync-stop+start again. did not help. 2013/6/25 andreas graeper > hi, > maybe again and again the same question, please excuse. > > two nodes (n1 active / n2 passive) and `ser

[Pacemaker] corosync stop and consequences

2013-06-25 Thread andreas graeper
hi, maybe again and again the same question, please excuse. two nodes (n1 active / n2 passive) and `service corosync stop` on active. does the node, that is going down, tells the other that he has gone, before he actually disconnect ? so that there is no reason for n2 to kill n1 ? on n2 after n1.

[Pacemaker] output crm_mon

2013-06-24 Thread andreas graeper
hi, crm_mon -rA1 shows : ClusterIP(ocf::heartbeat:IPaddr2):Started lisel1 Master/Slave Set: ms_drbd_r0 [p_drbd_r0] Masters: [ lisel1 ] Slaves: [ lisel2 ] p_lvm_r0(ocf::heartbeat:LVM):Started lisel1 workFS(ocf::heartbeat:Filesystem):Started lisel1 p_samba(l

[Pacemaker] drbd on passive node not started

2013-06-21 Thread andreas graeper
hi, n1 active node is started and everything works fine, but after reboot n2 drbd is not started by pacemaker. when i start drbd manually, crm_mon shows it as slave ( as if there were no problems). maybe someone experienced can have a look into logs ? thanks in advance andreas log.xz Descriptio

Re: [Pacemaker] known problem with corosync 1.4.1 on centos64 ?

2013-06-21 Thread andreas graeper
/ colocation ) resources, too ? unfortunately (fortunately ?!) the situation i could check this has gone. thanks andreas 2013/6/21 andreas graeper > hi, > > old version : > i shall maintain a centos63 with, except drbd (build from source), only > standard-repos are used. > for t

Re: [Pacemaker] known problem with corosync 1.4.1 on centos64 ?

2013-06-21 Thread andreas graeper
use with drbd+corosync+pacemaker. 2013/6/21 Lars Marowsky-Bree > On 2013-06-21T10:56:29, andreas graeper wrote: > > > hi, > > when only i remove or add resources, corosync starts to eat up all cpu. > > drbd 8.4.1 (build from source) > > corosync 1.4.1 > > ye

Re: [Pacemaker] re-manage a resource

2013-06-21 Thread andreas graeper
i found corosync still running after it stopped printing dots when i called `service corosync stop` a new call succeeded and all pacemaker-services finished, too. whats going on ?! 2013/6/21 andreas graeper > (in addition) > > i tried > pcs resource start ms_drbd # rc=0

Re: [Pacemaker] re-manage a resource

2013-06-21 Thread andreas graeper
ice, but not in this situation. thanks andreas 2013/6/21 andreas graeper > maybe i asked this before, but i could not find message + answer. > > when a resource gets unmanaged and the problems has gone, i want the > resource get managed by pacemaker again. what is to do ? > > s

[Pacemaker] re-manage a resource

2013-06-21 Thread andreas graeper
maybe i asked this before, but i could not find message + answer. when a resource gets unmanaged and the problem has gone, i want the resource to get managed by pacemaker again. what is to do ? situation: only one node left (other ill) drbd could not get promoted now it is standalone,primary,uptod

[Pacemaker] known problem with corosync 1.4.1 on centos64 ?

2013-06-21 Thread andreas graeper
hi, when only i remove or add resources, corosync starts to eat up all cpu. drbd 8.4.1 (build from source) corosync 1.4.1 pacemaker 1.1.8 crmsh 1.2.5 (this from extra repo, cause crm is missing in pacemaker-cli ?! but it is not reason for trouble ! i use pcs except crm_mon ) pcs 0.9.26 when pc

[Pacemaker] after `corosync stop` on master, drbd:master moves to peer, but all other resource stopped

2013-06-18 Thread andreas graeper
hi, i stopped n1 (drbd:master + all managed resources) the n2 became drbd:master but all resources stopped. i started n1.corosync again and stopped once more and then n2 took over everything as expected. in log i found lots of messages ~ signing into stonith-ng failed (similar) logs are saved, m

[Pacemaker] add / rmv resources

2013-06-18 Thread andreas graeper
hi, i use s_xxx.sh and k_xxx.sh scripts to create / remove resources xxx when after removing a resource, i call crm_mon, i can see lots of resources are stopped. little later they are started again. does a pcs resource stop xxx pcs resource delete xxx actually stops all resources ? and if, is

Re: [Pacemaker] how to get rid of an unmanaged orphaned resource

2013-06-18 Thread andreas graeper
the cleanup is doing the work. it need to be done on both (all) nodes ?! 2013/6/18 andreas graeper > thanks a lot ! > i just this minute came into such situation again. > > > 2013/6/18 Dejan Muhamedagic > >> Hi, >> >> On Tue, Jun 18, 2013 at 11:22:17AM +

Re: [Pacemaker] how to get rid of an unmanaged orphaned resource

2013-06-18 Thread andreas graeper
thanks a lot ! i just this minute came into such situation again. 2013/6/18 Dejan Muhamedagic > Hi, > > On Tue, Jun 18, 2013 at 11:22:17AM +0200, andreas graeper wrote: > > hi, > > i created a resource lsb:samba but the service script is 'smb' > > so i got

Re: [Pacemaker] resource removed (stop + delete) but still in cib-status

2013-06-18 Thread andreas graeper
hi, pacemaker is started as plugin from corosync but when i `service corosync stop` there are still pacemaker/lrmd pacemaker/pengine is this a problem / error ? thanks andreas 2013/6/18 andreas graeper > hi, > i started a resource lsb:samba what should have been lsb:smb >

[Pacemaker] resource removed (stop + delete) but still in cib-status

2013-06-18 Thread andreas graeper
hi, i started a resource lsb:samba what should have been lsb:smb now it is reported as an orphaned child i removed that resource but a look into cib shows after killing corosync and restart anyhow its still there. i killed corosync and pacemaker/lrmd

[Pacemaker] how to get rid of an unmanaged orphaned resource

2013-06-18 Thread andreas graeper
hi, i created a resource lsb:samba but the service script is 'smb' so i got an unmanaged orphaned resource what i cannot simply delete. corosync stop does not work, too. thanks andreas ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.c

[Pacemaker] drbd-fence-by-handler

2013-06-17 Thread andreas graeper
hi, whats this ? thanks andreas

Re: [Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
2013/6/17 andreas graeper > hi, > i tried as i found in tutorial to kill -9 corosync on active node (n1). > but the other node (n2) > failed to demote drbd. after corosync start on n1, n2:drbd was left > unmanaged. > but /proc/drbd on both nodes looked good: connected and up

[Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
hi, i tried as i found in tutorial to kill -9 corosync on active node (n1). but the other node (n2) failed to demote drbd. after corosync start on n1, n2:drbd was left unmanaged. but /proc/drbd on both nodes looked good: connected and uptodate. how in such situation a resource can get managed agai

Re: [Pacemaker] corosync suddenly eats up 100% cpu

2013-06-13 Thread andreas graeper
hi, after killing corosync on n2 (former master) all resources came up on n1 on n2 the whole file-system is readonly please help andreas 2013/6/13 andreas graeper > this is not the first time this happens. > > i have two nodes. nothing more than drbd+ipaddr+filesystem+nfs > and i p

[Pacemaker] corosync suddenly eats up 100% cpu

2013-06-13 Thread andreas graeper
this is not the first time this happens. i have two nodes. nothing more than drbd+ipaddr+filesystem+nfs and i played around with crm_attributes to get op-defaults. to find out why p_nfs_monitor_6 ( .. status=complete) not running and i tried to remove an old failed-action-report from crm_mon `

[Pacemaker] unmanaged resource

2013-06-13 Thread andreas graeper
hi, i use ocf:heartbeat to nfs-export the mounted /dev/drbd0 on drbd:master node. n1:master n2:slave n1 -> standby n2 takes over (well done) n1 reboot n1 online n2 standby now exportfs still started on n2 (unmanaged) FAILED what does started+unmanaged+failed mean in detail and how can i + get tha

Re: [Pacemaker] how do i remove a resource correct ?

2013-06-11 Thread andreas graeper
thanks! thanks! 2013/6/11 Michael Schwartzkopff > ** > > On Tuesday, 11 June 2013, 13:39:42, andreas graeper wrote: > > > hi, > > > what i do this moment: > > > > > > # installing > > > crm < > > configure primitive xxx

[Pacemaker] /var/lib/pacemaker/cores/root does not exist

2013-06-11 Thread andreas graeper
hi, when crm node online|standby i get this error message : Cannot change active directory to /var/lib/pacemaker/cores/root: No such file or directory (2) ??? thanks andreas

Re: [Pacemaker] how do i remove a resource correct ?

2013-06-11 Thread andreas graeper
? 2013/6/11 Lars Marowsky-Bree > On 2013-06-11T13:39:42, andreas graeper wrote: > > > # removing > > cibadmin --delete --xml-text '' > > cibadmin --delete --xml-text '' > > Why not crm configure delete? > > > i.e. ocf:heartbeat:symlink

[Pacemaker] how do i remove a resource correct ?

2013-06-11 Thread andreas graeper
hi, what i do this moment: # installing crm <' cibadmin --delete --xml-text '' i.e. ocf:heartbeat:symlink (after installing 'failed actions: .. not installed' on passive node) when i remove the symlink-resource crm_mon tells about an orphaned child on active node ? thanks andreas __

Re: [Pacemaker] failed actions after resource creation

2013-06-11 Thread andreas graeper
417, rc=5, status=complete): not installed 1) the link gets created 2) the target cannot exist on drbd:slave, cause it is on /dev/drbd0 whats wrong ? what kind of check fails ? thanks andreas 2013/6/7 Andrew Beekhof > > On 07/06/2013, at 2:52 AM, andreas graeper > wrote: >

Re: [Pacemaker] corosync does not start

2013-06-10 Thread andreas graeper
; What do you have in /etc/security/limits.conf ? > > Thanks > > > 2013/6/10 andreas graeper > >> hi, >> Jun 10 15:09:06 n1 corosync[2785]: [MAIN ] Could not set SCHED_RR at >> priority 99: Operation not permitted (1) >> Jun 10 15:09:06 n1 corosync[2785]

Re: [Pacemaker] corosync does not start

2013-06-10 Thread andreas graeper
hi, service corosync start as root 2013/6/10 emmanuel segura > Hello Andreas > > How do you start the cluster ? > > Thanks > > > 2013/6/10 andreas graeper > >> hi, >> Jun 10 15:09:06 n1 corosync[2785]: [MAIN ] Could not set SCHED_RR at >

[Pacemaker] corosync does not start

2013-06-10 Thread andreas graeper
hi, Jun 10 15:09:06 n1 corosync[2785]: [MAIN ] Could not set SCHED_RR at priority 99: Operation not permitted (1) Jun 10 15:09:06 n1 corosync[2785]: [MAIN ] Could not lock memory of service to avoid page faults: Cannot allocate memory (12) Jun 10 15:09:06 n1 corosync[2785]: [MAIN ] Corosyn

Re: [Pacemaker] failed actions after resource creation

2013-06-06 Thread andreas graeper
06, 2013 at 02:28:45PM +0200, andreas graeper wrote: > > hi, > > in examples using crmsh to create a resource with constraints is an > > interactive mode is used > > > > crm configure edit > > > primitive B > > > order B_after_A

Re: [Pacemaker] failed actions after resource creation

2013-06-06 Thread andreas graeper
ng ? is it calling `service xxx status` ? what does the monitor expect on node where service is running / not running ? thanks in advance andreas 2013/6/6 Florian Crouzat > On 06/06/2013 15:49, andreas graeper wrote: > > p_nfscommon_monitor_0 (node=linag, call=189, rc=5, >

[Pacemaker] exportfs monitor interval="10"

2013-06-06 Thread andreas graeper
hi, monitor sends every 10 seconds a message to syslog can i force only bad messages to be logged ? thanks andreas

Re: [Pacemaker] failed actions after resource creation

2013-06-06 Thread andreas graeper
linag' is Node1 (drbd:Slave) that cannot be cleaned up : crm_resource -P crm_resource --cleanup --resource p_nfscommon what can i do ? thanks in advance andreas 2013/6/6 andreas graeper > hi, > in examples using crmsh to create a resource with constraints is an > interactive m

[Pacemaker] failed actions after resource creation

2013-06-06 Thread andreas graeper
hi, in examples using crmsh to create a resource with constraints, an interactive mode is used: crm configure edit > primitive B > order B_after_A > colocation B_on_A > commit > end quit when resource B depends on another resource A that is running once on Node0 (or as master

Re: [Pacemaker] ownership of created symlink

2013-06-05 Thread andreas graeper
any reason to clone a nfs-server ? thanks in advance 2013/6/4 Lars Ellenberg > On Tue, Jun 04, 2013 at 07:15:11PM +0200, andreas graeper wrote: > > hi, > > i tried, before starting dovecot+exim+fetchmail > > > > to create a symlink > > /var/mail

[Pacemaker] ownership of created symlink

2013-06-04 Thread andreas graeper
hi, i tried, before starting dovecot+exim+fetchmail to create a symlink /var/mail -> /mnt/mirror/var/mail with ra ocf:heartbeat:symlink i changed target : chmod 0775 chown root.mail but i need write permission to /var/mail cause exim wants to create a lock file i tried to manually chown -h

[Pacemaker] different os on nodes

2013-06-04 Thread andreas graeper
hi, target-system are identical machines with centos, but i am playing on debian-lubuntu to learn/test. debian: nfs-common + nfs-kernel-server ubuntu: nfs is there a way to handle such differences. or is it simply a bad idea to connect different systems. thanks in advance andreas

[Pacemaker] lsb resource manager

2013-06-04 Thread andreas graeper
hi, i am on debian7. nfs-kernel-server is not started cause there are no exports pacemaker does not realize this, tells nfs-common and nfs-kernel-server are started, but ocf:heartbeat:exportfs monitor is in fault-list. i changed nfs-kernel-server init script: if there is /etc/exports.d/force then

[Pacemaker] newbie mistake

2013-05-27 Thread andreas graeper
before i heard of drbd/pacemaker/corosync i logged into a master and nfs-exported a path (what was done by resource-manager before) when i realized the mistake, i cleaned /etc/exports after `exportfs -ua` and (i guess) anyhow the roles switched (i`m not sure). but now crm_mon tells about error on s