On 11/14/2011 02:19 PM, ihjaz Mohamed wrote:
> nope. Am not using stonith.

Highly recommended -- and a must-have if shared storage is in use -- for
every Pacemaker cluster. Since IPMI is available on most current server
hardware, no extra effort besides the Pacemaker configuration is necessary.

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

>
> ------------------------------------------------------------------------
> *From:* Andreas Kurz <andr...@hastexo.com>
> *To:* pacemaker@oss.clusterlabs.org
> *Sent:* Monday, 14 November 2011 6:08 PM
> *Subject:* Re: [Pacemaker] killing corosync leaves crmd, stonithd, lrmd,
> cib and attrd to hog up the cpu
>
> On 11/14/2011 12:32 PM, ihjaz Mohamed wrote:
>> Hi All,
>>
>> As part of some robustness test for my cluster, I tried killing the
>> corosync process using kill -9 <pid>. After this I see that the
>> pacemakerd service is stopped but the processes crmd, stonithd, lrmd,
>> cib and attrd are still running and are hogging up the cpu.
>
> Then fix your stonith setup if you want a "robust" cluster setup .... of
> course you are using stonith, aren't you?
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
>>
>>
>> top - 06:26:51 up 2:01, 4 users, load average: 12.04, 12.01, 11.98
>> Tasks: 330 total, 13 running, 317 sleeping, 0 stopped, 0 zombie
>> Cpu(s): 7.1%us, 17.1%sy, 0.0%ni, 75.6%id, 0.1%wa, 0.0%hi, 0.0%si,
>> 0.0%st
>> Mem: 8015444k total, 4804412k used, 3211032k free, 54800k buffers
>> Swap: 10256376k total, 0k used, 10256376k free, 1604464k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
>>  2053 hacluste  RT   0 90492 3324 2476 R 100.0  0.0 113:40.61 crmd
>>  2047 root      RT   0 81480 2108 1712 R  99.8  0.0 113:40.43 stonithd
>>  2048 hacluste  RT   0 83404 5260 2992 R  99.8  0.1 113:40.90 cib
>>  2050 hacluste  RT   0 85896 2388 1952 R  99.8  0.0 113:40.43 attrd
>>  5018 root      20   0 8787m 345m  56m S   2.0  4.4   0:56.95 java
>> 19017 root      20   0 15068 1252  796 R   2.0  0.0   0:00.01 top
>>     1 root      20   0 19232 1444 1156 S   0.0  0.0   0:01.71 init
>>     2 root      20   0     0    0    0 S   0.0  0.0   0:00.00 kthreadd
>>     3 root      RT   0     0    0    0 S   0.0  0.0   0:00.00 migration/0
>>     4 root      20   0     0    0    0 S   0.0  0.0   0:00.00 ksoftirqd/0
>>
>>
>> Is there a way to cleanup these processes ? OR Do I need to kill them
>> one by one before respawning the corosync?
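
On the cleanup question quoted above, here is a minimal shell sketch of one
way to find and stop the orphaned daemons by hand. It is not a substitute
for fencing, and nothing in this thread confirms it is the recommended
procedure. The daemon names are taken from the original report; the
function names leftover_daemons and stop_leftover_daemons, the 5 second
grace period, and the assumption that ps and pkill (procps) are installed
are illustrative only.

```shell
#!/bin/sh
# Daemon names as listed in the original report.
PACEMAKER_DAEMONS="crmd stonithd lrmd cib attrd"

# Hypothetical helper: read command names from stdin (one per line, e.g.
# the output of `ps -e -o comm=`) and print every name that exactly
# matches one of the daemons above.
leftover_daemons() {
    names=$(cat)
    for d in $PACEMAKER_DAEMONS; do
        if printf '%s\n' "$names" | grep -qxF "$d"; then
            printf '%s\n' "$d"
        fi
    done
}

# Hypothetical helper, defined but never called here: send SIGTERM to the
# leftovers, wait an arbitrary 5 seconds, then SIGKILL whatever remains.
# Must be run as root.
stop_leftover_daemons() {
    for name in $(ps -e -o comm= | leftover_daemons); do
        pkill -TERM -x "$name"
    done
    sleep 5
    for name in $(ps -e -o comm= | leftover_daemons); do
        pkill -KILL -x "$name"
    done
}
```

Running `ps -e -o comm= | leftover_daemons` only lists what is still
running. Whether daemons spinning at RT priority react to SIGTERM is not
known from this thread, which is why the sketch falls back to SIGKILL.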
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker