Re: [Pacemaker] kernel.core_uses_pid and ulimit -c

2009-06-03 Thread Florian Haas
On 06/04/2009 08:42 AM, Andrew Beekhof wrote: > On Thu, Jun 4, 2009 at 8:40 AM, Florian Haas wrote: >> Andrew, Dejan et al., >> >> The TODO page at http://clusterlabs.org/wiki/TODO states that Pacemaker >> now automagically sets the kernel.core_uses_pid sysctl to ease >> debugging. Wouldn't it mak

[Pacemaker] pingd comments and metadata

2009-06-03 Thread Florian Haas
Andrew, Dejan, Dominik, I am by no means a pingd expert, but the current incarnation in stable-1.0 seems to have some outdated and misleading comments and metadata. Examples: The list of ping nodes to count. Defaults to all configured ping nodes. Rarely needs to be specified. Host list D

Re: [Pacemaker] kernel.core_uses_pid and ulimit -c

2009-06-03 Thread Andrew Beekhof
On Thu, Jun 4, 2009 at 8:40 AM, Florian Haas wrote: > Andrew, Dejan et al., > > The TODO page at http://clusterlabs.org/wiki/TODO states that Pacemaker > now automagically sets the kernel.core_uses_pid sysctl to ease > debugging. Wouldn't it make sense to check the currently set "ulimit -c" > too,

[Pacemaker] kernel.core_uses_pid and ulimit -c

2009-06-03 Thread Florian Haas
Andrew, Dejan et al., The TODO page at http://clusterlabs.org/wiki/TODO states that Pacemaker now automagically sets the kernel.core_uses_pid sysctl to ease debugging. Wouldn't it make sense to check the currently set "ulimit -c" too, and at least issue a warning message on startup if that ulimit
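A minimal sketch of the kind of startup check proposed in this message, written in Python for illustration rather than in Pacemaker's own code. It is not Pacemaker's implementation: the function names `core_limit_warning` and `current_core_limit` and the wording of the warning are hypothetical. It reads the soft core file size limit (the value `ulimit -c` reports in a shell) through the standard `resource` module and returns a warning when that limit is 0.

```python
import resource


def core_limit_warning(soft_limit):
    """Return a warning string if the core size limit is 0, else None.

    The message text is illustrative only, not a real Pacemaker log line.
    """
    if soft_limit == 0:
        return "WARN: core file size limit is 0; core files are likely disabled"
    return None


def current_core_limit():
    """Return the soft RLIMIT_CORE of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft


if __name__ == "__main__":
    message = core_limit_warning(current_core_limit())
    if message:
        print(message)
```

Keeping the decision in a pure function separates the policy (warn on 0) from the system call, so the check can be exercised without changing the process limits.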

Re: [Pacemaker] Error messages

2009-06-03 Thread Andrew Beekhof
It's fixed for 1.0.4, which should be out later today (depending on when my power gets turned back on) On Wed, Jun 3, 2009 at 5:40 PM, Infos E-Blokos wrote: > Hi Andrew, > > I saw on archive (may 4th) that > a guy had already a problem of error messages like this > > Jun 3 03:43:44 server-9 crmd: [

Re: [Pacemaker] Do we have a repository for config files?

2009-06-03 Thread Andrew Beekhof
On Wed, Jun 3, 2009 at 9:40 PM, Shaffin Bhanji wrote: > Hello, > > I am new to this list but do we have a repository of config files > (resources) that enable various HA capabilities yet? Do you mean the scripts or xml fragments that go in the cib? -- Andrew

Re: [Pacemaker] Stonith driver

2009-06-03 Thread Infos E-Blokos
Depends on what you want... - Original Message - From: acl1978 To: pacemaker@oss.clusterlabs.org Sent: Wednesday, June 03, 2009 5:11 PM Subject: [Pacemaker] Stonith driver Hi everybody, How can I know the correct stonith driver to use? Thanks, Alan -- Thi

[Pacemaker] Stonith driver

2009-06-03 Thread acl1978
Hi everybody, How can I know the correct stonith driver to use? Thanks, Alan ___ Pacemaker mailing list Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Re: [Pacemaker] reliable way to cib SEGFAULT -- how is cibadmin -Q --xpath supposed to work?

2009-06-03 Thread Andrew Beekhof
fixed: http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/cf478ed1269f sorry about that On Wed, Jun 3, 2009 at 6:10 PM, Lars Ellenberg wrote: > > current mercurial pacemaker stable-1.0 > > do > cibadmin -Q --xpath //@id > and watch your cib segfault: > WARN: Managed /usr/lib/heartbeat/cib pro

Re: [Pacemaker] Resource running on more than one node simultaneously

2009-06-03 Thread Andrew Beekhof
Check out clones. On Wed, Jun 3, 2009 at 8:46 PM, George Gomes wrote: > Guys, > >   Is it possible to have one resource running on more than one node at the > same time. I mean, heartbeat must control the execution of one resource > (bash script) on more than one node. > Example.: > - My cluster

[Pacemaker] Do we have a repository for config files?

2009-06-03 Thread Shaffin Bhanji
Hello, I am new to this list but do we have a repository of config files (resources) that enable various HA capabilities yet? Thanks, Shaffin.

[Pacemaker] Resource running on more than one node simultaneously

2009-06-03 Thread George Gomes
Guys, Is it possible to have one resource running on more than one node at the same time? I mean, heartbeat must control the execution of one resource (bash script) on more than one node. Example: - My cluster environment has two nodes A and B; - The heartbeat must assure that process X is runn

Re: [Pacemaker] reliable way to cib SEGFAULT -- how is cibadmin -Q --xpath supposed to work?

2009-06-03 Thread Andrew Beekhof
On Wed, Jun 3, 2009 at 6:10 PM, Lars Ellenberg wrote: > > current mercurial pacemaker stable-1.0 > > do > cibadmin -Q --xpath //@id > and watch your cib segfault: > WARN: Managed /usr/lib/heartbeat/cib process 15295 killed by signal 11 > [SIGSEGV - Segmentation violation] crap :-( try something

[Pacemaker] reliable way to cib SEGFAULT -- how is cibadmin -Q --xpath supposed to work?

2009-06-03 Thread Lars Ellenberg
current mercurial pacemaker stable-1.0 do cibadmin -Q --xpath //@id and watch your cib segfault: WARN: Managed /usr/lib/heartbeat/cib process 15295 killed by signal 11 [SIGSEGV - Segmentation violation] :) (gdb) bt #0 0x40280a8b in xmlAddChild () from /usr/lib/libxml2.so.2 #1 0x40031828 in a

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Infos E-Blokos
- Original Message - From: "Eliot Gable" To: "pacemaker@oss.clusterlabs.org" Sent: Wednesday, June 03, 2009 11:51 AM Subject: Re: [Pacemaker] cibadmin doesn't change cib.xml Try: crm_verify -V -x newcib.xml and make sure it verifies OK. Then do: cibadmin -R -o cib -x newcib.

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Eliot Gable
You can always check. Probably look at /var/lib/heartbeat and everything under it if you are using Heartbeat. If OpenAIS, not sure where to look. Eliot Gable

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Eliot Gable
Try: crm_verify -V -x newcib.xml and make sure it verifies OK. Then do: cibadmin -R -o cib -x newcib.xml After doing that, try: cibadmin -Q | less And check to see if it has the new CIB. If that doesn't work, post your CIB. Eliot Gable
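The three steps in this message (verify, replace, query) can be sketched as a small Python wrapper. This is only an illustration: the helper names `cib_replace_steps` and `run_steps` are hypothetical, the command lines are copied from the message as written, and it assumes `crm_verify` and `cibadmin` are on the PATH of a cluster node.

```python
import subprocess


def cib_replace_steps(xml_path):
    """Return the command lines quoted in the message, in order."""
    return [
        ["crm_verify", "-V", "-x", xml_path],             # check the new CIB file first
        ["cibadmin", "-R", "-o", "cib", "-x", xml_path],  # replace, exactly as quoted
        ["cibadmin", "-Q"],                               # query the CIB to confirm
    ]


def run_steps(steps):
    """Run each command in turn; check=True raises on a non-zero exit status."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

On a cluster node, `run_steps(cib_replace_steps("newcib.xml"))` would run the three commands in sequence and stop with an exception if verification fails, so a CIB that does not verify is never pushed.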

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Infos E-Blokos
- Original Message - From: "Dejan Muhamedagic" To: Sent: Wednesday, June 03, 2009 4:51 AM Subject: Re: [Pacemaker] cibadmin doesn't change cib.xml On Tue, Jun 02, 2009 at 07:32:37PM -0400, Infos E-Blokos wrote: Hi, when I try to modify cib.xml with "cibadmin -S -x newcib.xml" not

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Infos E-Blokos
- Original Message - From: "Dejan Muhamedagic" To: Sent: Wednesday, June 03, 2009 4:51 AM Subject: Re: [Pacemaker] cibadmin doesn't change cib.xml On Tue, Jun 02, 2009 at 07:32:37PM -0400, Infos E-Blokos wrote: Hi, when I try to modify cib.xml with "cibadmin -S -x newcib.xml" not

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Infos E-Blokos
- Original Message - From: "Dejan Muhamedagic" To: Sent: Wednesday, June 03, 2009 4:51 AM Subject: Re: [Pacemaker] cibadmin doesn't change cib.xml On Tue, Jun 02, 2009 at 07:32:37PM -0400, Infos E-Blokos wrote: Hi, when I try to modify cib.xml with "cibadmin -S -x newcib.xml" not

[Pacemaker] Error messages

2009-06-03 Thread Infos E-Blokos
Hi Andrew, I saw on archive (may 4th) that a guy had already a problem of error messages like this Jun 3 03:43:44 server-9 crmd: [3476]: ERROR: do_fsa_action: Action A_CL_JOIN_RESULT took 814606728s to complete Jun 3 03:43:44 server-9 crmd: [3476]: ERROR: do_fsa_action: Action A_LOG took 8146

Re: [Pacemaker] System Health backend part

2009-06-03 Thread Eliot Gable
I actually do start pingd on just one node and fail it over. It won't work on my slave node because the slave node does not have Internet access, only local cluster access. If it ran all the time on that node, it would always show Internet connectivity down. Thus, I must agree with Andrew: Pacem

Re: [Pacemaker] System Health backend part

2009-06-03 Thread Andrew Beekhof
On Tue, Jun 2, 2009 at 11:35 PM, Mark Hamzy wrote: > and...@beekhof.net wrote on 06/02/2009 16:46:55 PM: > >> Do you think this should live in pacemaker or with the RAs? >> I'm inclined to think the latter but am open to persuasion. > > Well, I think that these files do not fit within the Resource

Re: [Pacemaker] trigger STONITH for testing purposes

2009-06-03 Thread Andrew Beekhof
2009/6/3 Yan Gao : > On Wed, 2009-06-03 at 09:26 +0200, Andrew Beekhof wrote: >> 2009/6/3 Yan Gao : >> > Andrew, >> > If we execute crm_mon without "-r", the resources have ever been running >> > on the uncleanly offline node will be hidden. >> >> Even when stonith-enabled is set to true? > Yes, w

Re: [Pacemaker] cibadmin doesn't change cib.xml

2009-06-03 Thread Dejan Muhamedagic
On Tue, Jun 02, 2009 at 07:32:37PM -0400, Infos E-Blokos wrote: > Hi, > > when I try to modify cib.xml with "cibadmin -S -x newcib.xml" nothing is > changed. Do you get an error message? BTW, -S is not the right option, try -R. Thanks, Dejan > - 4 nodes > - Fedora 10 > - heartbeat 2.99-8 > -

Re: [Pacemaker] trigger STONITH for testing purposes

2009-06-03 Thread Yan Gao
On Wed, 2009-06-03 at 09:26 +0200, Andrew Beekhof wrote: > 2009/6/3 Yan Gao : > > On Fri, 2009-05-22 at 12:33 +0200, Andrew Beekhof wrote: > >> On Wed, May 20, 2009 at 6:39 PM, Bob Haxo wrote: > >> > Hi Andrew, > >> > > >> > I'd say you removed no-quorum-policy=ignore > >> > > >> > Actually, the p

Re: [Pacemaker] trigger STONITH for testing purposes

2009-06-03 Thread Andrew Beekhof
2009/6/3 Yan Gao : > On Fri, 2009-05-22 at 12:33 +0200, Andrew Beekhof wrote: >> On Wed, May 20, 2009 at 6:39 PM, Bob Haxo wrote: >> > Hi Andrew, >> > >> > I'd say you removed no-quorum-policy=ignore >> > >> > Actually, the pair of no_quorum_policy and no-quorum-policy are set to >> > "ignore", an