Re: [Pacemaker] [Problem] The attrd does not sometimes stop.

2011-12-21 Thread renayama19661014
Hi Dejan, Hi Lars, In our environment, the problem recurred even with Lars's patch applied. After the problem occurred, I sent a TERM signal, but attrd does not seem to receive TERM at all. The patch needs to be reconsidered to solve the problem. Best Regards, Hideo Yamauchi. --- On

Re: [Pacemaker] lrmd segfault

2011-12-21 Thread ruslan usifov
Also, sometimes the backtrace looks like this:
#0 0x7f2b3b9e0464 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x7f2b3b9db5d9 in _L_lock_953 () from /lib/libpthread.so.0
#2 0x7f2b3b9db3fb in pthread_mutex_lock () from /lib/libpthread.so.0
#3 0x7f2b3c2d3cf6 in g_main_context_fin

[Pacemaker] lrmd segfault

2011-12-21 Thread ruslan usifov
Hello, I upgraded the cluster from Pacemaker 1.0.11 to Pacemaker 1.1.6, and sometimes lrmd segfaults on all nodes with the following backtrace:
#0 0xb77f3430 in __kernel_vsyscall ()
#1 0xb73e6af9 in __lll_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb73e213b in _L_lock_748 () from /lib/tl

Re: [Pacemaker] OCFS2 problems when connectivity lost

2011-12-21 Thread Reid, Mike
Ivan, Can you post your configuration? Do you have STONITH enabled? FWIW, I ran into similar issues in our Active/Active OCFS2/DRBD setup until we had functioning STONITH. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.or

Re: [Pacemaker] OCFS2 problems when connectivity lost

2011-12-21 Thread Florian Haas
2011/12/21 Ivan Savčić | Epix:
> Hello,
>
> We are having a problem with a 3-node cluster based on Pacemaker/Corosync
> with 2 primary DRBD+OCFS2 nodes and a quorum node.
>
> Nodes run on Debian Squeeze, all packages are from the stable branch except
> for Corosync (which is from backports for u

Re: [Pacemaker] OCFS2 problems when connectivity lost

2011-12-21 Thread Ivan Savčić | Epix
On 21.12.2011 13:07, Tim Serong wrote:
> My guess would be: The filesystem can't stop on the non-quorate node,
> because the network connection is down, so DLM can't do its thing.
Ok.
> The filesystem is probably frozen on the quorate node, because of loss
> of DLM comms.
Ok, same problem as abov

Re: [Pacemaker] OCFS2 problems when connectivity lost

2011-12-21 Thread Tim Serong
On 12/21/2011 09:47 PM, Ivan Savčić | Epix wrote: Hello, We are having a problem with a 3-node cluster based on Pacemaker/Corosync with 2 primary DRBD+OCFS2 nodes and a quorum node. Nodes run on Debian Squeeze, all packages are from the stable branch except for Corosync (which is from backport

[Pacemaker] OCFS2 problems when connectivity lost

2011-12-21 Thread Ivan Savčić | Epix
Hello, We are having a problem with a 3-node cluster based on Pacemaker/Corosync with 2 primary DRBD+OCFS2 nodes and a quorum node. Nodes run on Debian Squeeze, all packages are from the stable branch except for Corosync (which is from backports for udpu functionality). Each node has a sing

Re: [Pacemaker] Networking and routing issues with Active-Active

2011-12-21 Thread Arturo Borrero Gonzalez
Hi there! I'm working on two possible solutions for this. The first: in the primitive corresponding to the IPv4 address assigned to the loopback interface on the "non-doing-balancing" node, change the cidr_netmask parameter to "32". This way, the local route table of each node is modified and the machine w
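
A minimal sketch of why the prefix length matters, assuming (the message does not say) that the loopback address previously carried a wider mask such as /24. The address 192.0.2.10 and the helper name covered_addresses are hypothetical; the code uses only Python's standard ipaddress module and does not touch any real routing table.

```python
import ipaddress

def covered_addresses(ip, prefix_len):
    """Return the network implied by ip/prefix_len and how many addresses it spans."""
    iface = ipaddress.ip_interface(f"{ip}/{prefix_len}")
    return iface.network, iface.network.num_addresses

# Hypothetical service address, first with a wide mask, then with cidr_netmask=32.
wide_net, wide_count = covered_addresses("192.0.2.10", 24)
host_net, host_count = covered_addresses("192.0.2.10", 32)

# A neighbouring host on the same subnet (also a hypothetical address).
peer = ipaddress.ip_address("192.0.2.20")

print(wide_net, wide_count, peer in wide_net)
print(host_net, host_count, peer in host_net)
```

With the /32 prefix only the address itself belongs to the network implied by the loopback configuration, so traffic to other hosts in the subnet would not be matched by that entry. How the kernel actually builds its route tables from this is outside the scope of the sketch.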

Re: [Pacemaker] Feature request: cleanup resource on primitive definition change

2011-12-21 Thread Rasto Levrinc
On Wed, Dec 21, 2011 at 7:38 AM, Vladislav Bogdanov wrote:
> 21.12.2011 09:11, Rasto Levrinc wrote:
>> On Wed, Dec 21, 2011 at 5:24 AM, Vladislav Bogdanov wrote:
>>> 21.12.2011 06:21, Andrew Beekhof wrote:
>>>> On Tue, Dec 13, 2011 at 11:32 PM, Vladislav Bogdanov wrote:
>>>>> Hi Andrew,