Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Vladislav Bogdanov
26.03.2013 04:23, Dennis Jacobfeuerborn wrote: > I have now reduced the configuration further and removed LVM from the > picture. Still the cluster fails when I set the master node to standby. > What's interesting is that things get fixed when I issue a simple > "cleanup" for the filesystem resourc

Re: [Pacemaker] Improvement for the communication failure of booth

2013-03-25 Thread yusuke iida
Hi Jiaju, sorry for the slow reply. 2013/2/12 Jiaju Zhang : > Hi Yusuke, > > > Just look at the patch, it seems to me that it wanted to differentiate > every state like "init", "waiting promise", "promised", "waiting accept" > and "accepted", etc ... However I'm afraid in this way, it

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
I have now reduced the configuration further and removed LVM from the picture. Still the cluster fails when I set the master node to standby. What's interesting is that things get fixed when I issue a simple "cleanup" for the filesystem resource. This is what my current config looks like: node

Re: [Pacemaker] pacemaker node stuck offline

2013-03-25 Thread Andreas Kurz
On 2013-03-22 03:39, pacema...@feystorm.net wrote: > > On 03/21/2013 11:15 AM, Andreas Kurz wrote: >> On 2013-03-21 14:31, Patrick Hemmer wrote: >>> I've got a 2-node cluster where it seems last night one of the nodes >>> went offline, and I can't see any reason why. >>> >>> Attached are the logs

Re: [Pacemaker] issues when installing on pxe booted environment

2013-03-25 Thread Andreas Kurz
On 2013-03-22 19:31, John White wrote: > Hello Folks, > We're trying to get a corosync/pacemaker instance going on a 4 node > cluster that boots via pxe. There have been a number of state/file system > issues, but those appear to be *mostly* taken care of thus far. We're > running into a

Re: [Pacemaker] Resource is Too Active (on both nodes)

2013-03-25 Thread Andreas Kurz
On 2013-03-22 21:35, Mohica Jasha wrote: > Hey, > > I have two cluster nodes. > > I have a service process which is prone to crash and takes a very long > time to start. > Since the service process takes a long time to start I have the service > process running on both nodes, but only the active

Re: [Pacemaker] OCF Resource agent promote question

2013-03-25 Thread Andreas Kurz
Hi Steve, On 2013-03-25 18:44, Steven Bambling wrote: > All, > > I'm trying to work on an OCF resource agent that uses postgresql > streaming replication. I'm running into a few issues that I hope might > be answered or at least some pointers given to steer me in the right > direction. Why are y

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Jacek Konieczny
On Mon, 25 Mar 2013 20:01:28 +0100 "Angel L. Mateo" wrote: > >quorum { > > provider: corosync_votequorum > > expected_votes: 2 > > two_node: 1 > >} > > > >Corosync will then manage quorum for the two-node cluster and > >Pacemaker > > I'm using corosync 1.1 which is the one provided
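For readability, the quorum stanza quoted in the message above is reflowed here as it would appear in corosync.conf. This is only a sketch of the quoted lines, assuming corosync 2.x with the votequorum provider; the indentation and the comments are added for illustration and are not part of the original message. The reply indicates the poster runs an older corosync, so the stanza may not apply to that setup as-is.

```
# corosync.conf fragment (assumes corosync 2.x, votequorum provider)
quorum {
    provider: corosync_votequorum
    # Two nodes, one vote each
    expected_votes: 2
    # Special two-node mode: the cluster stays quorate when one node is lost.
    # Per votequorum(5), this also enables wait_for_all by default.
    two_node: 1
}
```

Two-node mode keeps quorum after a node failure but does not by itself stop both nodes from running resources during a network partition, so fencing (stonith) is still needed, which is the subject of the thread.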

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas, thank you for sharing this link and your start script! My goal is to make it possible to build those tools in a more convenient way, through NetBSD's pkgsrc system. Perhaps using something like --localstatedir=${VARBASE}/cluster for libqb, corosync and pacemaker, and setting the appropria

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo
Jacek Konieczny wrote: >On Mon, 25 Mar 2013 13:54:22 +0100 >> My problem is how to avoid split brain situation with this >> configuration, without configuring a 3rd node. I have read about >> quorum disks, external/sbd stonith plugin and other references, but >> I'm too confused with a

[Pacemaker] OCF Resource agent promote question

2013-03-25 Thread Steven Bambling
All, I'm trying to work on an OCF resource agent that uses postgresql streaming replication. I'm running into a few issues that I hope might be answered or at least some pointers given to steer me in the right direction. 1. A quick way of obtaining a list of "Online" nodes in the cluster that

Re: [Pacemaker] solaris problem

2013-03-25 Thread LGL Extern
Andrei There is no need to make this change. I described in http://grueni.github.com/libqb/ how I compiled libqb and the other programs. LOCALSTATEDIR should be defined with ./configure. Please look at "Compile Corosync" in my description. I guess your start scripts should be changed. We use

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Jacek Konieczny
On Mon, 25 Mar 2013 13:54:22 +0100 > My problem is how to avoid split brain situation with this > configuration, without configuring a 3rd node. I have read about > quorum disks, external/sbd stonith plugin and other references, but > I'm too confused with all this. > > For example, [

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Ok, I fixed this issue with the following patch against libqb 0.14.4: --- lib/unix.c.orig 2013-03-25 12:30:50.445762231 + +++ lib/unix.c 2013-03-25 12:49:59.322276376 + @@ -83,7 +83,7 @@ #if defined(QB_LINUX) || defined(QB_CYGWIN) snprintf(path, PATH_MAX, "/dev/shm/%

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread emmanuel segura
I have a production cluster, using two vm on esx cluster, for stonith i'm using sbd, everything work find 2013/3/25 Angel L. Mateo > Hello, > > I am newbie with pacemaker (and, generally, with ha clusters). I > have configured a two nodes cluster. Both nodes are virtual machines > (vmwar

Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread emmanuel segura
I have a production cluster, using two vm on esx cluster, for stonith i'm using sbd, everything work fine 2013/3/25 emmanuel segura > I have a production cluster, using two vm on esx cluster, for stonith i'm > using sbd, everything work find > > 2013/3/25 Angel L. Mateo > >> Hello, >> >>

[Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-25 Thread Angel L. Mateo
Hello, I am a newbie with pacemaker (and, generally, with HA clusters). I have configured a two-node cluster. Both nodes are virtual machines (vmware esx) and use shared storage (provided by a SAN, although access to the SAN is from the esx infrastructure and the VMs see it as a SCSI disk). I have

Re: [Pacemaker] racing crm commands... last write wins?

2013-03-25 Thread Dejan Muhamedagic
On Wed, Mar 20, 2013 at 10:40:10AM -0700, Bob Haxo wrote: > Regarding the replace triggering a DC election ... which is causing > issues with scripted installs ... how do I determine which crm commands > will NOT trigger this election? It seems like every "configure commit" could possibly result i

Re: [Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
I just found the following in the dmesg output which might or might not add to understanding the problem: device-mapper: table: 253:2: linear: dm-linear: Device lookup failed device-mapper: ioctl: error adding target to table Regards, Dennis On 25.03.2013 13:04, Dennis Jacobfeuerborn wrote:

[Pacemaker] DRBD+LVM+NFS problems

2013-03-25 Thread Dennis Jacobfeuerborn
Hi, I'm currently trying to create a two-node redundant NFS setup on CentOS 6.4 using pacemaker and crmsh. I use this document as a starting point: https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha_techguides/book_sleha_techguides.html The first issue is that using these instruction

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
I've rebuilt libqb using a separate SOCKETDIR (/var/run/qb), and set hacluster:haclient ownership on this dir. After that pacemakerd started successfully with all its children: [root@ha1 /var/run/qb]# pacemakerd -fV Could not establish pacemakerd connection: Connection refused (146) i

Re: [Pacemaker] Linking lib/cib and lib/pengine to each other?

2013-03-25 Thread Viacheslav Dubrovskyi
23.03.2013 08:27, Viacheslav Dubrovskyi wrote: > Hi. > > I'm building a package for my distribution. Everything is built, but the > package does not pass our internal tests. I get errors like this: > verify-elf: ERROR: ./usr/lib/libpe_status.so.4.1.0: undefined symbol: > get_object_root > > It mean

Re: [Pacemaker] CMAN, corosync & pacemaker

2013-03-25 Thread Lars Marowsky-Bree
On 2013-03-21T15:28:17, Leon Fauster wrote: > > I believe the preferred pacemaker based HA configuration in RHEL 6.4 uses > > all three packages and the preferred configuration in SLES11 SP2 is just > > corosync/pacemaker (I do not believe CMAN is even available in SLE-HAE). > > Why the differe

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas, just tried "PCMK_ipc_type=socket pacemaker -fV" - a bunch of additional "event_send" errors appeared: Mar 25 11:15:55 [33641] ha1 corosync error [MAIN ] event_send retuned -32, expected 256! Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 217!

Re: [Pacemaker] solaris problem

2013-03-25 Thread LGL Extern
With solaris/openindiana you should use this setting export PCMK_ipc_type=socket Andreas -Original Message- From: Andrei Belov [mailto:defana...@gmail.com] Sent: Monday, 25 March 2013 10:43 To: pacemaker@oss.clusterlabs.org Subject: [Pacemaker] solaris problem Hi folks, I'm

[Pacemaker] Patrik Rapposch is out of the office

2013-03-25 Thread Patrik . Rapposch
I will be out of the office starting 25.03.2013. I will return on 27.03.2013. Dear Sir or Madam, I am on a business trip until 27.03 inclusive. I will nevertheless try to answer your request as quickly as possible. Please always put ksi.network in copy. Please note, that I

[Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Hi folks, I'm trying to build a test HA cluster on Solaris 5.11 using libqb 0.14.4, corosync 2.3.0 and pacemaker 1.1.8, and I'm facing a strange problem while starting pacemaker. Log shows the following errors: Mar 25 09:21:26 [33720] lrmd:error: mainloop_add_ipc_server: Could not sta