Re: [Pacemaker] Fw: new message

2016-05-03 Thread Digimer
er cluster resource manager > > > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pd

Re: [Pacemaker] Pacemaker license

2015-10-05 Thread Digimer
On 05/10/15 06:36 PM, santosh_bidara...@dell.com wrote: > *Dell - Internal Use - Confidential * I'll let the author and/or RH legal answer the overall question. My concern is that you posted to a public mailing list (and archive) with that header in your email... -- Digimer Pa

Re: [Pacemaker] Fencing gone bad, sometimes...

2015-09-10 Thread Digimer
te the logs from both nodes, starting just before the fence call is made until the logs stop updating. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___

Re: [Pacemaker] [HA] RFC: moving Pacemaker openstack-resource-agents to stackforge

2015-06-23 Thread Digimer
s on the move and defer CI work till later. > > Thoughts? > > Thanks! > Adam > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Hom

Re: [Pacemaker] [HA] RFC: moving Pacemaker openstack-resource-agents to stackforge

2015-06-23 Thread Digimer
d defer CI work till later. > > Thoughts? > > Thanks! > Adam > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Ge

Re: [Pacemaker] principal questions to a two-node cluster - Reply on ClusterLabs, not here (my bad)

2015-04-20 Thread Digimer
Ignore this, I re-sent to ClusterLabs. On 20/04/15 09:36 AM, Digimer wrote: > On 20/04/15 08:29 AM, Lentes, Bernd wrote: >> Hi, >> >> we'd like to create a two-node cluster for our services (web, database, >> virtual machines). We will have two servers and a s

Re: [Pacemaker] principal questions to a two-node cluster

2015-04-20 Thread Digimer
tter for migration; No 'stop' needed, live-migration causes no interruption). Recovery from a crashed/failed active; Fence the lost node -> Connect the LUN -> mount the FS -> start the services -> take a floating/virtual IP. To get into anything more specific, you will need to
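The recovery order listed in this reply can be sketched as a small Python routine. This is only an illustration under stated assumptions: the step actions are hypothetical placeholders (not Pacemaker APIs), and in a real cluster Pacemaker's own ordering constraints would do this work. The point it shows is that nothing after the fence may run unless the fence succeeded.

```python
from typing import Callable, List, Tuple

# Each step is (label, action). The actions are hypothetical stand-ins that
# only report success (True) or failure (False).
Step = Tuple[str, Callable[[], bool]]

# Order taken from the message: fence first, floating IP last.
RECOVERY_ORDER = [
    "fence the lost node",
    "connect the LUN",
    "mount the FS",
    "start the services",
    "take the floating/virtual IP",
]


def recover(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first failure.

    Returns the labels of the steps that completed.
    """
    done: List[str] = []
    for label, action in steps:
        if not action():
            break  # never continue past a failed step, e.g. a failed fence
        done.append(label)
    return done
```

If the first action fails, `recover` returns an empty list, so no storage is connected and no service is started while the state of the lost node is unknown.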

Re: [Pacemaker] Cluster with two STONITH devices

2015-04-08 Thread Digimer
pected. > > But with the IPMI STONITH resource started, I notice an erratic > behavior: > - Some times, the resources at the node cluster-a-1 are stopped and > no STONITH happens. Also, the resources are not moved to the node > cluster-a-2. In t

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-25 Thread Digimer
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 25/02/15 09:39 PM, Andrew Beekhof wrote: > >> On 26 Feb 2015, at 8:51 am, Digimer wrote: >> > On 25/02/15 04:45 PM, David Vossel wrote: >>>> >>>> >>>> - Original Message - >

Re: [Pacemaker] Suggestions for managing HA of containers from within a Pacemaker container?

2015-02-25 Thread Digimer
nt, but to simply assume so at this stage and to base the future development of HA software on that assumption seems ... risky. I think there is an argument for supporting containers, sure, but not to base plans on the assumption that they will become the be-all and end-all. -- Digimer Papers an

Re: [Pacemaker] postgresql never promoted

2015-02-20 Thread Digimer
start pri_vip (score:0) (non-symmetrical) Colocation Constraints: > pri_vip with ms_pgsql (score:INFINITY) (rsc-role:Started) > (with-rsc-role:Master) > > I am now out of ideas so any help is very much appreciated. > > Regards. > > > ___

Re: [Pacemaker] Two node cluster and no hardware device for stonith.

2015-02-05 Thread Digimer
automatic failover. digimer On 05/02/15 03:38 AM, Dmitry Koterov wrote: Could you please give a hint: how to use fencing in case the nodes are all in different geo-distributed datacenters? How people do that? Because there could be a network disconnection between datacenters, and we have no chance to

Re: [Pacemaker] HA Summit Key-signing Party

2015-02-02 Thread Digimer
inter available in the room/area of the summit? If so, it might be good to set aside a bit of time to help people new to PGP get setup before the actual key-signing. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without acce

Re: [Pacemaker] Two node cluster and no hardware device for stonith.

2015-02-02 Thread Digimer
list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for c

Re: [Pacemaker] [Linux-HA] [Planning] Organizing HA Summit 2015

2015-01-30 Thread Digimer
On 30/01/15 08:34 AM, Dejan Muhamedagic wrote: Hi Digimer, On Tue, Jan 13, 2015 at 12:31:22AM -0500, Digimer wrote: Hi all, With Fabio away for now, I (and others) are working on the final preparations for the summit. This is your chance to speak up and influence the planning! Objections

Re: [Pacemaker] HA Summit Key-signing Party

2015-01-26 Thread Digimer
On 26/01/15 09:14 AM, Jan Pokorný wrote: Hello cluster masters, On 13/01/15 00:31 -0500, Digimer wrote: Any concerns/comments/suggestions, please speak up ASAP! I'd like to throw a key-signing party as it will be a perfect opportunity to build a web of trust amongst us. If you ha

Re: [Pacemaker] Two node cluster and no hardware device for stonith.

2015-01-21 Thread Digimer
KVM/Xen systems, fence_vmware for VMWare, etc). -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___ Pacemaker mailing list: Pacemaker@oss.clusterla

Re: [Pacemaker] [Linux-HA] [ha-wg-technical] [RFC] Organizing HA Summit 2015

2015-01-13 Thread Digimer
Woohoo!! Will be very nice to see you. :) I've added you. Can you give me a short sentence to introduce yourself to people who haven't met you? Madi On 13/01/15 11:33 PM, Yusuke Iida wrote: Hi Digimer, I am Iida to participate from NTT along with Mori. I want you added to t

[Pacemaker] [Planning] Organizing HA Summit 2015

2015-01-12 Thread Digimer
). Those staying in Brno are more than welcome to join an informal dinner and drinks (and possibly some sight-seeing, etc) the evening of the 5th. Any concerns/comments/suggestions, please speak up ASAP! -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped

[Pacemaker] HA Summit 2015 - plan wiki closed for registration

2015-01-11 Thread Digimer
Spammers got through the captcha, *sigh*. If anyone wants to create an account to edit, please email me off-list and I'll get you setup ASAP. Sorry for the hassle. http://plan.alteeve.ca/index.php/Main_Page -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for canc

Re: [Pacemaker] Pacemaker + drbd + Cman Error: gfs_controld join connect error: Connection refused error mounting lockproto lock_dlm

2015-01-04 Thread Digimer
b-bootstrap-options" \ dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \ cluster-infrastructure="cman" \ expected-quorum-votes="2" \ stonith-enabled="false" \ no-quorum-policy="ignore" \ last-lrm-refresh=

Re: [Pacemaker] Pacemaker + drbd + Cman Error: gfs_controld join connect error: Connection refused error mounting lockproto lock_dlm

2015-01-02 Thread Digimer
_ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Digimer Papers and

Re: [Pacemaker] [ha-wg-technical] [RFC] Organizing HA Summit 2015

2014-12-22 Thread Digimer
It will be very nice to see you again! Will Ikeda-san be there as well? digimer On 22/12/14 03:35 AM, Keisuke MORI wrote: Hi all, Really late response but, I will be joining the HA summit, with a few colleagues from NTT. See you guys in Brno, Thanks, 2014-12-08 22:36 GMT+09:00 Jan Pokorný

Re: [Pacemaker] Where is Beekhof?

2014-12-14 Thread Digimer
le to deal with anything except the lowest of the low hanging fruit Things should return to normal in time to be disrupted by the cluster pow-wow in Brno. -- Andrew Fabio doesn't want to go down with the OS ship on his own? Prepare your liver... ;) -- Digimer Papers and Projects: https

Re: [Pacemaker] [ha-wg-technical] Wiki for planning created - Re: [RFC] Organizing HA Summit 2015

2014-11-29 Thread Digimer
On 27/11/14 11:52 AM, Digimer wrote: I just created a dedicated/fresh wiki for planning and organizing: http://plan.alteeve.ca/index.php/Main_Page Other than the domain, it has no association with any existing project, so it should be a neutral enough platform. Also, it's not own

Re: [Pacemaker] [Cluster-devel] Wiki for planning created - Re: [RFC] Organizing HA Summit 2015

2014-11-28 Thread Digimer
On 29/11/14 12:45 AM, Fabio M. Di Nitto wrote: On 11/28/2014 8:10 PM, Jan Pokorný wrote: On 28/11/14 00:37 -0500, Digimer wrote: On 28/11/14 12:33 AM, Fabio M. Di Nitto wrote: On 11/27/2014 5:52 PM, Digimer wrote: I just created a dedicated/fresh wiki for planning and organizing: http

Re: [Pacemaker] [Cluster-devel] Wiki for planning created - Re: [RFC] Organizing HA Summit 2015

2014-11-27 Thread Digimer
On 28/11/14 12:33 AM, Fabio M. Di Nitto wrote: On 11/27/2014 5:52 PM, Digimer wrote: I just created a dedicated/fresh wiki for planning and organizing: http://plan.alteeve.ca/index.php/Main_Page Other than the domain, it has no association with any existing project, so it should be a

[Pacemaker] Wiki for planning created - Re: [RFC] Organizing HA Summit 2015

2014-11-27 Thread Digimer
ng with captchas because, in my experience, spammers walk right through them anyway. I do have edits emailed to me, so I can catch and roll back any spam quickly. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person wit

Re: [Pacemaker] [ha-wg-technical] [ha-wg] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Digimer
, 'user', 'announce' list should be enough for all HA. Likewise, one IRC channel should be enough, too. The trick will be discussing this without bikeshedding. :) digimer -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapp

Re: [Pacemaker] [Cluster-devel] [ha-wg] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Digimer
y, you guys know my background, let me know if there is a topic you'd like me to cover for the user side of things. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? __

Re: [Pacemaker] Fencing of bare-metal remote nodes

2014-11-25 Thread Digimer
s on the surviving node after panicking the other node would also be helpful. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___ Pacemaker mailing list: P

Re: [Pacemaker] [ha-wg-technical] [Cluster-devel] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-25 Thread Digimer
lk on it with a demo. - containerisation of services (cgroups, docker, virt) - resource-agents (upstream releases, handling of pull requests, testing) User-facing topics could include recent features (ie. pacemaker-remoted, crm_resource --restart) and common deployment scenarios (eg. NFS) tha

Re: [Pacemaker] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-24 Thread Digimer
On 24/11/14 10:12 AM, Lars Marowsky-Bree wrote: Beijing, the US, Tasmania (OK, one crazy guy), various countries in Oh, bring him! crazy++ -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education

Re: [Pacemaker] [ha-wg] [RFC] Organizing HA Summit 2015

2014-11-24 Thread Digimer
ustify travel expenses & time off). Alternatively, would a relocation to a more connected venue help, such as Vienna xor Prague? Personally, I don't care where we meet, but I do believe Fabio already ruled out a relocation. I'd love to get some more feedback from the community.

Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-13 Thread Digimer
@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure

Re: [Pacemaker] drbd / libvirt / Pacemaker Cluster?

2014-11-13 Thread Digimer
ce-peer.sh unfence handler), tell DRBD to use the 'resource-and-stonith' fencing policy and things should start working predictably. If not, please reply back with log snippets. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind

Re: [Pacemaker] stonith q

2014-11-04 Thread Digimer
can come in, examine the system, reboot the node and unfence the node once it has rebooted, restoring network connections. I created a proof of concept fence agent doing this with D-Link switches: https://github.com/digimer/fence_dlink_snmp It should be easy enough to adapt to, say, call the

Re: [Pacemaker] stonith q

2014-11-04 Thread Digimer
's XML validation data. You can see an example of what this looks like by calling 'fence_ipmilan -o metadata' (or any other fence_* agent). For the record, I think this is a bad idea. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the m

Re: [Pacemaker] stonith q

2014-11-02 Thread Digimer
On 02/11/14 06:45 AM, Andrei Borzenkov wrote: On Sun, 2 Nov 2014 10:01:59 + Alex Samad - Yieldbroker wrote: -Original Message- From: Digimer [mailto:li...@alteeve.ca] Sent: Sunday, 2 November 2014 9:49 AM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker

Re: [Pacemaker] stonith q

2014-11-01 Thread Digimer
passwd="Initial1" op monitor interval=10s pcs cluster cib-push stonith_cfg pcs property set stonith-enabled=true Hope this helps. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education?

Re: [Pacemaker] [Linux-HA] [ha-wg] [RFC] Organizing HA Summit 2015

2014-10-31 Thread Digimer
. Cheers Fabio ___ Linux-HA mailing list linux...@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person wit

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-29 Thread Digimer
'm using is DELL POWEREDGE R320 This is how I configure IPMI on RHEL-based distros. Once you find the Ubuntu packages, the configuration process should be the same: https://alteeve.ca/w/AN!Cluster_Tutorial_2#What_is_IPMI -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cur

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-28 Thread Digimer
ase add it, test it and then hook DRBD into it. OS -> UBUNTU 12.04 (64 bits) DRBD -> 8.3.11 That is quite old. Can you update to 8.3.16? Also, what versions are pacemaker and corosync? Thanks for the quick reply On Tue, Oct 28, 2014 at 11:19 AM, Digimer <li...@alteeve.ca> wrot

Re: [Pacemaker] fencing with multiple node cluster

2014-10-28 Thread Digimer
ve.ca/w/AN!Cluster_Tutorial_2#A_Map.21 It's trivial to scale the idea up to multiple node clusters. Cheers -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? __

Re: [Pacemaker] 2 Node Clustering, when primary server goes down(shutdown) the secondary server restarts

2014-10-27 Thread Digimer
l -f -n 0 /var/log/messages', kill a node and wait for things to settle down. Share the log output here. Please also tell us your OS, pacemaker, drbd and corosync versions. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in

Re: [Pacemaker] split brain - after network recovery - resources can still be migrated

2014-10-26 Thread Digimer
On 26/10/14 12:32 PM, Andrei Borzenkov wrote: On Sun, 26 Oct 2014 12:01:03 +0100 Vladimir wrote: On Sat, 25 Oct 2014 19:11:02 -0400 Digimer wrote: On 25/10/14 06:35 PM, Vladimir wrote: On Sat, 25 Oct 2014 17:30:07 -0400 Digimer wrote: On 25/10/14 05:09 PM, Vladimir wrote: Hi

Re: [Pacemaker] split brain - after network recovery - resources can still be migrated

2014-10-25 Thread Digimer
On 25/10/14 06:35 PM, Vladimir wrote: On Sat, 25 Oct 2014 17:30:07 -0400 Digimer wrote: On 25/10/14 05:09 PM, Vladimir wrote: Hi, currently I'm testing a 2 node setup using Ubuntu Trusty. # The scenario: All communication links between the 2 nodes are cut off. This results in a

Re: [Pacemaker] split brain - after network recovery - resources can still be migrated

2014-10-25 Thread Digimer
hy stonith is required in clusters. Even with quorum, you can't assume anything about the state of the peer until it is fenced, so it would only give you a false sense of security. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a

Re: [Pacemaker] MySQL, Percona replication manager - split brain

2014-10-25 Thread Digimer
ate? Is it enough just to wait for failure, then - restart mysql by hand and clean row with dup index in slave db, and then run resource again? Or there is some automation for such cases? How are you sharing data? Can you give us a better understanding of your setup? -- Digimer Papers and Proje

Re: [Pacemaker] Y should pacemaker be started simultaneously.

2014-10-17 Thread Digimer
On 18/10/14 12:18 AM, Andrei Borzenkov wrote: On Mon, 06 Oct 2014 10:27:49 -0400 Digimer wrote: On 06/10/14 02:11 AM, Andrei Borzenkov wrote: On Mon, Oct 6, 2014 at 9:03 AM, Digimer wrote: If stonith was configured, after the time out, the first node would fence the second node ("unab

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Digimer
back" at all. In fact, cman (and rgmanager) are gone entirely and the stack has unified on corosync + pacemaker on all major distros now. I've got an incomplete history of all this here: https://alteeve.ca/w/History_of_HA_Clustering Cheers -- Digimer Papers and Projects: https:/

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Digimer
e is to s/83/84/: + yum install drbd84-utils kmod-drbd84 - yum install drbd83-utils kmod-drbd83 If you run into any troubles, please share details and I am sure we'll get you sorted out in no time. Cheers -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapp

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-15 Thread Digimer
ndles it in the main /etc/cluster/cluster.conf (cman's main config file). In any case, from then on, start pacemaker and let it handle everything else. Cheers digimer On 15/10/14 04:27 AM, Sihan Goi wrote: Hi, So I've decided to make things simpler and go with a wired netwo

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-14 Thread Digimer
/etc/sysconfig/network-scripts/ifcfg-lo -rw-r--r--. 1 root root 213 Mar 13 2013 /etc/sysconfig/network-scripts/ifcfg-vbr2 I've never seen an EL6 install without the files there, 'network' or NetworkManager aside. digimer On 14/10/14 11:32 PM, Sihan Goi wrote: There aren'

Re: [Pacemaker] Linux HA setup for CentOS 6.5

2014-10-14 Thread Digimer
"no" and then start it up with /etc/sysconfig/network start. I could be a bit wrong, but I am sure you can make wireless work without NM. Question; Servers with WLAN? I assume these won't be used for corosync? digimer On 14/10/14 11:17 PM, Sihan Goi wrote: Hi, Is there a tuto

Re: [Pacemaker] Y should pacemaker be started simultaneously.

2014-10-06 Thread Digimer
On 06/10/14 02:11 AM, Andrei Borzenkov wrote: On Mon, Oct 6, 2014 at 9:03 AM, Digimer wrote: If stonith was configured, after the time out, the first node would fence the second node ("unable to reach" != "off"). Alternatively, you can set corosync to 'wait_for_all'

Re: [Pacemaker] Y should pacemaker be started simultaneously.

2014-10-05 Thread Digimer
nnection to know for sure what it's doing. Alternatively, by fencing the peer on start after timing out, it can say for sure that the peer is off and then start services knowing it won't cause a split-brain. Of course, if you auto-start the cluster and don't use wait_for_all, you ri

Re: [Pacemaker] Managing DRBD Dual Primary with Pacemaker always initial Split Brains

2014-10-02 Thread Digimer
On 02/10/14 02:44 AM, Felix Zachlod wrote: I am currently running 8.4.5 on top of Debian Wheezy with Pacemaker 1.1.7 Please upgrade to 1.1.10+! -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education

Re: [Pacemaker] Managing DRBD Dual Primary with Pacemaker always initial Split Brains

2014-10-01 Thread Digimer
1" clone-max="2" clone-node-max="1" notify="true" target-role="Master" location l-drbd1 ms_drbd_testdata1 \ rule $id="l-drbd1-rule" 0: #uname eq storage-test-d or #uname eq storage-test-c Thanks for any hints in advance, Felix Please c

Re: [Pacemaker] Pacemaker on system with disk failure

2014-09-23 Thread Digimer
You don't have real fencing configured, by the looks of it. Without real, working fencing, recovery can be unpredictable. Can you set that up and see if the problem goes away? digimer On 23/09/14 09:59 AM, Carsten Otto wrote: On Tue, Sep 23, 2014 at 09:50:12AM -0400, Digimer wrote: Ca

Re: [Pacemaker] Pacemaker on system with disk failure

2014-09-23 Thread Digimer
Can you share your pacemaker and drbd configurations please? digimer On 23/09/14 09:39 AM, Carsten Otto wrote: Hello, I run Corosync + Pacemaker + DRBD in a two node cluster, where all resources are part of a group/colocated with DRBD (DRBD + virtual IP + filesystem + ...). To test my

Re: [Pacemaker] [RFC] Organizing HA Summit 2015

2014-09-22 Thread Digimer
/ How is this looking? -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterl

Re: [Pacemaker] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-09-09 Thread Digimer
rage was "Might have been nice to ask first". Really irritated was when you reported the list to spamcop. I'm not sure what history there is here, but I really want this meeting to go well and be productive. Can I play moderator and ask that we set old issues aside? *digimer

Re: [Pacemaker] [Linux-HA] [RFC] Organizing HA Summit 2015

2014-09-09 Thread Digimer
ed. I hardly fail to see that as an issue. Apologies not accepted ;) Fabio +1 to err'ing on the side of too much talk. :) -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to

Re: [Pacemaker] [RFC] Organizing HA Summit 2015

2014-09-08 Thread Digimer
elp simplify life for new users looking for help from or to wanting to join the HA community. I also understand that Fabio will buy the first round of drinks. >:) -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person

Re: [Pacemaker] no failover if fencing device is unreachable (i.e. power loss)

2014-08-18 Thread Digimer
Yes there is; stonith_admin --confirm= I know you will confirm this, but it needs to be stated how critical it is that you really have confirmed the node is off. digimer On 18/08/14 02:01 PM, Felix Schrage wrote: Thanks for the quick answer. I'll have a look at that. Is there a w

Re: [Pacemaker] no failover if fencing device is unreachable (i.e. power loss)

2014-08-18 Thread Digimer
le: /var/log/cluster/corosync.log debug: off timestamp: on logger_subsys { subsys: AMF debug: off } } amf { mode: disabled } service { ver:1 name: pacemaker } Any idea what could be missing/wrong? Kind regards, Felix ___

Re: [Pacemaker] Favor one node during stonith?

2014-08-13 Thread Digimer
On 13/08/14 11:37 AM, Andrey Borzenkov wrote: On Wed, 13 Aug 2014 08:56:58 -0400 Digimer wrote: On 13/08/14 08:37 AM, Andrey Borzenkov wrote: Hi, Sorry for what may be a basic question, but it is my first Linux HA project. I (will) have a two-node cluster in active/passive configuration - single

Re: [Pacemaker] Favor one node during stonith?

2014-08-13 Thread Digimer
tly powering off, thus further reducing the chance of a dual fence because now, even if the delay has failed, there is only a fraction of a second between the slower node being fenced and being disabled. hth digimer -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure

Re: [Pacemaker] Announcing 1.1.12 - Final

2014-07-21 Thread Digimer
Congrats! I know you've been busting your ass to get this out for some time now. :) digimer On 21/07/14 10:22 PM, Andrew Beekhof wrote: I am pleased to report that 1.1.12 is finally done. This is a really great release and includes three key improvements: - ACLs are now on by de

Re: [Pacemaker] Up-To-Date How To (Not Jaking "Clusters on Virtualized Platforms")

2014-07-18 Thread Digimer
some nice documentation as well. Kind Regards, Nick. You'll need one of the SUSE folks to pipe in on that. I'm super-conservative when it comes to HA and don't stray far from vendor-supported except when absolutely necessary... As you can probably tell. :P -- Digimer P

Re: [Pacemaker] Up-To-Date How To (Not Jaking "Clusters on Virtualized Platforms")

2014-07-18 Thread Digimer
g changes came with RHEL 7, I want to see how things settle before calling a tutorial "production ready" there. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without

Re: [Pacemaker] clusters on virtualised platforms

2014-07-17 Thread Digimer
he HA components. digimer On 17/07/14 11:36 PM, Nick Cameo wrote: "Instead, have the HA hypervisor layer protect the VM as a clustered service" I had to read this a couple of times Lars, and it's interesting. If I understand correctly run the cluster on bare metal, taking care of

Re: [Pacemaker] clusters on virtualised platforms

2014-07-16 Thread Digimer
On 17/07/14 02:39 PM, Alex Samad - Yieldbroker wrote: -Original Message- From: Digimer [mailto:li...@alteeve.ca] Sent: Thursday, 17 July 2014 3:00 PM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] clusters on virtualised platforms On 17/07/14 01:41 PM, Alex Samad

Re: [Pacemaker] clusters on virtualised platforms

2014-07-16 Thread Digimer
On 17/07/14 01:41 PM, Alex Samad - Yieldbroker wrote: -Original Message- From: Digimer [mailto:li...@alteeve.ca] Sent: Thursday, 17 July 2014 2:02 PM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] clusters on virtualised platforms Don't confuse quorum and fe

Re: [Pacemaker] clusters on virtualised platforms

2014-07-16 Thread Digimer
visor and force a power off. Generally speaking, VM-based cluster nodes are good for learning, but not production. It adds a layer that isn't needed and in HA, simple should trump all else. digimer On 17/07/14 12:48 PM, Alex Samad - Yieldbroker wrote: Hi I wonder if there Best practise

Re: [Pacemaker] Creating a safe cluster-node shutdown script (for when UPS goes OnBattery+LowBattery)

2014-07-04 Thread Digimer
k more time to shut down than the batteries could support. So half-way through, we withdrew one node and powered it off to shed load and gain battery runtime. This kind of logic can not reasonably be coded into a script. My $0.02. -- Digimer Papers and Projects: https://alteeve.ca/w/ What

Re: [Pacemaker] DRBD active/passive on Pacemaker+CMAN cluster unexpectedly performs STONITH when promoting

2014-07-03 Thread Digimer
, and hasn't been tested. It was meant to be an updated replacement for obliterate-peer.sh in cman+rgmanager clusters directly (no pacemaker). Cheers -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education?

Re: [Pacemaker] co-location of STONITH resources

2014-06-25 Thread Digimer
ss2 allocation score on lustre-mds2.ften.es.hpcn.uzh.ch: -INFINITY native_color: stonith-lustre-oss2 allocation score on lustre-oss1.ften.es.hpcn.uzh.ch: -INFINITY ... What are we doing wrong? Thanks, Riccardo -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for c

Re: [Pacemaker] Pacemaker Managed Service Not Started

2014-06-25 Thread Digimer
h I feel uneasy), but I postpone all that at least until all my services run at the places that I set them to run. Thank you. On 25/06/14 12:47, Digimer wrote: On 25/06/14 01:43 AM, Ariel S wrote: stonith-enabled="false" \ no-quorum-policy="ignore"

Re: [Pacemaker] What is the cman package for ubuntu 13.10

2014-06-25 Thread Digimer
ack. Thank you, Kostya On Wed, Jun 25, 2014 at 3:33 AM, Digimer <li...@alteeve.ca> wrote: I can't speak to the installation bits, I don't use Ubuntu/Debian myself, but once installed, this guide should apply: http://clusterlabs.org/doc/en-US/Pa

Re: [Pacemaker] Pacemaker Managed Service Not Started

2014-06-24 Thread Digimer
On 25/06/14 01:43 AM, Ariel S wrote: stonith-enabled="false" \ no-quorum-policy="ignore" \ This may or may not relate to your problem, but you *must* have stonith configured and tested, especially in 2-node clusters where you can't use quorum

Re: [Pacemaker] What is the cman package for ubuntu 13.10

2014-06-24 Thread Digimer
I can't speak to the installation bits, I don't use Ubuntu/Debian myself, but once installed, this guide should apply: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html-single/Clusters_from_Scratch/index.html Cheers On 24/06/14 07:40 PM, Vijay B wrote: Hi Digimer, Than

Re: [Pacemaker] What is the cman package for ubuntu 13.10

2014-06-24 Thread Digimer
ue dios quiera ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch

Re: [Pacemaker] configuration variants for 2 node cluster

2014-06-24 Thread Digimer
On 24/06/14 03:55 AM, Christine Caulfield wrote: On 23/06/14 15:49, Digimer wrote: Hi Kostya, I'm having a little trouble understanding your question, sorry. On boot, the node will not start anything, so after booting it, you log in, check that it can talk to the peer node (a s

Re: [Pacemaker] configuration variants for 2 node cluster

2014-06-23 Thread Digimer
On 23/06/14 12:50 PM, Kostiantyn Ponomarenko wrote: Digimer, I am using Debian as OS and Corosync + Pacemaker as cluster stack. I understand your suggestion. I don't have any questions about it. My main question is how to do it automatically? So that it could work without human interruptio

Re: [Pacemaker] configuration variants for 2 node cluster

2014-06-23 Thread Digimer
ck. I think, maybe, you are looking at things more complicated than you need to. Pacemaker and corosync will handle most of this for you, once setup properly. What operating system do you plan to use, and what cluster stack? I suspect it will be corosync + pacemaker, which should work fine. di

Re: [Pacemaker] configuration variants for 2 node cluster

2014-06-23 Thread Digimer
g node. The problem is when a node vanishes and fencing fails. Then, not knowing what the other node might be doing, the only safe option is to block, otherwise you risk a split-brain. This is why fencing is so important. Cheers -- Digimer Papers and Projects: https://alteeve.ca/w/ Wh
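The rule described above (fence the lost peer, and block if the fence fails) can be sketched as a small decision function. This is only an illustration under stated assumptions, not pacemaker's actual code: the names `Action` and `on_peer_state` are hypothetical, and the real cluster stack tracks far more state than two booleans.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue normally"
    RECOVER = "recover the peer's services"
    BLOCK = "block until the peer's state is known"


def on_peer_state(peer_reachable: bool, fence_succeeded: bool) -> Action:
    """Decide what a surviving node may safely do (hypothetical sketch)."""
    if peer_reachable:
        # Peer is still a healthy member; nothing to recover.
        return Action.CONTINUE
    if fence_succeeded:
        # Peer is confirmed off, so taking over its services is safe.
        return Action.RECOVER
    # Peer vanished and could not be fenced: it may still be running,
    # so acting now would risk a split-brain.
    return Action.BLOCK


print(on_peer_state(peer_reachable=False, fence_succeeded=False).value)
```

The point of the sketch is that there is no branch that recovers services while the peer's state is unknown.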

Re: [Pacemaker] How to put delay in fence_intelmodular for one node only

2014-06-21 Thread Digimer
: Hi Gianluca, I'm not sure of the CIB XML syntax, but here is how it's done using pcs: OK, thanks Digimer. It seems it worked this way using your suggestions [root@srvmgmt01 ~]# pcs stonith show Fencing(stonith:fence_intelmodular):Started # pcs cluster cib stonith_sepa

Re: [Pacemaker] How to put delay in fence_intelmodular for one node only

2014-06-21 Thread Digimer
th_cfg Note how the first (preferred node) has the delay="15" and the peer does not. This uses fence_ipmilan, but just swap it (and the attributes) for your fence method. digimer -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the
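Why the delay goes on only one node's fence device can be shown with a small simulation. This is a hedged sketch, not how the fence agents are implemented: `fence_race`, the node names, and the one-second fence duration are all assumptions made for illustration. It assumes both nodes lose contact at the same moment and each tries to fence the other, with the delay postponing the attack on the node whose fence device carries it.

```python
def fence_race(delay_on, fence_duration=1.0):
    """Return the surviving node when two nodes try to fence each other.

    delay_on maps a node name to the delay (in seconds) configured on the
    fence device that targets that node. All values are illustrative.
    """
    nodes = list(delay_on)
    assert len(nodes) == 2
    a, b = nodes
    # Time at which each node would be powered off by its peer.
    dies_at = {
        a: delay_on[a] + fence_duration,
        b: delay_on[b] + fence_duration,
    }
    if dies_at[a] == dies_at[b]:
        # Neither side has a head start, so both may be shot (dual fence).
        return None
    # The node that would be killed later survives, because its own
    # attack on the peer completes first.
    return max(dies_at, key=dies_at.get)


print(fence_race({"node1": 15, "node2": 0}))  # delay protects node1
print(fence_race({"node1": 0, "node2": 0}))   # no delay on either node
```

With the delay on the preferred node only, that node gets a head start and wins the race; with no delay on either side, both nodes can fence each other at once.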

Re: [Pacemaker] CMAN with multiple networks for HA communication

2014-06-19 Thread Digimer
Using raw IPs is not recommended. I remember Chrissie explaining the reason why a couple of years ago, but I can't remember that reasoning right now. As for multiple altnames, that is not possible. Corosync's RRP (redundant ring protocol) supports a primary and backup ring

Re: [Pacemaker] CMAN with multiple networks for HA communication

2014-06-19 Thread Digimer
at 9:20 PM, Digimer wrote: On 19/06/14 08:00 AM, Teerapatr Kittiratanachai wrote: Dear List, Since CMAN identifies each node in the cluster by hostname, I configured the IP addresses in the /etc/hosts file as below. ... 172.16.7.1 node00.example.com 172.16.7.2 node01.example.com ... So th

Re: [Pacemaker] CMAN with multiple networks for HA communication

2014-06-19 Thread Digimer
me resolve to the network you want to use as the backup link. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___ Pacemaker mailing list: Pacemaker

Re: [Pacemaker] Trying to figure out a constraint

2014-06-18 Thread Digimer
On 19/06/14 12:06 AM, Digimer wrote: After sending this, I found that adding: handlers { fence-peer "/usr/lib/drbd/crm-fence-peer.sh"; after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh"; } Allowed the constraint to be removed, so eventually node 2 (a

Re: [Pacemaker] Trying to figure out a constraint

2014-06-18 Thread Digimer
On 18/06/14 11:42 PM, Digimer wrote: On 18/06/14 12:47 AM, Andrew Beekhof wrote: On 18 Jun 2014, at 2:03 pm, Digimer wrote: Hi all, I am trying to set up a basic pacemaker 1.1.10 on RHEL 6.5 with DRBD 8.3.16. I've set up DRBD and configured one clustered LVM volume group using that

[Pacemaker] Trying to figure out a constraint

2014-06-17 Thread Digimer
Hi all, I am trying to set up a basic pacemaker 1.1.10 on RHEL 6.5 with DRBD 8.3.16. I've set up DRBD and configured one clustered LVM volume group using that drbd resource as the PV. With DRBD configured alone, I can stop/start pacemaker repeatedly without issue. However, when I add the

Re: [Pacemaker] pacemaker with CMAN on RHEL 6.5

2014-06-16 Thread Digimer
: From the previous output I saw this "Version: 1.1.10-5.el7-9abe687", so my question is "is cman used on redhat7?" 2014-06-16 18:10 GMT+02:00 Digimer : You need to set up a skeleton cluster.conf file. You can use the 'ccs' tool, here is an example (I use for my clu

Re: [Pacemaker] pacemaker with CMAN on RHEL 6.5

2014-06-16 Thread Digimer
man/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without

Re: [Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online

2014-06-09 Thread Digimer
cked until the fence succeeds. It will only resume when the peer is in a known state (off), thus avoiding split-brains entirely. And, as Andrew said, upgrade pacemaker. :) -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is

Re: [Pacemaker] Dual primary DRBD+OCFS2+XEN+Pacemaker failover issues

2014-06-05 Thread Digimer
On 06/06/14 12:16 AM, Digimer wrote: Many, many people use just stonith and it's enough for recovery in most failure cases. just IPMI-based* stonith... -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without acce
