Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-12-09 Thread Nikita Staroverov
Hello, Thank you for your answer. I have two DRBD devices, /dev/drbd1 and /dev/drbd2, and I use them as PVs for LVM, which has one Volume Group hosting all the VMs. So should I have as many DRBD devices as VMs and get rid of LVM altogether? PS. If it is not a secret, what are your recommended timeouts? Thank

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-12-09 Thread Michał Margula
On 09.12.2013 11:34, Nikita Staroverov wrote: So, what happens? :) Rivendell-B tried to stop XEN-acsystemy01 but could not, because the stop operation timed out. A failed stop operation is fatal by default and leads to STONITH. Rivendell-A caught this and fenced rivendell-B. You also have
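
The escalation described in this message (a failed stop leading to fencing) can be sketched as a small decision function. This is an illustrative model only, not Pacemaker code: the function name `recovery_action` and its parameters are hypothetical, and it assumes the documented `on-fail` defaults, where a failed stop fences the node if STONITH is enabled and blocks the resource otherwise, while other failed operations default to a restart.

```python
def recovery_action(operation: str, failed: bool, stonith_enabled: bool = True) -> str:
    """Return the assumed default recovery for a resource operation.

    Hypothetical helper: models the default on-fail behaviour only,
    ignoring any per-operation on-fail overrides in the cluster config.
    """
    if not failed:
        # Operation succeeded (no error, no timeout): nothing to recover.
        return "none"
    if operation == "stop":
        # The cluster cannot prove the resource is down, so it must
        # fence the node, or block the resource if fencing is disabled.
        return "fence" if stonith_enabled else "block"
    # Assumed default for start/monitor failures: restart the resource.
    return "restart"


if __name__ == "__main__":
    # A stop that times out counts as a failed stop.
    print(recovery_action("stop", failed=True))
    print(recovery_action("monitor", failed=True))
```

Under this model, raising the stop timeout for slow-stopping VMs reduces how often the "fence" branch is reached; it does not change the branch itself.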

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-12-09 Thread Nikita Staroverov
Hello, I still have not received any hints from you, and you are definitely my only hope before I switch to Proxmox or (even worse) some commercial stuff. Can you at least tell me if mode 4 could cause trouble with Corosync? Thanks! According to the logs you posted before, the reason was: Nov

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-12-07 Thread Digimer
On 07/12/13 05:12, Michał Margula wrote:
> On 25.11.2013 18:43, Michał Margula wrote:
>>
>> I use 802.3ad mode (so it is mode 4):
>>
>> auto bond0
>> iface bond0 inet static
>> slaves eth4 eth5
>> bond-mode 802.3ad
>> bond-lacp_rate fast
>> bond-miimon 100
>>

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-12-07 Thread Michał Margula
On 25.11.2013 18:43, Michał Margula wrote: I use 802.3ad mode (so it is mode 4):

auto bond0
iface bond0 inet static
    slaves eth4 eth5
    bond-mode 802.3ad
    bond-lacp_rate fast
    bond-miimon 100
    bond-downdelay 200
    bond-updelay 200
    addre

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-11-25 Thread Michał Margula
On 25.11.2013 18:25, Digimer wrote: I'd like to see the full logs, starting from a little before the issue started. Here are the logs from Nov 17 to Nov 24 (my pastebin is too small to handle them): Node A - https://www.dropbox.com/sh/dj08fbckj9zo104/Ew1QpdRq9A/A.log Node B - https://ww

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-11-25 Thread Digimer
On 25/11/13 10:39, Michał Margula wrote:
> On 25.11.2013 15:44, Digimer wrote:
>> My first thought is that the network is congested. That is a lot of
>> servers to have on the system. Do you or can you isolate the corosync
>> traffic from the drbd traffic?
>>
>> Personally, I always set up a ded

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-11-25 Thread Michał Margula
On 25.11.2013 15:44, Digimer wrote: My first thought is that the network is congested. That is a lot of servers to have on the system. Do you or can you isolate the corosync traffic from the drbd traffic? Personally, I always set up a dedicated network for corosync, another for drbd and a thi

Re: [Pacemaker] Pacemaker very often STONITHs other node

2013-11-25 Thread Digimer
On 25/11/13 06:40, Michał Margula wrote:
> Hello!
>
> I wanted to ask for your help because we are having much trouble with
> a cluster based on Pacemaker.
>
> We have two identical nodes - PowerEdge R510 with 2x Xeon X5650, 64 GB
> of RAM, MegaRAID SAS 2108 RAID (PERC H700) - system disk - RAID 1