Re: KVM HA fails under multiple management services

2019-06-24 Thread Andrija Panic
it does not solve 100% of the failure of KVM HA; because in extreme cases, the management server and the kvm host may fail at the same time (for example, the management server and the KVM HOST are placed in the same rack, and the RACK will fail at the same time after the

KVM HA fails under multiple management services

2019-06-23 Thread li jerry
Thank you Nicolas and Andrija. Even if indirect.agent.lb.algorithm is configured as roundrobin, that only reduces the probability of failure. It does not solve 100% of the KVM HA failures, because in extreme cases the management server and the KVM host may fail at the same time (for

Re: KVM HA fails under multiple management services

2019-06-23 Thread Nicolas Vazquez
Sunday, 23 June 11:03 Subject: Re: KVM HA fails under multiple management services To: users Cc: dev@cloudstack.apache.org Li, based on the Global Setting description for those 2, I would say that is the expected behaviour. i.e. change Indirect.agent.lb.check.interval to some other value,

Re: KVM HA fails under multiple management services

2019-06-23 Thread Andrija Panic
Li, based on the Global Setting description for those 2, I would say that is the expected behaviour. i.e. change Indirect.agent.lb.check.interval to some other value, since 0 means "don't check, don't reconnect" per what I read. Also, you might want to change from Indirect.agent.lb.algorithm=sta
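
To illustrate the point about the two settings, here is a minimal sketch in Python. It is a toy model and not CloudStack's implementation: the function names `preferred_order` and `should_recheck` are hypothetical, the rotation used for `roundrobin` is an assumed simplification, and the reading of `0` as "don't check, don't reconnect" is taken from the message above rather than from the code. The two addresses are those of M1 and M2 in the original report.

```python
from typing import List

# Addresses of the two management servers (M1, M2) from the original report.
MANAGEMENT_SERVERS = ["172.17.1.141", "172.17.1.142"]


def preferred_order(servers: List[str], host_index: int, algorithm: str) -> List[str]:
    """Return the order in which one agent would try the servers (toy model)."""
    if not servers:
        raise ValueError("at least one management server is required")
    if algorithm == "static":
        # Every host gets the list exactly as configured, so all hosts
        # prefer the first management server.
        return list(servers)
    if algorithm == "roundrobin":
        # Assumed simplification: rotate the list by the host's index so
        # that hosts are spread across the management servers.
        shift = host_index % len(servers)
        return servers[shift:] + servers[:shift]
    raise ValueError(f"unsupported algorithm in this sketch: {algorithm}")


def should_recheck(check_interval_seconds: int) -> bool:
    """Per the reading above, an interval of 0 disables the periodic check."""
    return check_interval_seconds > 0


if __name__ == "__main__":
    for algorithm in ("static", "roundrobin"):
        for host_index in range(3):  # three KVM hosts, as in the report
            order = preferred_order(MANAGEMENT_SERVERS, host_index, algorithm)
            print(algorithm, host_index, order)
    print("recheck with interval 0:", should_recheck(0))
```

In this model, `static` leaves all three hosts preferring M1, so losing M1 affects every agent at once, while `roundrobin` splits them between M1 and M2. As noted later in the thread, that only lowers the chance of a host and its management server failing together; it does not remove it.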

KVM HA fails under multiple management services

2019-06-21 Thread li jerry
Hello everyone, I recently tested multiple management services with agent LB and host HA (KVM). I found that in extreme cases HA fails; the details are as follows. Two management nodes, M1 (172.17.1.141) and M2 (172.17.1.142), share an external database cluster. Three KVM nodes,

Re: KVM HA BUG 4.11.1.0 centos 7

2018-08-13 Thread Thomas Heil
Hi, I've added more hosts and enabled HA on all of them. Now I shoot down node cs-hv-06, which is running r-199. Here are the logs I am getting. -- 2018-08-13 12:12:51,402 DEBUG [c.c.h.HighAvailabilityManagerImpl] (pool-5-thread-1:null) (logid:b71a09c7) Notifying HA Mgr of to restart vm 199-r-199-VM

KVM HA BUG 4.11.1.0 centos 7

2018-08-02 Thread Thomas Heil
Hi, I have a setup with one advanced zone, one cluster and two hosts. The hosts are KVM and use a single NFS storage for Primary and one for Secondary. Everything runs smoothly until I remove power from one host. In my honest opinion CloudStack should now declare the faulty host as dead, de

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-12 Thread serverchief
Github user serverchief commented on the issue: https://github.com/apache/cloudstack/pull/1960 Hi @koushik-das I believe you missed a discussion on this a while back - when this was initially proposed and we were gathering community feedback. please read this thread

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-11 Thread koushik-das
Github user koushik-das commented on the issue: https://github.com/apache/cloudstack/pull/1960 @abhinandanprateek Initially I also thought that this is about host HA. But after reading the FS I had doubts and asked about the definition of "host HA". If you refer to the discussion on d

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-10 Thread rhtyd
Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1960 @koushik-das sorry could not get back to you earlier as I was busy with other work. I've replied on the ML thread to address several queries [1] that lists the advantages of this host-ha framework

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-09 Thread abhinandanprateek
Github user abhinandanprateek commented on the issue: https://github.com/apache/cloudstack/pull/1960 @koushik-das I see that the main issue is that this is being confused with a VM HA framework. I would like to add again that this framework is not for VM-HA but for host HA. With this implementat

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-02 Thread koushik-das
Github user koushik-das commented on the issue: https://github.com/apache/cloudstack/pull/1960 There are open questions that I had asked in the dev@ list and haven't seen satisfactory answers to them. I am -1 on this feature till the need for a new VM HA framework is justified. ---

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-04-02 Thread rhtyd
Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1960 Travis failed due to failure in the test `test_ha_multiple_mgmt_server_owner...`, I'll have a look shortly --- If your project is set up for it, you can reply to this email and have your reply a

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-27 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @rhtyd there seems to be a conflict for this merge, I'm currently running tests and will keep you posted

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-27 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 Trillian test result (tid-963) Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7 Total time taken: 34953 seconds Marvin logs: https://github.com/blueoranguta

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-27 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 @borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-27 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @blueorangutan test

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-27 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 Packaging result: ✔centos6 ✔centos7 ✔debian. JID-600

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-26 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @blueorangutan package

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-26 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 @borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-26 Thread rhtyd
Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1960 @borisstoyanov copy that, done

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-03-23 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @rhtyd PRs 2003 and 2011 got merged, could you please rebase against master so it'll pick up the fixes. Once we package I'll continue with the verification on the physical hosts.

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread koushik-das
Github user koushik-das commented on the issue: https://github.com/apache/cloudstack/pull/1960 @abhinandanprateek If you refer to the discussion on dev@ https://goo.gl/cU8RuX, @rhtyd proposed it as a generic HA framework for any resources (and not limited to VM). Now if it just a repl

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread rhtyd
Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1960 @koushik-das I've shared a list of advantages of this work over existing framework on dev@ that explain why existing VM-HA framework cannot be used for host-ha implementation. If you've more quest

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread abhinandanprateek
Github user abhinandanprateek commented on the issue: https://github.com/apache/cloudstack/pull/1960 @koushik-das The current framework is specifically implemented for Host HA with KVM HA as the initial implementation. This framework is supposed to replace the framework that is

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread koushik-das
Github user koushik-das commented on the issue: https://github.com/apache/cloudstack/pull/1960 I have already raised some questions on dev@ on the need for a new HA framework when the existing HA framework can do all the things mentioned. The new framework only supports VM HA. If we s

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread rhtyd
Github user rhtyd commented on the issue: https://github.com/apache/cloudstack/pull/1960 Thanks @DaanHoogland wherever applicable I'll address the comments.

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-28 Thread DaanHoogland
Github user DaanHoogland commented on the issue: https://github.com/apache/cloudstack/pull/1960 I went through the code and found no real issues. For any other reviewers, I recommend reading the FS first. It is quite a big chunk but very neat. LGTM

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-23 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @rhtyd tests look good, except this one: ``` ERROR: Tests default ha providers list -- Traceback (most r

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 Trillian test result (tid-879) Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7 Total time taken: 28334 seconds Marvin logs: https://github.com/blueoranguta

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @blueorangutan test

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 @borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 Packaging result: ✔centos6 ✔centos7 ✔debian. JID-525

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread blueorangutan
Github user blueorangutan commented on the issue: https://github.com/apache/cloudstack/pull/1960 @borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

[GitHub] cloudstack issue #1960: [4.11/Future] CLOUDSTACK-9782: Host HA and KVM HA pr...

2017-02-22 Thread borisstoyanov
Github user borisstoyanov commented on the issue: https://github.com/apache/cloudstack/pull/1960 @blueorangutan package

[GitHub] cloudstack pull request #1960: CLOUDSTACK-9782: Host HA and KVM HA provider

2017-02-22 Thread rhtyd
GitHub user rhtyd opened a pull request: https://github.com/apache/cloudstack/pull/1960 CLOUDSTACK-9782: Host HA and KVM HA provider Host-HA offers investigation, fencing and recovery mechanisms for host that for any reason are malfunctioning. It uses Activity and Health checks

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/cloudstack/pull/1496

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-09 Thread swill
Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-218056068 Thanks for confirming @koushik-das. 👍 It is the middle of the night right now, so I am fading fast.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-09 Thread koushik-das
Github user koushik-das commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-218048251 @swill The failures don't look related to this PR. This can be merged.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-09 Thread swill
Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-217884184 The two failures are not ones I am used to seeing in my environment, but I did a quick once through of the code and I don't think the problem is related to this PR.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-08 Thread swill
Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-217772228 ### CI RESULTS ``` Tests Run: 88 Skipped: 2 Failed: 1 Errors: 1 Duration: 11h 25m 09s ``` **Summary of the

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-05-02 Thread rhtyd
Github user rhtyd commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-216228687 LGTM (code review) tag:mergeready

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-29 Thread jburwell
Github user jburwell commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-215890052 LGTM for code review

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-29 Thread jburwell
Github user jburwell commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1496#discussion_r61649158 --- Diff: server/src/com/cloud/ha/HighAvailabilityManagerImpl.java --- @@ -264,6 +265,11 @@ public void scheduleRestartForVmsOnHost(final HostVO host, b

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-29 Thread abhinandanprateek
Github user abhinandanprateek commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-215651932 @swill rebased it. Thank you.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-28 Thread swill
Github user swill commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-215552813 @abhinandanprateek please rebase as this currently has merge conflicts with master. Thanks...

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-27 Thread koushik-das
Github user koushik-das commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-215026662 Code changes LGTM, verified changes in HighAvailabilityManagerImpl.java. Someone else needs to verify KVM related changes.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-17 Thread alexandrelimassantana
Github user alexandrelimassantana commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1496#discussion_r60004052 --- Diff: server/src/com/cloud/ha/HighAvailabilityManagerImpl.java --- @@ -264,6 +265,11 @@ public void scheduleRestartForVmsOnHost(final Ho

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-17 Thread cristofolini
Github user cristofolini commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-211149034 @abhinandanprateek How about extracting lines 70-82 in `KVMInvestigator` to their own method? I think it would be nice to get a test case going for that sectio

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-15 Thread jburwell
Github user jburwell commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-210619709 @abhinandanprateek Jenkins failed due to no license header in the new test case, test_host_ha.py. Travis failed because it could not find the Marvin egg.

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-15 Thread abhinandanprateek
Github user abhinandanprateek commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-210422617 Marvin output with only Local storage: 49b-b201830aec39 VirtualMachine on Hyp 1 = 1d500ef3-ef56-4d11-91d8-c0e37f015236 VirtualMachine on Hy

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-15 Thread abhinandanprateek
Github user abhinandanprateek commented on the pull request: https://github.com/apache/cloudstack/pull/1496#issuecomment-210422584 Marvin test output for NFS PS: hine on Hyp 1 = 228dc462-ceab-4225-8d35-5749cbb593c2 VirtualMachine on Hyp 1 = 2caf838f-016f-4cf6-9d35-67a2e1398

[GitHub] cloudstack pull request: CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost ...

2016-04-15 Thread abhinandanprateek
GitHub user abhinandanprateek opened a pull request: https://github.com/apache/cloudstack/pull/1496 CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost for Local storage - KVM-HA- Fix CheckOnHost for Local storage - Also skip HA on VMs that are using local storage You can merge this

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-12-10 Thread John Burwell
ntage of this approach is a high cohesion of code support KVM integration and decouple the plugin from a specific system management interface type. Per the FS, the KVM HA provider would be defined with a cluster scope/partition and determine that hosts meeting the criteria as eligible for HA: *

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-10-20 Thread Ronald van Zantvoort
On 19/10/15 23:10, ilya wrote: Ronald, Please see response in-line... And you too :) On 10/19/15 2:18 AM, Ronald van Zantvoort wrote: On 16/10/15 00:21, ilya wrote: I noticed several attempts to address the issue with KVM HA in Jira and Dev ML. As we all know, there are many ways to solve

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-10-19 Thread ilya
Saw this message a bit later, i tried to break it down and respond.. On 10/19/15 2:24 AM, Ronald van Zantvoort wrote: > On 19/10/15 11:18, Ronald van Zantvoort wrote: >> On 16/10/15 00:21, ilya wrote: >>> I noticed several attempts to address the issue with KVM HA in Jira and

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-10-19 Thread ilya
Ronald, Please see response in-line... On 10/19/15 2:18 AM, Ronald van Zantvoort wrote: > On 16/10/15 00:21, ilya wrote: >> I noticed several attempts to address the issue with KVM HA in Jira and >> Dev ML. As we all know, there are many ways to solve the same problem, >>

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-10-19 Thread Ronald van Zantvoort
On 19/10/15 11:18, Ronald van Zantvoort wrote: On 16/10/15 00:21, ilya wrote: I noticed several attempts to address the issue with KVM HA in Jira and Dev ML. As we all know, there are many ways to solve the same problem, on our side, we've given it some thought as well - and it's on our

Re: [DISCUSS] KVM HA with IPMI Fencing

2015-10-19 Thread Ronald van Zantvoort
On 16/10/15 00:21, ilya wrote: I noticed several attempts to address the issue with KVM HA in Jira and Dev ML. As we all know, there are many ways to solve the same problem, on our side, we've given it some thought as well - and it's on our to-do list. Specifically a mail thread "

[DISCUSS] KVM HA with IPMI Fencing

2015-10-15 Thread ilya
I noticed several attempts to address the issue with KVM HA in Jira and Dev ML. As we all know, there are many ways to solve the same problem, on our side, we've given it some thought as well - and it's on our to-do list. Specifically a mail thread "KVM HA is broken, let's fix

Re: KVM HA is broken, let's fix it

2015-10-12 Thread Frank Louwers
> On 10 Oct 2015, at 12:35, Remi Bergsma wrote: > > Can you please explain what the issue is with KVM HA? In my tests, HA starts > all VMs just fine without the hypervisor coming back. At least that is on > current 4.6. Assuming a cluster of multiple nodes of course. It wi

Re: KVM HA is broken, let's fix it

2015-10-10 Thread Nux!
m: "Remi Bergsma" > To: dev@cloudstack.apache.org > Cc: "Cloudstack Users List" > Sent: Saturday, 10 October, 2015 11:35:36 > Subject: Re: KVM HA is broken, let's fix it > Hi Lucian, > > Can you please explain what the issue is with KVM HA? In my test

Re: KVM HA is broken, let's fix it

2015-10-10 Thread Remi Bergsma
Hi Lucian, Can you please explain what the issue is with KVM HA? In my tests, HA starts all VMs just fine without the hypervisor coming back. At least that is on current 4.6. Assuming a cluster of multiple nodes of course. It will then do a neighbor check from another host in the same cluster

KVM HA is broken, let's fix it

2015-10-09 Thread Nux!
Hello, Following a recent thread on the users ml where slow NFS caused a mass reboot, I have opened the following issue about improving HA on KVM. https://issues.apache.org/jira/browse/CLOUDSTACK-8943 I know there are many people around here who use KVM and are interested in a more robust way

Re: KVM HA

2015-05-03 Thread Nux!
This can be done by enabling HA in the service offering for an instance. -- Sent from the Delta quadrant using Borg technology! Nux! www.nux.ro - Original Message - > From: "Budur Nagaraju" > To: dev@cloudstack.apache.org > Sent: Thursday, 30 April, 2015 10:37:5

Re: KVM HA

2015-05-02 Thread Tilak Raj Singh
ations > Citrix Systems, Inc. > > > -Original Message- > From: Budur Nagaraju [mailto:nbud...@gmail.com] > Sent: Thursday, April 30, 2015 5:38 AM > To: dev@cloudstack.apache.org > Subject: KVM HA > > HI > New to cloud stack struggled searching for configurin

RE: KVM HA

2015-05-01 Thread Somesh Naidu
There may not be any specific KVM.HA configuration. What are you looking for? Somesh CloudPlatform Escalations Citrix Systems, Inc. -Original Message- From: Budur Nagaraju [mailto:nbud...@gmail.com] Sent: Thursday, April 30, 2015 5:38 AM To: dev@cloudstack.apache.org Subject: KVM HA

KVM HA

2015-04-30 Thread Budur Nagaraju
Hi, I am new to CloudStack and have struggled to find any documentation on configuring KVM HA. Any help with configuring KVM HA in CloudStack would be much appreciated. Thanks, Nagaraju

KVM HA tuning

2013-10-30 Thread Paul Angus
Hi Guys, I've been testing KVM HA in 4.2.1-SNAPSHOT and it seems to be working great. There seems to be a consistent 5 minute wait between killing a host and CloudStack marking the host and VMs as down and then carrying out HA actions. Is this delay customisable or is it hard coded some

RE: [DISCUSS] KVM HA

2013-08-07 Thread Edison Su
> -Original Message- > From: Marcus Sorensen [mailto:shadow...@gmail.com] > Sent: Wednesday, August 07, 2013 1:46 PM > To: dev@cloudstack.apache.org > Subject: Re: [DISCUSS] KVM HA > > I'm not sure we can rely on IPMI to tell us much about the host status its

RE: [DISCUSS] KVM HA

2013-08-07 Thread Edison Su
> -Original Message- > From: Marcus Sorensen [mailto:shadow...@gmail.com] > Sent: Wednesday, August 07, 2013 1:42 PM > To: dev@cloudstack.apache.org > Subject: Re: [DISCUSS] KVM HA > > Does KVMInvestigator work on all shared primary storage, or just NFS? Right now,

Re: [DISCUSS] KVM HA

2013-08-07 Thread Marcus Sorensen
nly familiar with the NFS KVMHA directories. > > From this it seems like a clean stop of the KVM agent still shouldn't > trigger any issues/HA, correct? > > On Wed, Aug 7, 2013 at 2:28 PM, Edison Su wrote: >> There is long time issue related to KVM HA, see bug: CLOUDSTACK

Re: [DISCUSS] KVM HA

2013-08-07 Thread Marcus Sorensen
ere is long time issue related to KVM HA, see bug: CLOUDSTACK-3535. > Basically, HA won't be triggered, if KVM agent is stopped either normally nor > abnormally, HA only be triggered if the network between mgt server and kvm > host is disconnected and the network between KV

[DISCUSS] KVM HA

2013-08-07 Thread Edison Su
There is a long-standing issue related to KVM HA; see bug CLOUDSTACK-3535. Basically, HA won't be triggered if the KVM agent is stopped, either normally or abnormally. HA is only triggered if the network between the mgt server and the KVM host is disconnected and the network between KVM hosts in the

RE: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Paul Angus
@cloudstack.apache.org Subject: Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status) Hi Paul, What's the bug ID for this so we can track it properly? Thanks! Joe On Mon, Jul 15, 2013, at 02:31 AM, Paul Angus wrote: > I bumped this from the user list as we've just come across the

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Marcus Sorensen
My strong preference would be to avoid any cluster locking libraries or similar on the agent side, if possible. I've just seen too many clustering products that are brittle and easily deadlock-able, where you end up having to reboot *everything* if something goes wrong on one host. It should be fa

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Marcus Sorensen
For open stack, look to the current state of "evacuate". http://www.mirantis.com/blog/cloud-prizefight-vmware-vs-openstack/ "there is no official support for VM-level HA in OpenStack—it was initially planned for the Folsom release but was later dropped/postponed. There is currently an incubation

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Joe Brockmeier
Hi Paul, What's the bug ID for this so we can track it properly? Thanks! Joe On Mon, Jul 15, 2013, at 02:31 AM, Paul Angus wrote: > I bumped this from the user list as we've just come across the same > issue. > > CloudStack does not react or even change host status when contact is lost > with

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Shanker Balan
On 15-Jul-2013, at 12:03 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote: A robust solution would probably involve Apache Zookeeper (using Curator perhaps) to perform robust distributed locking and/or leader election. Just curious - Any idea as to how OpenStack deals with a fail

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Chiradeep Vittal
A robust solution would probably involve Apache Zookeeper (using Curator perhaps) to perform robust distributed locking and/or leader election. On 7/15/13 3:51 PM, "Chiradeep Vittal" wrote: >Indeed HA is very tricky as you note. In the generic case where the MS >cannot communicate with the agent

RE: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Paul Angus
- From: Chiradeep Vittal [mailto:chiradeep.vit...@citrix.com] Sent: 15 July 2013 11:21 To: dev@cloudstack.apache.org Subject: Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status) Indeed HA is very tricky as you note. In the generic case where the MS cannot communicate with the agent, noth

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Chiradeep Vittal
Indeed HA is very tricky as you note. In the generic case where the MS cannot communicate with the agent, nothing can be concluded and the MS does nothing. I dug this up and posted it to the wiki https://cwiki.apache.org/confluence/x/dwn8AQ On 7/15/13 1:20 PM, "Marcus Sorensen" wrote: >I don't

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Marcus Sorensen
By the way, I'm aware that KVM has a heartbeat function in the agent, but that only works for NFS primary storage. Maybe the secondary storage could have a similar function that keeps track of running guests per host... Would still rely on the agent to not have died if the host is still up, otherwi

Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Marcus Sorensen
I don't know much about HA in regards to management server/agent connectivity, but it seems to me like this is perilous ground. If a host loses connection with the management server, it seems to me that the management server doesn't have the resources to determine whether it should start HA-enable

[URGENT] KVM HA - (FW: cs 4.1 host disconnected status)

2013-07-15 Thread Paul Angus
I bumped this from the user list as we've just come across the same issue. CloudStack does not react or even change host status when contact is lost with a KVM host. 2013-07-13 17:53:56,695 DEBUG [cloud.ha.AbstractInvestigatorImpl] (AgentTaskPool-1:null) host (10.0.100.51) cannot be pinged, ret