it does not solve 100% of
> the failure of KVM HA;
>
> Because in extreme cases, the management server and the kvm host may fail
> at the same time (for example, the management server and the KVM HOST are
> placed in the same rack, and the RACK will fail at the same time after the
Thank you Nicolas and Andrija.
Even if indirect.agent.lb.algorithm is configured as roundrobin, the
probability of failure can only be reduced. But it does not solve 100% of the
failure of KVM HA;
Because in extreme cases, the management server and the kvm host may fail at
the same time (for
Sunday, June 23, 11:03
Subject: Re: KVM HA fails under multiple management services
To: users
Cc: dev@cloudstack.apache.org
Li,
based on the Global Setting description for those 2, I would say that is
the expected behaviour.
i.e. change Indirect.agent.lb.check.interval to some other value,
Li,
based on the Global Setting description for those 2, I would say that is
the expected behaviour.
i.e. change Indirect.agent.lb.check.interval to some other value, since 0
means "don't check, don't reconnect" per what I read.
Also, you might want to change from Indirect.agent.lb.algorithm=sta
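To make the two settings discussed above concrete, here is a minimal sketch of how an agent might order its management servers and decide whether to run a periodic check. It is a toy model, not CloudStack's implementation: the function names and the rotation rule for roundrobin are assumptions for illustration. Only the setting names, the two management server addresses, and the "0 means don't check, don't reconnect" reading come from the thread.

```python
def order_management_servers(servers, algorithm, host_index=0):
    """Return the order in which an agent would try management servers.

    Assumed behaviour, for illustration only:
    - "static" keeps the configured order for every host.
    - "roundrobin" rotates the list by the host's index, so different
      hosts start from different management servers.
    """
    servers = list(servers)
    if algorithm == "static":
        return servers
    if algorithm == "roundrobin":
        if not servers:
            return servers
        shift = host_index % len(servers)
        return servers[shift:] + servers[:shift]
    raise ValueError(f"unknown algorithm: {algorithm}")


def periodic_check_enabled(check_interval_seconds):
    """Treat an interval of 0 as "don't check, don't reconnect"."""
    return check_interval_seconds > 0


if __name__ == "__main__":
    # M1 and M2 from the original report; three KVM hosts indexed 0..2.
    management_servers = ["172.17.1.141", "172.17.1.142"]
    for host_index in range(3):
        order = order_management_servers(
            management_servers, "roundrobin", host_index
        )
        print(host_index, order)
    print(periodic_check_enabled(0), periodic_check_enabled(60))
```

Even in this model, spreading hosts across management servers only reduces the chance that a host and its preferred management server fail together; it does not remove it, which matches the point made in the follow-up.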
Hello everyone
I recently tested the multiple management services, based on agent lb HOST HA
(KVM). It was found that in extreme cases, HA would fail; the details are as
follows:
Two management nodes, M1 (172.17.1.141) and M2 (172.17.1.142), share an
external database cluster
Three KVM nodes,
Hi,
I've added more hosts and enabled HA on all of them. Now I shoot down
node cs-hv-06, which is running r-199. Here
are the logs I am getting.
--
2018-08-13 12:12:51,402 DEBUG [c.c.h.HighAvailabilityManagerImpl]
(pool-5-thread-1:null) (logid:b71a09c7) Notifying HA Mgr of to restart
vm 199-r-199-VM
Hi,
I have a setup with one advanced zone, one cluster and two Hosts. The
hosts are KVM and use a single NFS Storage for Primary and one for
Secondary.
Everything is running smoothly until I remove power from one host.
In my honest opinion CloudStack should now declare the faulty host as
dead, de
Github user serverchief commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Hi @koushik-das
I believe you missed a discussion on this a while back - when this was
initially proposed and we were gathering community feedback.
please read this thread
Github user koushik-das commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@abhinandanprateek Initially I also thought that this is about host HA. But
after reading the FS I had doubts and asked about the definition of "host HA".
If you refer to the discussion on d
Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@koushik-das sorry could not get back to you earlier as I was busy with
other work. I've replied on the ML thread to address several queries [1] that
lists the advantages of this host-ha framework
Github user abhinandanprateek commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@koushik-das I see that main issue is that this is being confused as VM HA
framework. Will like to again add that this framework is not for VM-HA but for
host HA. With this implementat
Github user koushik-das commented on the issue:
https://github.com/apache/cloudstack/pull/1960
There are open questions that I had asked in the dev@ list and haven't seen
satisfactory answers to them. I am -1 on this feature till the need for a new
VM HA framework is justified.
---
Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Travis failed due to failure in the test
`test_ha_multiple_mgmt_server_owner...`, I'll have a look shortly
---
If your project is set up for it, you can reply to this email and have your
reply a
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@rhtyd there seems to be a conflict for this merge, I'm currently running
tests and will keep you posted
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Trillian test result (tid-963)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 34953 seconds
Marvin logs:
https://github.com/blueoranguta
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has
been kicked to run smoke tests
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@blueorangutan test
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Packaging result: centos6 centos7 debian. JID-600
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@blueorangutan package
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep
you posted as I make progress.
Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@borisstoyanov copy that, done
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@rhtyd PRs 2003 and 2011 got merged, could you please rebase against master
so it'll pick up the fixes. Once we package I'll continue with the verification
on the physical hosts.
Github user koushik-das commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@abhinandanprateek If you refer to the discussion on dev@
https://goo.gl/cU8RuX, @rhtyd proposed it as a generic HA framework for any
resources (and not limited to VM). Now if it just a repl
Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@koushik-das I've shared a list of advantages of this work over existing
framework on dev@ that explain why existing VM-HA framework cannot be used for
host-ha implementation. If you've more quest
Github user abhinandanprateek commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@koushik-das The current framework is specifically implemented for Host HA
with KVM HA as the initial implementation. This framework is supposed to
replace the framework that is
Github user koushik-das commented on the issue:
https://github.com/apache/cloudstack/pull/1960
I have already raised some questions on dev@ on the need for a new HA
framework when the existing HA framework can do all the things mentioned. The
new framework only supports VM HA. If we s
Github user rhtyd commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Thanks @DaanHoogland wherever applicable I'll address the comments.
Github user DaanHoogland commented on the issue:
https://github.com/apache/cloudstack/pull/1960
I went through the code and found no real issues. For any other reviewers, I
recommend reading the FS first. It is quite a big chunk but very neat.
LGTM
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@rhtyd tests looks good, except this one:
```
ERROR: Tests default ha providers list
--
Traceback (most r
```
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Trillian test result (tid-879)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 28334 seconds
Marvin logs:
https://github.com/blueoranguta
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@blueorangutan test
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has
been kicked to run smoke tests
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
Packaging result: centos6 centos7 debian. JID-525
Github user blueorangutan commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep
you posted as I make progress.
Github user borisstoyanov commented on the issue:
https://github.com/apache/cloudstack/pull/1960
@blueorangutan package
GitHub user rhtyd opened a pull request:
https://github.com/apache/cloudstack/pull/1960
CLOUDSTACK-9782: Host HA and KVM HA provider
Host-HA offers investigation, fencing and recovery mechanisms for host that
for
any reason are malfunctioning. It uses Activity and Health checks
Github user asfgit closed the pull request at:
https://github.com/apache/cloudstack/pull/1496
Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-218056068
Thanks for confirming @koushik-das. It is the middle of the night
right now, so I am fading fast.
Github user koushik-das commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-218048251
@swill The failures don't look related to this PR. This can be merged.
Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-217884184
The two failures are not ones I am used to seeing in my environment, but I
did a quick once through of the code and I don't think the problem is related
to this PR.
Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-217772228
### CI RESULTS
```
Tests Run: 88
Skipped: 2
Failed: 1
Errors: 1
Duration: 11h 25m 09s
```
**Summary of the
Github user rhtyd commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-216228687
LGTM (code review)
tag:mergeready
Github user jburwell commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-215890052
LGTM for code review
Github user jburwell commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1496#discussion_r61649158
--- Diff: server/src/com/cloud/ha/HighAvailabilityManagerImpl.java ---
@@ -264,6 +265,11 @@ public void scheduleRestartForVmsOnHost(final HostVO
host, b
Github user abhinandanprateek commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-215651932
@swill rebased it. Thank you.
Github user swill commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-215552813
@abhinandanprateek please rebase as this currently has merge conflicts with
master. Thanks...
Github user koushik-das commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-215026662
Code changes LGTM, verified changes in HighAvailabilityManagerImpl.java.
Someone else needs to verify KVM related changes.
Github user alexandrelimassantana commented on a diff in the pull request:
https://github.com/apache/cloudstack/pull/1496#discussion_r60004052
--- Diff: server/src/com/cloud/ha/HighAvailabilityManagerImpl.java ---
@@ -264,6 +265,11 @@ public void scheduleRestartForVmsOnHost(final Ho
Github user cristofolini commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-211149034
@abhinandanprateek How about extracting lines 70-82 in `KVMInvestigator` to
their own method? I think it would be nice to get a test case going for that
sectio
Github user jburwell commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-210619709
@abhinandanprateek Jenkins failed due to no license header in the new test
case, test_host_ha.py. Travis failed because it could not find the Marvin egg.
Github user abhinandanprateek commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-210422617
Marvin output with only Local storage:
49b-b201830aec39
VirtualMachine on Hyp 1 = 1d500ef3-ef56-4d11-91d8-c0e37f015236
VirtualMachine on Hy
Github user abhinandanprateek commented on the pull request:
https://github.com/apache/cloudstack/pull/1496#issuecomment-210422584
Marvin test output for NFS PS:
hine on Hyp 1 = 228dc462-ceab-4225-8d35-5749cbb593c2
VirtualMachine on Hyp 1 = 2caf838f-016f-4cf6-9d35-67a2e1398
GitHub user abhinandanprateek opened a pull request:
https://github.com/apache/cloudstack/pull/1496
CLOUDSTACK-9350: KVM-HA- Fix CheckOnHost for Local storage
- KVM-HA- Fix CheckOnHost for Local storage
- Also skip HA on VMs that are using local storage
You can merge this
ntage of this approach is
a high cohesion of code support KVM integration and decouple the plugin from a
specific system management interface type. Per the FS, the KVM HA provider
would be defined with a cluster scope/partition and determine that hosts
meeting the criteria as eligible for HA:
*
On 19/10/15 23:10, ilya wrote:
Ronald,
Please see response in-line...
And you too :)
On 10/19/15 2:18 AM, Ronald van Zantvoort wrote:
On 16/10/15 00:21, ilya wrote:
I noticed several attempts to address the issue with KVM HA in Jira and
Dev ML. As we all know, there are many ways to solve
Saw this message a bit later; I tried to break it down and respond.
On 10/19/15 2:24 AM, Ronald van Zantvoort wrote:
> On 19/10/15 11:18, Ronald van Zantvoort wrote:
>> On 16/10/15 00:21, ilya wrote:
>>> I noticed several attempts to address the issue with KVM HA in Jira and
Ronald,
Please see response in-line...
On 10/19/15 2:18 AM, Ronald van Zantvoort wrote:
> On 16/10/15 00:21, ilya wrote:
>> I noticed several attempts to address the issue with KVM HA in Jira and
>> Dev ML. As we all know, there are many ways to solve the same problem,
>>
On 19/10/15 11:18, Ronald van Zantvoort wrote:
On 16/10/15 00:21, ilya wrote:
I noticed several attempts to address the issue with KVM HA in Jira and
Dev ML. As we all know, there are many ways to solve the same problem,
on our side, we've given it some thought as well - and it's on our
On 16/10/15 00:21, ilya wrote:
I noticed several attempts to address the issue with KVM HA in Jira and
Dev ML. As we all know, there are many ways to solve the same problem,
on our side, we've given it some thought as well - and it's on our to-do
list.
Specifically a mail thread "
I noticed several attempts to address the issue with KVM HA in Jira and
Dev ML. As we all know, there are many ways to solve the same problem,
on our side, we've given it some thought as well - and it's on our to-do
list.
Specifically a mail thread "KVM HA is broken, let's fix
> On 10 Oct 2015, at 12:35, Remi Bergsma wrote:
>
> Can you please explain what the issue is with KVM HA? In my tests, HA starts
> all VMs just fine without the hypervisor coming back. At least that is on
> current 4.6. Assuming a cluster of multiple nodes of course. It wi
m: "Remi Bergsma"
> To: dev@cloudstack.apache.org
> Cc: "Cloudstack Users List"
> Sent: Saturday, 10 October, 2015 11:35:36
> Subject: Re: KVM HA is broken, let's fix it
> Hi Lucian,
>
> Can you please explain what the issue is with KVM HA? In my test
Hi Lucian,
Can you please explain what the issue is with KVM HA? In my tests, HA starts
all VMs just fine without the hypervisor coming back. At least that is on
current 4.6. Assuming a cluster of multiple nodes of course. It will then do a
neighbor check from another host in the same cluster
Hello,
Following a recent thread on the users ml where slow NFS caused a mass reboot,
I have opened the following issue about improving HA on KVM.
https://issues.apache.org/jira/browse/CLOUDSTACK-8943
I know there are many people around here who use KVM and are interested in a
more robust way
This can be done by enabling HA in the service offering for an instance.
--
Sent from the Delta quadrant using Borg technology!
Nux!
www.nux.ro
- Original Message -
> From: "Budur Nagaraju"
> To: dev@cloudstack.apache.org
> Sent: Thursday, 30 April, 2015 10:37:5
ations
> Citrix Systems, Inc.
>
>
> -Original Message-
> From: Budur Nagaraju [mailto:nbud...@gmail.com]
> Sent: Thursday, April 30, 2015 5:38 AM
> To: dev@cloudstack.apache.org
> Subject: KVM HA
>
> HI
> New to cloud stack struggled searching for configurin
There may not be any specific KVM.HA configuration. What are you looking for?
Somesh
CloudPlatform Escalations
Citrix Systems, Inc.
-Original Message-
From: Budur Nagaraju [mailto:nbud...@gmail.com]
Sent: Thursday, April 30, 2015 5:38 AM
To: dev@cloudstack.apache.org
Subject: KVM HA
Hi,
I am new to CloudStack and have struggled to find any documentation on
configuring KVM HA.
Any help configuring KVM HA in CloudStack would be much appreciated.
Thanks,
Nagaraju
Hi Guys,
I've been testing KVM HA in 4.2.1-SNAPSHOT and it seems to be working great.
There seems to be a consistent 5 minute wait between killing a host and
CloudStack marking the host and VMs as down and then carrying out HA actions.
Is this delay customisable or is it hard coded some
> -Original Message-
> From: Marcus Sorensen [mailto:shadow...@gmail.com]
> Sent: Wednesday, August 07, 2013 1:46 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [DISCUSS] KVM HA
>
> I'm not sure we can rely on IPMI to tell us much about the host status its
> -Original Message-
> From: Marcus Sorensen [mailto:shadow...@gmail.com]
> Sent: Wednesday, August 07, 2013 1:42 PM
> To: dev@cloudstack.apache.org
> Subject: Re: [DISCUSS] KVM HA
>
> Does KVMInvestigator work on all shared primary storage, or just NFS?
Right now,
nly familiar with the NFS KVMHA directories.
>
> From this it seems like a clean stop of the KVM agent still shouldn't
> trigger any issues/HA, correct?
>
> On Wed, Aug 7, 2013 at 2:28 PM, Edison Su wrote:
>> There is long time issue related to KVM HA, see bug: CLOUDSTACK
ere is long time issue related to KVM HA, see bug: CLOUDSTACK-3535.
> Basically, HA won't be triggered if the KVM agent is stopped, either normally or
> abnormally. HA is only triggered if the network between the mgt server and the KVM
> host is disconnected and the network between KV
There is a long-standing issue related to KVM HA; see bug CLOUDSTACK-3535.
Basically, HA won't be triggered if the KVM agent is stopped, either normally or
abnormally. HA is only triggered if the network between the mgt server and the KVM host
is disconnected and the network between KVM hosts in the
@cloudstack.apache.org
Subject: Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)
Hi Paul,
What's the bug ID for this so we can track it properly?
Thanks!
Joe
On Mon, Jul 15, 2013, at 02:31 AM, Paul Angus wrote:
> I bumped this from the user list as we've just come across the
My strong preference would be to avoid any cluster locking libraries
or similar on the agent side, if possible. I've just seen too many
clustering products that are brittle and easily deadlock-able, where
you end up having to reboot *everything* if something goes wrong on
one host.
It should be fa
For OpenStack, look at the current state of "evacuate".
http://www.mirantis.com/blog/cloud-prizefight-vmware-vs-openstack/
"there is no official support for VM-level HA in OpenStack—it was initially
planned for the Folsom release but was later dropped/postponed. There is
currently an incubation
Hi Paul,
What's the bug ID for this so we can track it properly?
Thanks!
Joe
On Mon, Jul 15, 2013, at 02:31 AM, Paul Angus wrote:
> I bumped this from the user list as we've just come across the same
> issue.
>
> CloudStack does not react or even change host status when contact is lost
> with
On 15-Jul-2013, at 12:03 PM, Chiradeep Vittal
<chiradeep.vit...@citrix.com> wrote:
A robust solution would probably involve Apache Zookeeper (using Curator
perhaps) to perform robust distributed locking and/or leader election.
Just curious - Any idea as to how OpenStack deals with a fail
A robust solution would probably involve Apache Zookeeper (using Curator
perhaps) to perform robust distributed locking and/or leader election.
On 7/15/13 3:51 PM, "Chiradeep Vittal" wrote:
>Indeed HA is very tricky as you note. In the generic case where the MS
>cannot communicate with the agent
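The distributed locking idea mentioned above can be illustrated with a small in-process stand-in. This is a sketch under loud assumptions: it does not use ZooKeeper or Curator, the LeaseLock class and the "ms-a" / "ms-b" names are hypothetical, and a time-limited lease is substituted for whatever mechanism a real coordination service would provide. It only shows the property being asked for, namely that at most one management server holds the right to act at any moment.

```python
import threading


class LeaseLock:
    """In-memory stand-in for a coordination service's lock with expiry.

    Hypothetical helper for illustration; not a ZooKeeper or Curator API.
    """

    def __init__(self):
        self._mutex = threading.Lock()
        self._owner = None
        self._expires_at = 0.0

    def try_acquire(self, candidate, now, lease_seconds):
        """Grant or renew the lease if it is free, expired, or already ours."""
        with self._mutex:
            free = self._owner is None or now >= self._expires_at
            if free or self._owner == candidate:
                self._owner = candidate
                self._expires_at = now + lease_seconds
                return True
            return False

    def leader(self, now):
        """Return the current lease holder, or None if the lease has expired."""
        with self._mutex:
            return self._owner if now < self._expires_at else None


if __name__ == "__main__":
    lock = LeaseLock()
    print(lock.try_acquire("ms-a", now=0.0, lease_seconds=10))   # lease is free
    print(lock.try_acquire("ms-b", now=5.0, lease_seconds=10))   # ms-a still holds it
    print(lock.try_acquire("ms-b", now=12.0, lease_seconds=10))  # ms-a's lease expired
    print(lock.leader(now=15.0))
```

Explicit timestamps are passed in so the behaviour is deterministic; a real deployment would have to deal with clock skew and network partitions, which is exactly why the thread suggests delegating this to a dedicated service.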
-
From: Chiradeep Vittal [mailto:chiradeep.vit...@citrix.com]
Sent: 15 July 2013 11:21
To: dev@cloudstack.apache.org
Subject: Re: [URGENT] KVM HA - (FW: cs 4.1 host disconnected status)
Indeed HA is very tricky as you note. In the generic case where the MS cannot
communicate with the agent, noth
Indeed HA is very tricky as you note. In the generic case where the MS
cannot communicate with the agent, nothing can be concluded and the MS
does nothing.
I dug this up and posted it to the wiki
https://cwiki.apache.org/confluence/x/dwn8AQ
On 7/15/13 1:20 PM, "Marcus Sorensen" wrote:
>I don't
By the way, I'm aware that KVM has a heartbeat function in the agent, but
that only works for NFS primary storage. Maybe the secondary storage could
have a similar function that keeps track of running guests per host...
Would still rely on the agent to not have died if the host is still up,
otherwi
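As a rough illustration of the heartbeat idea described above, the sketch below has each host write a timestamp to a file on shared storage, and a checker treat the host as alive only if that timestamp is recent. The file naming, the timeout, and both function names are assumptions made for this example; this is not the KVM agent's actual heartbeat code. The address 10.0.100.51 is taken from a log line quoted elsewhere in these threads, and 10.0.100.52 is made up.

```python
import os
import tempfile
import time


def write_heartbeat(shared_dir, host_id, now=None):
    """Record the host's latest heartbeat timestamp on shared storage."""
    path = os.path.join(shared_dir, f"hb-{host_id}")
    with open(path, "w") as handle:
        handle.write(str(time.time() if now is None else now))
    return path


def host_is_alive(shared_dir, host_id, timeout_seconds, now=None):
    """Return True only if a heartbeat exists and is not older than the timeout."""
    path = os.path.join(shared_dir, f"hb-{host_id}")
    try:
        with open(path) as handle:
            last_beat = float(handle.read().strip())
    except (FileNotFoundError, ValueError):
        # No heartbeat file, or unreadable content: treat as not alive.
        return False
    current = time.time() if now is None else now
    return current - last_beat <= timeout_seconds


if __name__ == "__main__":
    # A temporary directory stands in for the NFS primary storage mount.
    with tempfile.TemporaryDirectory() as shared:
        write_heartbeat(shared, "10.0.100.51", now=1000.0)
        print(host_is_alive(shared, "10.0.100.51", 60, now=1030.0))  # recent beat
        print(host_is_alive(shared, "10.0.100.51", 60, now=1100.0))  # stale beat
        print(host_is_alive(shared, "10.0.100.52", 60, now=1030.0))  # never wrote
```

The limitation raised in the message still applies to this model: a stale heartbeat cannot distinguish a dead host from a dead agent on a host whose guests are still running.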
I don't know much about HA in regards to management server/agent
connectivity, but it seems to me like this is perilous ground. If a
host loses connection with the management server, it seems to me that
the management server doesn't have the resources to determine whether
it should start HA-enable
I bumped this from the user list as we've just come across the same issue.
CloudStack does not react or even change host status when contact is lost with
a KVM host.
2013-07-13 17:53:56,695 DEBUG [cloud.ha.AbstractInvestigatorImpl]
(AgentTaskPool-1:null) host (10.0.100.51) cannot be pinged, ret
85 matches