Hey Nux,

There is quite a bit of tuning you can do to speed up or slow down CloudStack's decision making, but when we lose contact with a host agent we need to be sure that the VMs themselves really are dead. By default host-ha is set to be super sure.
There are various timeouts which can be configured to decide how long to wait for a host to restart before deciding that it is not going to start, as well as how many times we should check for disk activity from the resident VMs of a suspect host. The parameters are detailed here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA

Honestly, the aim of Host HA was to fix the particular issue that you are describing, as we can't remember a time when it did work reliably.

paul.an...@shapeblue.com
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS UK
@shapeblue

-----Original Message-----
From: Nux! [mailto:n...@li.nux.ro]
Sent: 23 January 2018 19:08
To: users <us...@cloudstack.apache.org>
Cc: dev <dev@cloudstack.apache.org>
Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)

Hi Paul,

To be honest I do not remember when I last saw this, as I have not been testing ACS in 2017. You'd kill a HV, the VMs would pop up on another after a few minutes.
Even with Host HA, the VMs remain down until the hypervisor is back up, restarted by OOBM - however if that HV has suffered a HW fault and needs to be removed, then those VMs will be down for a long time ...

Unless I got things quite wrong, (VM) HA - one of the big selling points of ACS - is essentially broken?

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

----- Original Message -----
> From: "Paul Angus" <paul.an...@shapeblue.com>
> To: "users" <us...@cloudstack.apache.org>, "dev" <dev@cloudstack.apache.org>
> Sent: Tuesday, 23 January, 2018 16:02:54
> Subject: RE: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>
> Hi Nux,
>
> When have you seen the VMs on KVM behaving in the manner which you are
> expecting? I recall it didn’t work that way in the mid 4.5 versions
> (we found out the hard way in front of a customer) and it doesn't
> behave the way you are expecting in 4.9 - I've just tested it.
>
> You need host-ha enabled to get reliable HA in the event of a host
> crash, that is why we developed the host ha feature.
>
> Kind regards,
>
> Paul Angus
>
> -----Original Message-----
> From: Nux! [mailto:n...@li.nux.ro]
> Sent: 23 January 2018 15:06
> To: dev <dev@cloudstack.apache.org>
> Cc: users <us...@cloudstack.apache.org>
> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>
> Rohit,
>
> I'll also have to insist on the VM HA issue.
> https://issues.apache.org/jira/browse/CLOUDSTACK-10246
>
> Lucian
>
> ----- Original Message -----
>> From: "Rohit Yadav" <rohit.ya...@shapeblue.com>
>> To: "dev" <dev@cloudstack.apache.org>, "users" <us...@cloudstack.apache.org>
>> Sent: Tuesday, 23 January, 2018 14:28:34
>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>
>> All,
>>
>> Given we've outstanding blockers and PRs in review/testing, I'll cut
>> RC2 only after we manage to get them reviewed, tested and merged.
>>
>> The outstanding PRs considered for RC2 are:
>>
>> https://github.com/apache/cloudstack/pull/2418 (Properly parse rules
>> for security groups)
>>
>> https://github.com/apache/cloudstack/pull/2419 (Password server
>> issue)
>>
>> In addition we've the following issues to receive fixes:
>>
>> - VR - DHCP/dnsmasq leases issue (reported by Ozhan)
>>
>> - Dynamic roles upgrade fixes:
>> https://issues.apache.org/jira/browse/CLOUDSTACK-10249
>>
>> Please share any other issues you've found, or I've missed. Thanks,
>> and continue testing RC1.
>>
>> - Rohit
>>
>> ________________________________
>> From: Rohit Yadav <rohit.ya...@shapeblue.com>
>> Sent: Monday, January 22, 2018 11:18:27 AM
>> To: Paul Angus; us...@cloudstack.apache.org; dev@cloudstack.apache.org
>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>
>> The same issue applies to any 4.9, 4.10 release. In case of 4.9, we
>> had discussed this as a doc bug and so it must be documented as part of
>> the 4.11 release notes as well.
>>
>> There are two ways admins can migrate to dynamic roles post-upgrade:
>>
>> 1. Set dynamic.apichecker.enabled to true, which will use the
>> default api mapping of rules from 4.8 commands.properties and
>> automatic annotation based and (db-backed) dynamic rules from 4.9+.
>> Or,
>>
>> 2. The migration script is only useful where admins were not using
>> the default api rule mappings and they strictly want to
>> check/migrate each API. This approach requires admins to go through
>> new APIs and fix commands.properties before running the migration
>> script (we've been sharing the new/changed API list in release notes,
>> for example:
>> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.9.3.0/api-changes.html#new-api-commands).
>> (for reference, doc:
>> http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/accounts.html#using-dynamic-roles)
>>
>> Unlike the dynamic API checker, the static checker does not even
>> allow the root admin to access all the APIs, which is why post upgrade,
>> if the UI calls any API that is not allowed for the root admin (in
>> this case the quotaIsEnabled API) the UI will log out the user on API
>> unauthorized failure, which is what happened.
>>
>> So, we can discuss two fixes:
>>
>> - Like dynamic checker, let the static checker allow all APIs only to
>> the root admin (id=1) (I would not prefer to change the legacy
>> behaviour though)
>>
>> - During upgrade, if commands.properties is missing we set the global
>> setting to true, i.e. switch to dynamic roles (which would happen if
>> someone tries to upgrade from 4.5->4.11 using a new mgmt server if
>> they fail to copy the commands.properties file from /usr/share or /etc
>> paths).
>>
>> Thoughts?
>>
>> - Rohit
>>
>> rohit.ya...@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
>> @shapeblue
>>
>> ________________________________
>> From: Paul Angus
>> Sent: Monday, January 22, 2018 9:24:25 AM
>> To: us...@cloudstack.apache.org
>> Cc: Rohit Yadav; dev@cloudstack.apache.org; Daan Hoogland
>> Subject: RE: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>
>> If I've understood the issue correctly, "not being able to log in if
>> upgrading from 4.5" is a blocker in my book. I don't think that it
>> should be the duty of the Admin to fix our oversights. Migration to
>> the use of dynamic roles is also broken as the command will be missing
>> from commands.properties in the first place, so the 'migrated' commands
>> will not be complete.
>>
>> As there will need to be an RC2, IMO this upgrade issue should be
>> fixed as part of it.
>>
>> Kind regards,
>>
>> Paul Angus
>> VP Technology
>>
>> -----Original Message-----
>> From: Boris Stoyanov [mailto:boris.stoya...@shapeblue.com]
>> Sent: 22 January 2018 07:31
>> To: us...@cloudstack.apache.org
>> Cc: Rohit Yadav <rohit.ya...@shapeblue.com>; dev@cloudstack.apache.org; Daan Hoogland <daan.hoogl...@shapeblue.com>
>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>
>> Hi Paul,
>> The migration script considers only what’s in the commands.properties
>> file, so if the ‘missing’ quotaIsEnabled=15 is not there it will not
>> create a rule for it. As Rohit mentioned, it’s the duty of the admin to
>> take care of aligning this. I’m also not a big fan of having this
>> described in release notes, but would like it to be included
>> automatically during upgrade. Main argument against it: it's not a blocker.
>>
>> Bobby.
>>
>> boris.stoya...@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London WC2N 4HS UK
>> @shapeblue
>>
>>> On 19 Jan 2018, at 19:04, Paul Angus <paul.an...@shapeblue.com> wrote:
>>>
>>> OK, just to confirm ‘we’ the community have basically deprecated the
>>> use of commands.properties?
>>>
>>> But for people upgrading from a version before dynamic roles, does
>>> the migration script take into account (or need to take into
>>> account) the ‘missing’ quotaIsEnabled=15 parameter?
>>>
>>> From: Rohit Yadav
>>> Sent: 19 January 2018 09:27
>>> To: users <us...@cloudstack.apache.org>; dev@cloudstack.apache.org; Paul Angus <paul.an...@shapeblue.com>
>>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>>
>>> Hi Bobby,
>>>
>>> Agree, it's not user-friendly which is why admins should migrate to
>>> the dynamic roles feature. But I'm not sure if this is a blocker and
>>> if an admin wants to stick to the old static (commands.properties)
>>> way, they need to manage changes themselves. We may add something to
>>> the release notes /cc @Paul Angus <paul.an...@shapeblue.com>.
>>>
>>> - Rohit
>>>
>>> Software Architect
>>>
>>> ________________________________
>>> From: Boris Stoyanov <boris.stoya...@shapeblue.com>
>>> Sent: Friday, January 19, 2018 2:51:32 PM
>>> To: users
>>> Cc: dev@cloudstack.apache.org
>>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>>
>>> Hi Rohit,
>>>
>>> That doesn’t sound very user-friendly, what do you think? Can we look
>>> for a way to automate this dependency in the upgrade process?
>>>
>>> Bobby.
>>>
>>>> On 19 Jan 2018, at 10:50, Rohit Yadav <rohit.ya...@shapeblue.com> wrote:
>>>>
>>>> Hi Bobby,
>>>>
>>>> I checked the 4.5-4.11 upgrade environment. Due to the nature of
>>>> how the static checker with commands.properties works, admins will be
>>>> required to add/update new API/ACLs in the commands.properties file.
>>>>
>>>> Adding the following to the commands.properties file and restarting
>>>> the mgmt server fixes the issue:
>>>>
>>>> quotaIsEnabled=15
>>>>
>>>> Please continue testing, thanks.
>>>>
>>>> - Rohit
>>>>
>>>> ________________________________
>>>> From: Boris Stoyanov <boris.stoya...@shapeblue.com>
>>>> Sent: Wednesday, January 17, 2018 6:54:28 PM
>>>> To: us...@cloudstack.apache.org
>>>> Cc: dev@cloudstack.apache.org
>>>> Subject: Re: [VOTE] Apache Cloudstack 4.11.0.0 (LTS)
>>>>
>>>> I think I’ve hit a blocker when upgrading to 4.11
>>>>
>>>> Here’s the jira id:
>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-10236
>>>>
>>>> I’ve upgraded from 4.5 to 4.11, then I’ve logged in with admin and
>>>> got session expired immediately.
>>>>
>>>> Regards,
>>>> Boris Stoyanov
>>>>
>>>> On 17 Jan 2018, at 8:42, Tutkowski, Mike <mike.tutkow...@netapp.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> For the past couple days, I have been running the KVM managed-storage
>>>> regression-test suite against RC1.
>>>>
>>>> With the exception of one issue (more on this below), all of these tests
>>>> have passed.
>>>>
>>>> Tomorrow I plan to start in on the VMware-related managed-storage tests.
>>>>
>>>> Once I’ve completed running those, I expect to move on to the
>>>> XenServer-related managed-storage tests.
>>>>
>>>> I ran these XenServer and VMware tests just prior to RC1 being created, so
>>>> I suspect all of those tests will come back successful.
>>>>
>>>> Now, with regards to the one issue I found on KVM with managed storage:
>>>>
>>>> It relates to a new feature whereby you can online migrate the storage of
>>>> a VM from NFS or Ceph to managed storage.
>>>>
>>>> During the code-review process, I made a change per a suggestion and it
>>>> introduced an issue with this feature. The solution is just a couple lines
>>>> of code and only impacts this one use case. If you are testing this release
>>>> candidate and don’t really care about this particular feature, it should
>>>> not at all impact your ability to test RC1.
>>>>
>>>> Thanks!
>>>> Mike
>>>>
>>>> On Jan 15, 2018, at 4:33 AM, Rohit Yadav <ro...@apache.org> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I've created a 4.11.0.0 release, with the following artifacts up for
>>>> testing and a vote:
>>>>
>>>> Git Branch and Commit SHA:
>>>> https://gitbox.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.11.0.0-RC20180115T1603
>>>> Commit: 1b8a532ba52127f388847690df70e65c6b46f4d4
>>>>
>>>> Source release (checksums and signatures are available at the same
>>>> location):
>>>> https://dist.apache.org/repos/dist/dev/cloudstack/4.11.0.0/
>>>>
>>>> PGP release keys (signed using 5ED1E1122DC5E8A4A45112C2484248210EE3D884):
>>>> https://dist.apache.org/repos/dist/release/cloudstack/KEYS
>>>>
>>>> The vote will be open for 72 hours.
>>>>
>>>> For sanity in tallying the vote, can PMC members please be sure to indicate
>>>> "(binding)" with their vote?
>>>>
>>>> [ ] +1 approve
>>>> [ ] +0 no opinion
>>>> [ ] -1 disapprove (and reason why)
>>>>
>>>> Additional information:
>>>>
>>>> For users' convenience, I've built packages from
>>>> 1b8a532ba52127f388847690df70e65c6b46f4d4 and published RC1 repository here:
>>>> http://cloudstack.apt-get.eu/testing/4.11-rc1
>>>>
>>>> The release notes are still work-in-progress, but the systemvmtemplate
>>>> upgrade section has been updated. You may refer to the following for
>>>> systemvmtemplate upgrade testing:
>>>> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/index.html
>>>>
>>>> 4.11 systemvmtemplates are available from here:
>>>> https://download.cloudstack.org/systemvm/4.11/
>>>>
>>>> Regards,
>>>> Rohit Yadav
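
For anyone applying the commands.properties workaround quoted earlier in this thread (adding quotaIsEnabled=15 and restarting the management server), here is a rough Python sketch of appending a missing rule to the text of such a file. It is only an illustration: the helper name ensure_rule and the sample file content are made up for this example and are not part of CloudStack, and it assumes plain key=value lines with # comments.

```python
def ensure_rule(properties_text: str, api_name: str, role_mask: int) -> str:
    """Return properties_text with 'api_name=role_mask' appended if api_name has no entry.

    Hypothetical helper for illustration only; it works on the file's text and
    leaves reading/writing the actual file to the caller.
    """
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        key = stripped.split("=", 1)[0].strip()
        if key == api_name:
            # An entry already exists; leave the text untouched.
            return properties_text
    # Make sure the new entry starts on its own line.
    needs_newline = bool(properties_text) and not properties_text.endswith("\n")
    separator = "\n" if needs_newline else ""
    return f"{properties_text}{separator}{api_name}={role_mask}\n"


# Sample content is invented for the example; the value 15 is the one given in the thread.
existing = "listZones=15\n# quota plugin\n"
print(ensure_rule(existing, "quotaIsEnabled", 15))
```

The function is idempotent, so running it twice does not add a second entry. The management server still needs a restart afterwards, as noted in the thread.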