Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-19 Thread Boris Stoyanov
Indeed it is severe, but please note it's a corner case which was unearthed 
almost by accident. It comes down to using the new boot-protocol selection 
feature together with a corrupted template, so with already existing templates 
I would not expect to encounter it. 
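
(For context, the trigger is the new boot type/mode selection at deploy time.
A minimal CloudMonkey sketch of the kind of call involved; the parameter names
are my reading of the new 4.14 API and the IDs are placeholders, so treat this
as an assumption rather than a confirmed repro:)

    # deploy a VM from a (corrupted) template while selecting the new boot options
    cmk deploy virtualmachine zoneid=<zone-id> templateid=<template-id> \
        serviceofferingid=<offering-id> boottype=UEFI bootmode=LEGACY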

As for recovery, we've managed to recover vCenter and Cloudstack after 
rebooting the vCenter machine and restarting the Cloudstack management 
service. There are no exact recovery steps for now, but a restart seems to 
work. 
By graceful failure I mean Cloudstack erroring out the deployment and the VM 
finishing in ERROR state, while connectivity and operability with the vCenter 
cluster remain the same. 

We're currently exploring options to fix this. One could be to disable the 
feature for VMware and work to introduce a more sustainable fix in the next 
release. Another is to add more guarding code when installing a template, 
since VMware doesn't actually allow you to install that particular template 
but cloudstack does. We'll keep you posted. 

Thanks,
Bobby.

On 18.05.20, 23:01, "Marcus"  wrote:

The issue sounds severe enough that a release note probably won't suffice -
unless there's a documented way to recover, we'd never want to leave a
system susceptible to being unrecoverable, even if it's rarely triggered.

What's involved in "failing gracefully"? Is this a small fix, or an
overhaul?  Perhaps the new feature could be disabled for VMware, or
disabled altogether until a fix is made in a patch release.

Does it only affect new templates, or is there a risk that an existing
template out in vSphere could suddenly cause problems?

On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
boris.stoya...@shapeblue.com> wrote:

> Hi guys,
>
> A little further info on this: it appears that when we use a corrupted
> template and UEFI/Legacy mode when deploying a VM, it breaks the connection
> between cloudstack and vCenter.
>
> All hosts become unreachable and basically the cluster is not functional.
> We have not investigated a way to recover this, but it seems like a huge
> mess. Please note that the user is not able to register such a template in
> vCenter directly, but cloudstack allows using it.
>
> Open to discussing whether we'll fix this. Since users are expected to use
> working templates, I think we should be failing gracefully, and such an
> action should not be able to create downtime on such a large scale.
>
> I believe the boot type feature is a new one and it's not available in older
> releases, so this issue should be limited to 4.14/current master.
>
> Thanks,
> Bobby.
>
> On 15.05.20, 17:07, "Boris Stoyanov" 
> wrote:
>
> I'll have to -1 RC3; this afternoon we discovered details about an issue
> which is causing severe consequences with a particular hypervisor. We'll
> need more time to investigate before disclosing.
>
> Bobby.
>
> On 15.05.20, 9:12, "Boris Stoyanov" 
> wrote:
>
> +1 (binding)
>
> I've executed upgrade tests with the following configurations:
>
> 4.13.1 with KVM on CentOS7 hosts
> 4.13 with VMware6.5 hosts
> 4.11.3 with KVM on CentOS7 hosts
> 4.11.2 with XenServer7 hosts
> 4.11.1 with VMware 6.7
> 4.9.3 with XenServer 7 hosts
> 4.9.2 with KVM on CentOS 7 hosts
>
> Also I've run basic lifecycle operations on the following
> components:
> VMs
> Volumes
> Infra (zones, pod, clusters, hosts)
> Networks
> and more
>
> I did not come across any problems during this testing.
>
> Thanks,
> Bobby.
>
>
> On 11.05.20, 18:21, "Andrija Panic" 
> wrote:
>
> Hi All,
>
> I've created a 4.14.0.0 release (RC3), with the following
> artefacts up for
> testing and a vote:
>
> Git Branch and Commit SH:
>
> 
https://gitbox.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.14.0.0-RC20200511T1503
> Commit: 6f96b3b2b391a9b7d085f76bcafa3989d9832b4e
>
> Source release (checksums and signatures are available at the
> same
> location):
> https://dist.apache.org/repos/dist/dev/cloudstack/4.14.0.0/
>
> PGP release keys (signed using 3DC01AE8):
> https://dist.apache.org/repos/dist/release/cloudstack/KEYS
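
(For anyone verifying the artefacts before voting, a minimal sketch; the
tarball name is an assumption based on the usual dist layout, and the
.sha512 file is compared by eye since it is produced with gpg --print-md:)

    # import the release keys, then check the detached signature
    wget https://dist.apache.org/repos/dist/release/cloudstack/KEYS
    gpg --import KEYS
    gpg --verify apache-cloudstack-4.14.0.0-src.tar.bz2.asc
    # print the SHA512 digest and compare it with the published .sha512 file
    gpg --print-md SHA512 apache-cloudstack-4.14.0.0-src.tar.bz2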
>
> The vote will be open until 14th May 2020, 17.00 CET (72h).
>
> For sanity in tallying the vote, can PMC members please be
> sure to indicate
> "(binding)" with their vote?
>
> [ ] +1 approve
> [ ] +0 no opinion
> [ ] -1 disapprove (and reason why)
>

[GitHub] [cloudstack-primate] utchoang commented on pull request #320: Explore Test Automation

2020-05-19 Thread GitBox


utchoang commented on pull request #320:
URL: 
https://github.com/apache/cloudstack-primate/pull/320#issuecomment-630649462


   * Views > AutogenView.vue - processing 65%
   
![image](https://user-images.githubusercontent.com/13766648/82299763-1f97fd80-99e0-11ea-88fb-65aadef0d60d.png)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-primate] svenvogel commented on pull request #252: [WIP] Add support to manage network service providers

2020-05-19 Thread GitBox


svenvogel commented on pull request #252:
URL: 
https://github.com/apache/cloudstack-primate/pull/252#issuecomment-630706423


   @rhtyd is this in a good state to merge?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-primate] rhtyd commented on pull request #252: [WIP] Add support to manage network service providers

2020-05-19 Thread GitBox


rhtyd commented on pull request #252:
URL: 
https://github.com/apache/cloudstack-primate/pull/252#issuecomment-630709883


   Hi @svenvogel, we're currently working towards the tech preview,
fixing/testing the last set of issues on the milestone. Once that is done,
we'll move on to the 1.0 milestone issues and PRs. I'll ping you and
@utchoang and keep everyone posted.
   
   There's also a conflict; can you address it, @utchoang? Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] ACSGitBot commented on pull request #124: ref update

2020-05-19 Thread GitBox


ACSGitBot commented on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630767706


   Your request has been received; I'll go and build the documentation and
check the output log for errors.
   
   This shouldn't take long.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] andrijapanicsb opened a new pull request #124: ref update

2020-05-19 Thread GitBox


andrijapanicsb opened a new pull request #124:
URL: https://github.com/apache/cloudstack-documentation/pull/124


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] andrijapanicsb commented on pull request #124: ref update

2020-05-19 Thread GitBox


andrijapanicsb commented on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630767672


   requesting docbuild



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] ACSGitBot commented on pull request #124: ref update

2020-05-19 Thread GitBox


ACSGitBot commented on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630768542


   Build finished.  You can review it at:   
https://acs-www.shapeblue.com/docs/WIP-PROOFING/pr124
   
   Build Log Output:
   
   
   No log errors found to report.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] andrijapanicsb removed a comment on pull request #124: ref update

2020-05-19 Thread GitBox


andrijapanicsb removed a comment on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630767672


   requesting docbuild



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] ACSGitBot removed a comment on pull request #124: ref update

2020-05-19 Thread GitBox


ACSGitBot removed a comment on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630767706







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] ACSGitBot commented on pull request #124: ref update

2020-05-19 Thread GitBox


ACSGitBot commented on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630773359


   Your request has been received; I'll go and build the documentation and
check the output log for errors.
   
   This shouldn't take long.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] ACSGitBot commented on pull request #124: ref update

2020-05-19 Thread GitBox


ACSGitBot commented on pull request #124:
URL: 
https://github.com/apache/cloudstack-documentation/pull/124#issuecomment-630774254


   Build finished.  You can review it at:   
https://acs-www.shapeblue.com/docs/WIP-PROOFING/pr124
   
   Build Log Output:
   
   
   No log errors found to report.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-documentation] andrijapanicsb merged pull request #124: ref update

2020-05-19 Thread GitBox


andrijapanicsb merged pull request #124:
URL: https://github.com/apache/cloudstack-documentation/pull/124


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [cloudstack-primate] vladimirpetrov opened a new issue #346: [BUG] Read-only admin: destroy volume icon is active

2020-05-19 Thread GitBox


vladimirpetrov opened a new issue #346:
URL: https://github.com/apache/cloudstack-primate/issues/346


   **Describe the bug**
   When logged in as a read-only admin (a role that allows only list*
actions), the destroy volume icon is active. Since the user doesn't have
permission, the call fails with an error message, but the volume is still
removed from the list and there is a never-ending 'in progress' circle in the
notification box.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Log in as a read-only admin user (an admin role with only list* actions
allowed; see the sketch after these steps for one way to set up such a role).
   2. Go to Storage > Volumes, then click on a volume and press the 'Destroy'
icon. An error message appears in a pop-up notification, but the volume is
still removed from the list.
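
   (A minimal CloudMonkey sketch of one way to create such a role via the
dynamic roles API; the role name and the explicit catch-all deny rule are my
assumptions:)

    # admin-type role that only permits list* APIs, denying everything else
    cmk create role name=ReadOnlyAdmin type=Admin description="list-only admin"
    cmk create rolepermission roleid=<role-id> rule='list*' permission=allow
    cmk create rolepermission roleid=<role-id> rule='*' permission=deny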
   
   **Expected behavior**
   The 'Destroy' icon should be hidden when the user has no rights for this 
operation.
   
   **Screenshots**
   
![image](https://user-images.githubusercontent.com/12384665/82328816-0182cb00-99e9-11ea-8ba9-f14b5b23d950.png)
   
   **Desktop (please complete the following information):**
- OS: Ubuntu 18.04 LTS
- Browser: Chrome
- Version: 81.0.4044.138 (Official Build) (64-bit)
   
   **Additional context**
   None.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-19 Thread Boris Stoyanov
Hi guys,

I've done more testing around this and I can now confirm it has nothing to do 
with cloudstack code. 

I've tested it with RC3, with the UEFI PR reverted, and with 4.13.1 (which 
does not have the feature at all). I've also used a matrix of VMware 
versions: 6.0u2, 6.5u2 and 6.7u3. 

The bug is reproducible with all the cloudstack versions, but only with 
VMware 6.7u3; I was not able to reproduce it with 6.5/6.0. All of my results 
during testing show it must be related to that specific version of VMware. 

Therefore I'm reversing my '-1' and giving a +1 vote on the RC. I think a 
note should be included in the release notes advising users to refrain from 
that VMware version for now, until further investigation is done. 

Thanks,
Bobby.

On 19.05.20, 10:08, "Boris Stoyanov"  wrote:

Indeed it is severe, but please note it's a corner case which was unearthed 
almost by accident. It falls down to using a new feature of selecting a boot 
protocol and the template must be corrupted. So with already existing templates 
I would not expect to encounter it. 

As for recovery, we've managed to recover vCenter and Cloudstack after 
reboots of the vCenter machine and the Cloudstack management service. There's 
no exact points to recover for now, but restart seems to work. 
By graceful failure I mean, cloudstack erroring out the deployment and VM 
finished in ERROR state, meanwhile connection and operability with vCenter 
cluster remains the same. 

We're currently exploring options to fix this, one could be to disable the 
feature for VMWare and work to introduce more sustainable fix in next release. 
Other is to look for more guarding code when installing a template, since 
VMware doesn’t actually allow you install that particular template but 
cloudstack does. We'll keep you posted. 

Thanks,
Bobby.

On 18.05.20, 23:01, "Marcus"  wrote:

The issue sounds severe enough that a release note probably won't 
suffice -
unless there's a documented way to recover we'd never want to leave a
system susceptible to being unrecoverable, even if it's rarely 
triggered.

What's involved in "failing gracefully"? Is this a small fix, or an
overhaul?  Perhaps the new feature could be disabled for VMware, or
disabled altogether until a fix is made in a patch release.

Does it only affect new templates, or is there a risk that an existing
template out in vSphere could suddenly cause problems?

On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
boris.stoya...@shapeblue.com> wrote:

> Hi guys,
>
> A little further info on this, it appears when we use a corrupted 
template
> and UEFI/Legacy mode when deploy a VM, it breaks the connection 
between
> cloudstack and vCenter.
>
> All hosts become unreachable and basically the cluster is not 
functional,
> have not investigated a way to recover this but seems like a huge 
mess..
> Please note that user is not able to register such template in vCenter
> directly, but cloudstack allows using it.
>
> Open to discuss if we'll fix this, since it's expected users to use
> working templates, I think we should be failing gracefully and such 
action
> should not be able to create downtime on such a large scale.
>
> I believe the boot type feature is new one and it's not available in 
older
> releases, so this issue should be limited to 4.14/current master.
>
> Thanks,
> Bobby.
>
> On 15.05.20, 17:07, "Boris Stoyanov" 
> wrote:
>
> I'll have to -1 RC3, we've discovered details about an issue 
which is
> causing severe consequences with a particular hypervisor in the 
afternoon.
> We'll need more time to investigate before disclosing.
>
> Bobby.
>
> On 15.05.20, 9:12, "Boris Stoyanov" 
> wrote:
>
> +1 (binding)
>
> I've executed upgrade tests with the following configurations:
>
> 4.13.1 with KVM on CentOS7 hosts
> 4.13 with VMware6.5 hosts
> 4.11.3 with KVM on CentOS7 hosts
> 4.11.2 with XenServer7 hosts
> 4.11.1 with VMware 6.7
> 4.9.3 with XenServer 7 hosts
> 4.9.2 with KVM on CentOS 7 hosts
>
> Also I've run basic lifecycle operations on the following
> components:
> VMs
> Volumes
> Infra (zones, pod, clusters, hosts)
> Networks
> and more
>
> I did not come across any problems during this testing.
>
> Thanks,
> Bobby.
>
>
> On 11.05.2

Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-19 Thread Daan Hoogland
Thanks Bobby,
All, I've been closely working with Bobby and seen the same things. Does
anybody see any issues releasing 4.14 based on this code? I can confirm
that it is not Pavernalli's UEFI PR and we should not create a new PR to
revert it.
Thanks for all of your patience,

(this is me giving a binding +1)


On Tue, May 19, 2020 at 5:04 PM Boris Stoyanov 
wrote:

> Hi guys,
>
> I've done more testing around this and I can now confirm it has nothing to
> do with cloudstack code.
>
> I've tested it with rc3, reverted UEFI PR and 4.13.1 (which does not
> happen to have the feature at all). Also I've used a matrix of VMware
> version of 6.0u2, 6.5u2 and 6.7u3.
>
> The bug is reproducible with all the cloudstack versions, and only vmware
> 6.7u3, I was not able to reproduce this with 6.5/6.0. All of my results
> during testing show it must be related to that specific version of VMware.
>
> Therefore I'm reversing my '-1' and giving a +1 vote on the RC. I think it
> needs to be included in release notes to refrain from that version for now
> until further investigation is done.
>
> Thanks,
> Bobby.
>
> On 19.05.20, 10:08, "Boris Stoyanov" 
> wrote:
>
> Indeed it is severe, but please note it's a corner case which was
> unearthed almost by accident. It falls down to using a new feature of
> selecting a boot protocol and the template must be corrupted. So with
> already existing templates I would not expect to encounter it.
>
> As for recovery, we've managed to recover vCenter and Cloudstack after
> reboots of the vCenter machine and the Cloudstack management service.
> There's no exact points to recover for now, but restart seems to work.
> By graceful failure I mean, cloudstack erroring out the deployment and
> VM finished in ERROR state, meanwhile connection and operability with
> vCenter cluster remains the same.
>
> We're currently exploring options to fix this, one could be to disable
> the feature for VMWare and work to introduce more sustainable fix in next
> release. Other is to look for more guarding code when installing a
> template, since VMware doesn’t actually allow you install that particular
> template but cloudstack does. We'll keep you posted.
>
> Thanks,
> Bobby.
>
> On 18.05.20, 23:01, "Marcus"  wrote:
>
> The issue sounds severe enough that a release note probably won't
> suffice -
> unless there's a documented way to recover we'd never want to
> leave a
> system susceptible to being unrecoverable, even if it's rarely
> triggered.
>
> What's involved in "failing gracefully"? Is this a small fix, or an
> overhaul?  Perhaps the new feature could be disabled for VMware, or
> disabled altogether until a fix is made in a patch release.
>
> Does it only affect new templates, or is there a risk that an
> existing
> template out in vSphere could suddenly cause problems?
>
> On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> boris.stoya...@shapeblue.com> wrote:
>
> > Hi guys,
> >
> > A little further info on this, it appears when we use a
> corrupted template
> > and UEFI/Legacy mode when deploy a VM, it breaks the connection
> between
> > cloudstack and vCenter.
> >
> > All hosts become unreachable and basically the cluster is not
> functional,
> > have not investigated a way to recover this but seems like a
> huge mess..
> > Please note that user is not able to register such template in
> vCenter
> > directly, but cloudstack allows using it.
> >
> > Open to discuss if we'll fix this, since it's expected users to
> use
> > working templates, I think we should be failing gracefully and
> such action
> > should not be able to create downtime on such a large scale.
> >
> > I believe the boot type feature is new one and it's not
> available in older
> > releases, so this issue should be limited to 4.14/current master.
> >
> > Thanks,
> > Bobby.
> >
> > On 15.05.20, 17:07, "Boris Stoyanov" <
> boris.stoya...@shapeblue.com>
> > wrote:
> >
> > I'll have to -1 RC3, we've discovered details about an issue
> which is
> > causing severe consequences with a particular hypervisor in the
> afternoon.
> > We'll need more time to investigate before disclosing.
> >
> > Bobby.
> >
> > On 15.05.20, 9:12, "Boris Stoyanov" <
> boris.stoya...@shapeblue.com>
> > wrote:
> >
> > +1 (binding)
> >
> > I've executed upgrade tests with the following
> configurations:
> >
> > 4.13.1 with KVM on CentOS7 hosts
> > 4.13 with VMware6.5 hosts
> > 4.11.3 with KVM on CentOS7 hosts
> > 4.11.2 with XenServer7 hosts
> > 4.11.1 with VM

Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-19 Thread Pavan Kumar Aravapalli
Thank you Bobby and Daan for the update. However, I have not encountered such 
an issue while doing dev tests with VMware 5.5 & 6.5.





Regards,

Pavan Aravapalli.



From: Daan Hoogland 
Sent: 19 May 2020 20:56
To: users 
Cc: dev@cloudstack.apache.org 
Subject: Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

Thanks Bobby,
All, I've been closely working with Bobby and seen the same things. Does
anybody see any issues releasing 4.14 based on this code? I can confirm
that it is not Pavernalli's UEFI PR and we should not create a new PR to
revert it.
thanks for all of your patience,

(this is me giving a binding +1)


On Tue, May 19, 2020 at 5:04 PM Boris Stoyanov 
wrote:

> Hi guys,
>
> I've done more testing around this and I can now confirm it has nothing to
> do with cloudstack code.
>
> I've tested it with rc3, reverted UEFI PR and 4.13.1 (which does not
> happen to have the feature at all). Also I've used a matrix of VMware
> version of 6.0u2, 6.5u2 and 6.7u3.
>
> The bug is reproducible with all the cloudstack versions, and only vmware
> 6.7u3, I was not able to reproduce this with 6.5/6.0. All of my results
> during testing show it must be related to that specific version of VMware.
>
> Therefore I'm reversing my '-1' and giving a +1 vote on the RC. I think it
> needs to be included in release notes to refrain from that version for now
> until further investigation is done.
>
> Thanks,
> Bobby.
>
> On 19.05.20, 10:08, "Boris Stoyanov" 
> wrote:
>
> Indeed it is severe, but please note it's a corner case which was
> unearthed almost by accident. It falls down to using a new feature of
> selecting a boot protocol and the template must be corrupted. So with
> already existing templates I would not expect to encounter it.
>
> As for recovery, we've managed to recover vCenter and Cloudstack after
> reboots of the vCenter machine and the Cloudstack management service.
> There's no exact points to recover for now, but restart seems to work.
> By graceful failure I mean, cloudstack erroring out the deployment and
> VM finished in ERROR state, meanwhile connection and operability with
> vCenter cluster remains the same.
>
> We're currently exploring options to fix this, one could be to disable
> the feature for VMWare and work to introduce more sustainable fix in next
> release. Other is to look for more guarding code when installing a
> template, since VMware doesn’t actually allow you install that particular
> template but cloudstack does. We'll keep you posted.
>
> Thanks,
> Bobby.
>
> On 18.05.20, 23:01, "Marcus"  wrote:
>
> The issue sounds severe enough that a release note probably won't
> suffice -
> unless there's a documented way to recover we'd never want to
> leave a
> system susceptible to being unrecoverable, even if it's rarely
> triggered.
>
> What's involved in "failing gracefully"? Is this a small fix, or an
> overhaul?  Perhaps the new feature could be disabled for VMware, or
> disabled altogether until a fix is made in a patch release.
>
> Does it only affect new templates, or is there a risk that an
> existing
> template out in vSphere could suddenly cause problems?
>
> On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> boris.stoya...@shapeblue.com> wrote:
>
> > Hi guys,
> >
> > A little further info on this, it appears when we use a
> corrupted template
> > and UEFI/Legacy mode when deploy a VM, it breaks the connection
> between
> > cloudstack and vCenter.
> >
> > All hosts become unreachable and basically the cluster is not
> functional,
> > have not investigated a way to recover this but seems like a
> huge mess..
> > Please note that user is not able to register such template in
> vCenter
> > directly, but cloudstack allows using it.
> >
> > Open to discuss if we'll fix this, since it's expected users to
> use
> > working templates, I think we should be failing gracefully and
> such action
> > should not be able to create downtime on such a large scale.
> >
> > I believe the boot type feature is new one and it's not
> available in older
> > releases, so this issue should be limited to 4.14/current master.
> >
> > Thanks,
> > Bobby.
> >
> > On 15.05.20, 17:07, "Boris Stoyanov" <
> boris.stoya...@shapeblue.com>
> > wrote:
> >
> > I'll have to -1 RC3, we've discovered details about an issue
> which is
> > causing severe consequences with a particular hypervisor in the
> afternoon.
> > We'll need more time to investigate before disclosing.
> >
> > Bobby.
> >
> > On 15.05.20, 9:12, "Boris Stoyanov" <
> boris.stoya...@shapeblue.com>
> > wrote:
> >
> > +1 (binding)
>

Re: [VOTE] Apache CloudStack 4.14.0.0 RC3

2020-05-19 Thread Andrija Panic
Hi all,

In my humble opinion, we should release 4.14 as it is (considering we have
enough votes), but we'll further investigate the actual behind-the-scenes
root cause of the vSphere 6.7 harakiri (given that 6.0 and 6.5 are not
affected); this is possibly a VMware bug and we'll certainly try to
address it.

If I don't hear any more concerns or -1 votes by tomorrow morning CET, I
will proceed with concluding the voting process and crafting the release.

Thanks,
Andrija

On Tue, 19 May 2020 at 19:23, Pavan Kumar Aravapalli <
pavankuma...@accelerite.com> wrote:

> Thank you Bobby and Daan for the update. However I have not encountered
> such issue while doing dev test with Vmware 5.5 & 6.5.
>
>
>
>
>
> Regards,
>
> Pavan Aravapalli.
>
>
> 
> From: Daan Hoogland 
> Sent: 19 May 2020 20:56
> To: users 
> Cc: dev@cloudstack.apache.org 
> Subject: Re: [VOTE] Apache CloudStack 4.14.0.0 RC3
>
> Thanks Bobby,
> All, I've been closely working with Bobby and seen the same things. Does
> anybody see any issues releasing 4.14 based on this code? I can confirm
> that it is not Pavernalli's UEFI PR and we should not create a new PR to
> revert it.
> thanks for all of your patience,
>
> (this is me giving a binding +1)
>
>
> On Tue, May 19, 2020 at 5:04 PM Boris Stoyanov <
> boris.stoya...@shapeblue.com>
> wrote:
>
> > Hi guys,
> >
> > I've done more testing around this and I can now confirm it has nothing
> to
> > do with cloudstack code.
> >
> > I've tested it with rc3, reverted UEFI PR and 4.13.1 (which does not
> > happen to have the feature at all). Also I've used a matrix of VMware
> > version of 6.0u2, 6.5u2 and 6.7u3.
> >
> > The bug is reproducible with all the cloudstack versions, and only vmware
> > 6.7u3, I was not able to reproduce this with 6.5/6.0. All of my results
> > during testing show it must be related to that specific version of
> VMware.
> >
> > Therefore I'm reversing my '-1' and giving a +1 vote on the RC. I think
> it
> > needs to be included in release notes to refrain from that version for
> now
> > until further investigation is done.
> >
> > Thanks,
> > Bobby.
> >
> > On 19.05.20, 10:08, "Boris Stoyanov" 
> > wrote:
> >
> > Indeed it is severe, but please note it's a corner case which was
> > unearthed almost by accident. It falls down to using a new feature of
> > selecting a boot protocol and the template must be corrupted. So with
> > already existing templates I would not expect to encounter it.
> >
> > As for recovery, we've managed to recover vCenter and Cloudstack
> after
> > reboots of the vCenter machine and the Cloudstack management service.
> > There's no exact points to recover for now, but restart seems to work.
> > By graceful failure I mean, cloudstack erroring out the deployment
> and
> > VM finished in ERROR state, meanwhile connection and operability with
> > vCenter cluster remains the same.
> >
> > We're currently exploring options to fix this, one could be to
> disable
> > the feature for VMWare and work to introduce more sustainable fix in next
> > release. Other is to look for more guarding code when installing a
> > template, since VMware doesn’t actually allow you install that particular
> > template but cloudstack does. We'll keep you posted.
> >
> > Thanks,
> > Bobby.
> >
> > On 18.05.20, 23:01, "Marcus"  wrote:
> >
> > The issue sounds severe enough that a release note probably won't
> > suffice -
> > unless there's a documented way to recover we'd never want to
> > leave a
> > system susceptible to being unrecoverable, even if it's rarely
> > triggered.
> >
> > What's involved in "failing gracefully"? Is this a small fix, or
> an
> > overhaul?  Perhaps the new feature could be disabled for VMware,
> or
> > disabled altogether until a fix is made in a patch release.
> >
> > Does it only affect new templates, or is there a risk that an
> > existing
> > template out in vSphere could suddenly cause problems?
> >
> > On Mon, May 18, 2020 at 12:49 AM Boris Stoyanov <
> > boris.stoya...@shapeblue.com> wrote:
> >
> > > Hi guys,
> > >
> > > A little further info on this, it appears when we use a
> > corrupted template
> > > and UEFI/Legacy mode when deploy a VM, it breaks the connection
> > between
> > > cloudstack and vCenter.
> > >
> > > All hosts become unreachable and basically the cluster is not
> > functional,
> > > have not investigated a way to recover this but seems like a
> > huge mess..
> > > Please note that user is not able to register such template in
> > vCenter
> > > directly, but cloudstack allows using it.
> > >
> > > Open to discuss if we'll fix this, since it's expected users to
> > use
> > > working templates, I think we should be failing gracefully and
> > such action
> > > should not be able 

RE: Virtual machines volume lock manager

2020-05-19 Thread Sean Lair
Are you using NFS?

Yea, we implemented locking because of that problem:

https://libvirt.org/locking-lockd.html

echo lock_manager = \"lockd\" >> /etc/libvirt/qemu.conf
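
(For completeness, a slightly fuller sketch of that setup; the service names
match stock libvirt packaging on EL7, so adjust for your distro:)

    # have QEMU guests acquire disk leases through the lockd plugin
    echo 'lock_manager = "lockd"' >> /etc/libvirt/qemu.conf
    # run the lock daemon and restart libvirt to pick up the change
    systemctl enable --now virtlockd
    systemctl restart libvirtd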

-Original Message-
From: Andrija Panic  
Sent: Wednesday, October 30, 2019 6:55 AM
To: dev 
Cc: users 
Subject: Re: Virtual machines volume lock manager

I would advise trying to reproduce.

Start a migration, then either:
- configure the timeout so that it's way too low, so that the migration fails
due to timeouts, or
- restart the management server in the middle of the migration.

Either way the migration should fail, and you can observe whether you have
reproduced the problem (a sketch of the first option follows below). Keep in
mind that there might be some garbage left due to the failed migration not
being handled properly. But from the QEMU point of view, if the migration
fails, by all means the new VM should be destroyed...
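
(A sketch of that first option, assuming the relevant global setting is
'migratewait' (check Global Settings for the exact name on your version) and
that CloudMonkey is configured against the management server:)

    # 1) make migrations time out almost immediately
    cmk update configuration name=migratewait value=10
    # 2) restart the management server so the new value is picked up; doing
    #    this mid-migration is itself the second way to trigger the failure
    systemctl restart cloudstack-management
    # 3) start a migration of a VM with plenty of RAM so it has time to fail
    cmk migrate virtualmachine virtualmachineid=<vm-id> hostid=<dest-host-id>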



On Wed, 30 Oct 2019 at 11:31, Rakesh Venkatesh 

wrote:

> Hi Andrija
>
>
> Sorry for the late reply.
>
> I'm using ACS version 4.7, QEMU version 1:2.5+dfsg-5ubuntu10.40.
>
> I'm not sure if the ACS job failed or the libvirt job, as I didn't look
> into the logs. Yes, the VM will be in paused state during migration, but
> after the failed migration the same VM was in "running" state on two
> different hypervisors. We wrote a script to find out how many duplicated
> VMs were running and found out that more than 5 VMs had this issue (the
> sketch below shows the kind of check involved).
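
(A minimal sketch of such a duplicate check, assuming SSH access to the
hypervisors; the host names are placeholders:)

    # list running domains on every KVM host; any name printed by 'uniq -d'
    # is a VM running on more than one hypervisor at once
    for h in kvm01 kvm02 kvm03; do
        ssh "$h" virsh list --name
    done | grep -v '^$' | sort | uniq -d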
>
>
> On Mon, Oct 28, 2019 at 2:42 PM Andrija Panic 
> 
> wrote:
>
> > I've been running KVM public cloud up to recently and have never seen
> > such behaviour.
> >
> > What versions (ACS, qemu, libvirt) are you running?
> >
> > How does the migration fail - the ACS job, or the libvirt job?
> > The destination VM is by default always in PAUSED state until the
> > migration is finished; only then does the destination VM (on the new
> > host) get to RUNNING, after the original VM (on the old host) has been
> > paused.
> >
> > i.e.:
> > phase1  source vm RUNNING, destination vm PAUSED (RAM content being
> > copied over... takes time...)
> > phase2  source vm PAUSED, destination vm PAUSED (last bits of RAM
> > content are migrated)
> > phase3  source vm destroyed, destination VM RUNNING.
> >
> > Andrija
> >
> > On Mon, 28 Oct 2019 at 14:26, Rakesh Venkatesh wrote:
> >
> > > Hello Users
> > >
> > >
> > > Recently we have seen cases where when the Vm migration fails,
> cloudstack
> > > ends up running two instances of the same VM on different hypervisors.
> > The
> > > state will be "running" and not any other transition state. This 
> > > will
> of
> > > course lead to corruption of disk. Does CloudStack has any option 
> > > of
> > volume
> > > locking so that two instances of the same VM wont be running?
> > > Anyone else has faced this issue and found some solution to fix it?
> > >
> > > We are thinking of using "virtlockd" of libvirt or implementing 
> > > custom
> > lock
> > > mechanisms. There are some pros and cons of the both the solutions 
> > > and
> i
> > > want your feedback before proceeding further.
> > >
> > > --
> > > Thanks and regards
> > > Rakesh venkatesh
> > >
> >
> >
> > --
> >
> > Andrija Panić
> >
>
>
> --
> Thanks and regards
> Rakesh venkatesh
>


-- 

Andrija Panić


[GitHub] [cloudstack-primate] Android1968 commented on issue #62: Custom Upload Action - Template, ISO, volume

2020-05-19 Thread GitBox


Android1968 commented on issue #62:
URL: 
https://github.com/apache/cloudstack-primate/issues/62#issuecomment-631185276


   Guys, same error here. Any detailed solutions?
   ![iso 
error1](https://user-images.githubusercontent.com/58725374/82395961-4eb17c00-9a7f-11ea-9d06-292c6a88f685.png)
   ![iso 
error](https://user-images.githubusercontent.com/58725374/82395963-507b3f80-9a7f-11ea-9631-7470a549777b.png)
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org