Re: KVM host agent disconnection

2016-02-06 Thread Indra Pramana
Hi Wido,

Good day to you, and thanks for your reply. Nice to hear from you again. :)

So is this caused by a bug in 4.2 which has been resolved in newer versions
of ACS? Is there any specific information on the bug, e.g. a bug ID and a
description of how to fix it? Is there a way I can resolve the problem
without having to upgrade?

Is there any documentation I can follow on how to upgrade from 4.2 to 4.8?
Will this be quite straightforward, or will it involve many steps? We are
running a production environment and we don't have a staging/test
environment to play with.

Looking forward to your reply, thank you.

Cheers.

On Sat, Feb 6, 2016 at 3:48 PM, Wido den Hollander  wrote:

> Hi,
>
> > Op 5 februari 2016 om 17:24 schreef Indra Pramana :
> >
> >
> > Dear all,
> >
> > We are using CloudStack 4.2.0, the KVM hypervisor and Ceph RBD for
> > primary storage. Over the past week, many of our KVM host agents have
> > often been disconnected from the management server, causing the VMs to
> > go down because of HA work. We have had host disconnections in the past,
> > but normally they would affect just one host; this time round, when the
> > problem happens, it affects multiple hosts, up to 4-5 hosts at the same
> > time.
> >
>
> Any reason to still run 4.2? I've seen this happen as well, but I haven't
> seen it with recent versions of ACS.
>
> Could you maybe upgrade to 4.8?
>
> Wido
>
> > There is not much I can find in either management-server.log or
> > agent.log, with no significant warnings, errors or exceptions logged
> > before the disconnection. Here are the sample logs from the agent:
> >
> > ===
> > 2016-02-05 03:20:28,820 ERROR [cloud.agent.Agent] (UgentTask-7:null) Ping
> > Interval has gone past 30.  Attempting to reconnect.
> > 2016-02-05 03:20:28,825 DEBUG [cloud.agent.Agent] (UgentTask-7:null)
> > Clearing watch list: 2
> > 2016-02-05 03:20:28,825 DEBUG [utils.nio.NioConnection]
> > (Agent-Selector:null) Closing socket
> > Socket[addr=/*.*.3.3,port=8250,localport=50489]
> > 2016-02-05 03:20:33,825 INFO  [cloud.agent.Agent] (UgentTask-7:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:20:38,826 INFO  [cloud.agent.Agent] (UgentTask-7:null)
> > Reconnecting...
> > 2016-02-05 03:20:38,829 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > Connecting to *.*.3.3:8250
> > 2016-02-05 03:20:38,925 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > SSL: Handshake done
> > 2016-02-05 03:20:38,926 INFO  [utils.nio.NioClient] (Agent-Selector:null)
> > Connected to *.*.3.3:8250
> > 2016-02-05 03:20:43,926 INFO  [cloud.agent.Agent] (UgentTask-7:null)
> > Connected to the server
> > ===
> >
> > Sometimes, the CloudStack agent will not be able to reconnect unless we
> > stop and start the agent again manually:
> >
> > ===
> > 2016-02-05 03:22:20,330 ERROR [cloud.agent.Agent] (UgentTask-6:null) Ping
> > Interval has gone past 30.  Attempting to reconnect.
> > 2016-02-05 03:22:20,331 DEBUG [cloud.agent.Agent] (UgentTask-6:null)
> > Clearing watch list: 2
> > 2016-02-05 03:22:20,353 DEBUG [utils.nio.NioConnection]
> > (Agent-Selector:null) Closing socket
> > Socket[addr=/*.*.3.3,port=8250,localport=46231]
> > 2016-02-05 03:22:25,332 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:25,332 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:30,333 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:30,333 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:35,333 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:35,334 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:40,334 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:40,335 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:45,335 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:45,335 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:50,336 INFO  [cloud.agent.Agent] (UgentTask-6:null) Lost
> > connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:50,336 INFO  [cloud.agent.Agent] (UgentTask-6:null)
> Cannot
> > connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:55,337 INFO  [cloud.agent.Agent

[GitHub] cloudstack pull request: Check the existence of 'forceencap' param...

2016-02-06 Thread bhaisaab
Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1402#discussion_r52102132
  
--- Diff: systemvm/patches/debian/config/opt/cloud/bin/configure.py ---
@@ -531,6 +531,8 @@ def configure_ipsec(self, obj):
 file.addeq(" pfs=%s" % CsHelper.bool_to_yn(obj['dpd']))
 file.addeq(" keyingtries=2")
 file.addeq(" auto=start")
+if not obj.has_key('encap'):
--- End diff --

Consider using `if 'encap' not in obj` instead of `has_key()`. More Pythonic
that way ;)
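
A minimal sketch of the suggested change (the dict contents and the default
assigned inside the `if` are assumptions here, since the quoted diff
truncates right after the added line):

    # Membership test instead of has_key(); dict.has_key() was removed in
    # Python 3, so 'in' is also the forward-compatible spelling.
    obj = {'dpd': True}        # hypothetical IPsec config payload
    if 'encap' not in obj:     # rather than: if not obj.has_key('encap'):
        obj['encap'] = False   # hypothetical default; the quoted diff cuts off here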


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread mike-tutkowski
GitHub user mike-tutkowski opened a pull request:

https://github.com/apache/cloudstack/pull/1403

Taking fast and efficient volume snapshots with XenServer (and your storage 
provider)

A XenServer storage repository (SR) and virtual disk image (VDI) each have a
UUID that is immutable.

This poses a problem for SAN snapshots if you intend to mount the underlying
snapshot SR alongside the source SR (duplicate UUIDs).

VMware has a solution for this called re-signaturing (in other words, the
snapshot UUIDs can be changed).

This PR only deals with the CloudStack side of things, but it works in 
concert with a new XenServer storage manager created by CloudOps (this storage 
manager enables re-signaturing of XenServer SR and VDI UUIDs).

I have written Marvin integration tests to go along with this, but cannot 
yet check those into the CloudStack repo as they rely on SolidFire hardware.

If anyone would like to see these integration tests, please let me know.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mike-tutkowski/cloudstack xs-snapshots

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cloudstack/pull/1403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1403


commit 34dc4d7ca9cdc66fc16301d0f8a8ffef790e7462
Author: Mike Tutkowski 
Date:   2015-11-16T19:18:25Z

Support for backend snapshots with XenServer

commit c0737879d7e0a5a8e991ab1f97c16db10fc79132
Author: Mike Tutkowski 
Date:   2016-01-04T05:36:52Z

Initial changes to make use of both SolidFire snapshots and SolidFire 
cloning

commit bae4c2780d30194da6149b98248e6fcd0e4faa84
Author: Mike Tutkowski 
Date:   2015-11-02T23:58:09Z

Refactoring and enhancements to SolidFire Integration-Testing API Plug-in

commit 8c551edca3fa7ebdc86e69fd747bb8a14d4ed178
Author: Mike Tutkowski 
Date:   2016-01-05T08:32:39Z

Only "permanently" make use of cloning a volume from a snapshot when 
creating a new CloudStack volume from a volume snapshot

commit db3b66dfd6b172d7dd1a731abe200ef563f1439d
Author: Mike Tutkowski 
Date:   2016-01-07T19:13:53Z

Access-related optimization for creating a template from a snapshot

commit a0a8da3b0044d1b388a8a0aa9e8f492b067eb807
Author: Mike Tutkowski 
Date:   2016-01-07T19:16:48Z

Correction to used-space calculation

commit 054f110d47543c966a83cf797fea9a0d2046e6af
Author: Mike Tutkowski 
Date:   2016-01-08T01:27:48Z

Do not check for remaining volume snapshots before deleting a volume 
snapshot that is supported by a back-end volume (only do this when deleting a 
volume snapshot that is supported by a back-end snapshot)

commit d8cf010117f664854c82b68a2d4a0d9b4b1e7c25
Author: Mike Tutkowski 
Date:   2016-01-08T02:27:57Z

When deleting a CloudStack volume that is tied to a SolidFire volume, only 
delete the SolidFire volume if the CloudStack volume's volume snapshots (if 
any) are all supported by SolidFire volumes (as opposed to any being supported 
by SolidFire snapshots)

commit 8655fb1aa3a1cc6ee5443607836a74b103814b02
Author: Mike Tutkowski 
Date:   2016-01-08T20:22:43Z

"=" should be "=="

commit 836c9a5b8ae6b2b8175166ba1418b9f59314cb4b
Author: Mike Tutkowski 
Date:   2016-01-13T19:26:25Z

For integration-test purposes: Get snapshot details

commit 9528ea8d23ac41b7f5d9735ec29572781ee16e27
Author: Mike Tutkowski 
Date:   2016-01-20T02:07:42Z

Enabling support for arbitrary key/value pairs to more easily be stored for 
SolidFire volumes

commit acf15ed6af5ff7090b2b71bef9f70a72f87cab48
Author: Mike Tutkowski 
Date:   2016-01-20T22:57:32Z

Enable the use of arbitrary key/value pairs when creating a SolidFire 
volume via cloning and when creating a SolidFire snapshot

commit 1c9516ab27923caa845bf99a3a2eab406a9d7a6f
Author: Mike Tutkowski 
Date:   2016-01-20T23:57:12Z

Improved exception handling

commit 7625e188d264633beaf30b1ca04779c1890d02f6
Author: Mike Tutkowski 
Date:   2016-01-22T18:43:18Z

The way resigning metadata is invoked has changed. Call SR.create with type 
RELVMOISCSI. An exception should be thrown when the time would otherwise have 
come for the create functionality to attach the SR. Check if "success" is 
returned. If so, invoke SR.introduce; else, re-throw the exception.

commit a0fdf10246aa0a6f123b8583862322d30ecf0f38
Author: Mike Tutkowski 
Date:   2016-01-29T19:32:26Z

Check if hostVO == null (not if hostId == null)

commit e851a40caf1392afc27a6583b7e1786cdf579af1
Author: Mike Tutkowski 
Date:   2016-02-04T23:19:57Z

If the volume snapshot is backed by a SolidFire snapshot (as opposed to a 
SolidFire volume), then add it to the list.

commit 454f005ea3701b3ae47b8e0584eab658a331c5c0
Author: Mike Tutkowski 
Date:   2016-02-06T03:58:52Z

Correcting an issue with a rebase

---

Re: [RESULT][VOTE] Apache CloudStack 4.7.0

2016-02-06 Thread Sebastien Goasguen

> On Feb 5, 2016, at 3:08 AM, John Kinsella  wrote:
> 
> Did the announcements for 4.7/4.8 go out? I don’t see them on the mailing 
> lists or elsewhere?
> 


I don’t think it went out, nor do I think there were release notes for them
or an update to the website.

>> On Dec 17, 2015, at 8:37 AM, Remi Bergsma  
>> wrote:
>> 
>> Hi all,
>> 
>> After 72 hours, the vote for CloudStack 4.7.0 [1] *passes* with 5 PMC + 1 
>> non-PMC votes.
>> 
>> +1 (PMC / binding)
>> * Wilder
>> * Wido
>> * Milamber
>> * Rohit
>> * Remi
>> 
>> +1 (non binding)
>> * Boris
>> 
>> 0
>> * Abhinandan
>> * Dag
>> * Glenn
>> 
>> -1
>> Raja (has been discussed; seems to be a local test configuration issue)
>> 
>> Thanks to everyone participating.
>> 
>> I will now prepare the release announcement to go out after 24 hours to give 
>> the mirrors time to catch up.
>> 
>> [1] http://cloudstack.markmail.org/message/aahz3ajryvd7wzec
>> 
> 



Re: KVM host agent disconnection

2016-02-06 Thread Indra Pramana
Hi Wido and all,

Good day to you.

In addition to my previous email, I noted that the latest released version
of ACS is 4.7. May I know if the problem is resolved in 4.7? I don't think
4.8 is available from the ACS repository yet, unless we get the source and
compile it ourselves.

https://cloudstack.apache.org/downloads.html

I also noted that the latest version of ACS 4.7 only supports Ubuntu 14.04,
while we are using Ubuntu 12.04 for all our management servers and KVM
hosts. Will the latest version of ACS 4.7 work on Ubuntu 12.04?

I found the documentation below on how to upgrade from 4.2 to 4.7:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.7.0/upgrade/upgrade-4.2.html

It seems to be quite straightforward, but I noticed that the upgrade
involves installing new system VM templates and restarting all the system
VMs, which will cause downtime.

Has anyone performed an upgrade from ACS 4.2 to 4.7 before who is able to
share their experience and give some advice and/or tips?

Thank you.


On Sat, Feb 6, 2016 at 8:01 PM, Indra Pramana  wrote:

> Hi Wido,
>
> Good day to you, and thanks for your reply. Nice to hear from you again. :)
>
> So is this caused by a bug in 4.2 which has been resolved in newer versions
> of ACS? Is there any specific information on the bug, e.g. a bug ID and a
> description of how to fix it? Is there a way I can resolve the problem
> without having to upgrade?
>
> Is there any documentation I can follow on how to upgrade from 4.2 to 4.8?
> Will this be quite straightforward, or will it involve many steps? We are
> running a production environment and we don't have a staging/test
> environment to play with.
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
> On Sat, Feb 6, 2016 at 3:48 PM, Wido den Hollander  wrote:
>
>> Hi,
>>
>> > Op 5 februari 2016 om 17:24 schreef Indra Pramana :
>> >
>> >
>> > Dear all,
>> >
>> > We are using CloudStack 4.2.0, the KVM hypervisor and Ceph RBD for
>> > primary storage. Over the past week, many of our KVM host agents have
>> > often been disconnected from the management server, causing the VMs to
>> > go down because of HA work. We have had host disconnections in the past,
>> > but normally they would affect just one host; this time round, when the
>> > problem happens, it affects multiple hosts, up to 4-5 hosts at the same
>> > time.
>> >
>>
>> Any reason to still run 4.2? I've seen this happen as well, but I haven't
>> seen it with recent versions of ACS.
>>
>> Could you maybe upgrade to 4.8?
>>
>> Wido
>>
>> > There is not much I can find in either management-server.log or
>> > agent.log, with no significant warnings, errors or exceptions logged
>> > before the disconnection. Here are the sample logs from the agent:
>> >
>> > ===
>> > 2016-02-05 03:20:28,820 ERROR [cloud.agent.Agent] (UgentTask-7:null)
>> Ping
>> > Interval has gone past 30.  Attempting to reconnect.
>> > 2016-02-05 03:20:28,825 DEBUG [cloud.agent.Agent] (UgentTask-7:null)
>> > Clearing watch list: 2
>> > 2016-02-05 03:20:28,825 DEBUG [utils.nio.NioConnection]
>> > (Agent-Selector:null) Closing socket
>> > Socket[addr=/*.*.3.3,port=8250,localport=50489]
>> > 2016-02-05 03:20:33,825 INFO  [cloud.agent.Agent] (UgentTask-7:null)
>> Lost
>> > connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:20:38,826 INFO  [cloud.agent.Agent] (UgentTask-7:null)
>> > Reconnecting...
>> > 2016-02-05 03:20:38,829 INFO  [utils.nio.NioClient]
>> (Agent-Selector:null)
>> > Connecting to *.*.3.3:8250
>> > 2016-02-05 03:20:38,925 INFO  [utils.nio.NioClient]
>> (Agent-Selector:null)
>> > SSL: Handshake done
>> > 2016-02-05 03:20:38,926 INFO  [utils.nio.NioClient]
>> (Agent-Selector:null)
>> > Connected to *.*.3.3:8250
>> > 2016-02-05 03:20:43,926 INFO  [cloud.agent.Agent] (UgentTask-7:null)
>> > Connected to the server
>> > ===
>> >
>> > Sometimes, the CloudStack agent will not be able to reconnect unless we
>> > stop and start the agent again manually:
>> >
>> > ===
>> > 2016-02-05 03:22:20,330 ERROR [cloud.agent.Agent] (UgentTask-6:null)
>> Ping
>> > Interval has gone past 30.  Attempting to reconnect.
>> > 2016-02-05 03:22:20,331 DEBUG [cloud.agent.Agent] (UgentTask-6:null)
>> > Clearing watch list: 2
>> > 2016-02-05 03:22:20,353 DEBUG [utils.nio.NioConnection]
>> > (Agent-Selector:null) Closing socket
>> > Socket[addr=/*.*.3.3,port=8250,localport=46231]
>> > 2016-02-05 03:22:25,332 INFO  [cloud.agent.Agent] (UgentTask-6:null)
>> Lost
>> > connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:22:25,332 INFO  [cloud.agent.Agent] (UgentTask-6:null)
>> Cannot
>> > connect because we still have 3 commands in progress.
>> > 2016-02-05 03:22:30,333 INFO  [cloud.agent.Agent] (UgentTask-6:null)
>> Lost
>> > connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:22:30,333 INFO  [cloud.agent.Agent] (UgentTask-6:null)
>> Cannot
>> > connect because we still have

[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread mike-tutkowski
Github user mike-tutkowski commented on the pull request:

https://github.com/apache/cloudstack/pull/1403#issuecomment-180848498
  
Here's a copy of my Marvin integration tests (I added a .txt extension so
that I could upload the file, as uploading files of type .py is not
permitted):


[TestSnapshots.py.txt](https://github.com/apache/cloudstack/files/120425/TestSnapshots.py.txt)

Here are the most recent test results:

[results.txt](https://github.com/apache/cloudstack/files/120426/results.txt)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack pull request: CLOUDSTACK-8968: UI icon over VM snapshot...

2016-02-06 Thread rodrigo93
Github user rodrigo93 commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1320#discussion_r52106924
  
--- Diff: ui/scripts/instanceWizard.js ---
@@ -294,53 +295,67 @@
 
         // Step 3: Service offering
         function(args) {
-            selectedTemplateObj = null; //reset
-            if (args.currentData["select-template"] == "select-template") {
-                if (featuredTemplateObjs != null && featuredTemplateObjs.length > 0) {
-                    for (var i = 0; i < featuredTemplateObjs.length; i++) {
-                        if (featuredTemplateObjs[i].id == args.currentData.templateid) {
-                            selectedTemplateObj = featuredTemplateObjs[i];
-                            break;
+            snapshotObjs = null;
+            selectedSnapshotObj = null;
+
+            if (args.moreArgs && args.moreArgs.snapshot)
+            {
+                zoneObjs = args.moreArgs.zone;
+                selectedZoneObj = zoneObjs[0];
+                hypervisorObjs = args.moreArgs.hypervisor;
+                selectedHypervisor = hypervisorObjs[0].name;
+                snapshotObjs = args.moreArgs.snapshot;
+                selectedSnapshotObj = snapshotObjs[0];
+            }
+            else {
+                selectedTemplateObj = null; //reset
+                if (args.currentData["select-template"] == "select-template") {
+                    if (featuredTemplateObjs != null && featuredTemplateObjs.length > 0) {
+                        for (var i = 0; i < featuredTemplateObjs.length; i++) {
+                            if (featuredTemplateObjs[i].id == args.currentData.templateid) {
+                                selectedTemplateObj = featuredTemplateObjs[i];
+                                break;
+                            }
                         }
                     }
-                }
-                if (selectedTemplateObj == null) {
-                    if (communityTemplateObjs != null && communityTemplateObjs.length > 0) {
-                        for (var i = 0; i < communityTemplateObjs.length; i++) {
-                            if (communityTemplateObjs[i].id == args.currentData.templateid) {
-                                selectedTemplateObj = communityTemplateObjs[i];
-                                break;
+                    if (selectedTemplateObj == null) {
--- End diff --

Hi @nitin-maharana,
Couldn't this _if_ and the following _ifs_ be merged into one?
Like:

> if (selectedTemplateObj == null && communityTemplateObjs != null && communityTemplateObjs.length > 0)
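
A sketch of the merged guard (the loop body is taken verbatim from the
quoted diff; only the two `if`s are folded into one condition):

    // Only scan the community templates if nothing was selected from the
    // featured templates yet; the null/empty checks collapse into one guard.
    if (selectedTemplateObj == null && communityTemplateObjs != null && communityTemplateObjs.length > 0) {
        for (var i = 0; i < communityTemplateObjs.length; i++) {
            if (communityTemplateObjs[i].id == args.currentData.templateid) {
                selectedTemplateObj = communityTemplateObjs[i];
                break;
            }
        }
    }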


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1403#discussion_r52108568
  
--- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java ---
@@ -180,70 +208,119 @@ private Void handleCreateTemplateFromSnapshot(SnapshotInfo snapshotInfo, Templat
             throw new CloudRuntimeException("This snapshot is not currently in a state where it can be used to create a template.");
         }
 
-        HostVO hostVO = getHost(snapshotInfo.getDataStore().getId());
-        DataStore srcDataStore = snapshotInfo.getDataStore();
-
-        String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString());
-        int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue()));
-        CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), templateInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value());
+        HostVO hostVO = getXenServerHost(snapshotInfo);
 
-        String errMsg = null;
+        boolean usingBackendSnapshot = usingBackendSnapshotFor(snapshotInfo);
+        boolean computeClusterSupportsResign = computeClusterSupportsResign(hostVO.getClusterId());
 
-        CopyCmdAnswer copyCmdAnswer = null;
+        if (usingBackendSnapshot && !computeClusterSupportsResign) {
+            throw new CloudRuntimeException("Unable to locate an applicable host with which to perform a resignature operation");
+        }
 
         try {
-            _volumeService.grantAccess(snapshotInfo, hostVO, srcDataStore);
+            if (usingBackendSnapshot) {
+                createVolumeFromSnapshot(hostVO, snapshotInfo, true);
+            }
 
-            Map<String, String> srcDetails = getSnapshotDetails(_storagePoolDao.findById(srcDataStore.getId()), snapshotInfo);
+            DataStore srcDataStore = snapshotInfo.getDataStore();
 
-            copyCommand.setOptions(srcDetails);
+            String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString());
+            int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue()));
+            CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), templateInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value());
+
+            String errMsg = null;
+
+            CopyCmdAnswer copyCmdAnswer = null;
 
-            copyCmdAnswer = (CopyCmdAnswer)_agentMgr.send(hostVO.getId(), copyCommand);
-        }
-        catch (Exception ex) {
-            throw new CloudRuntimeException(ex.getMessage());
-        }
-        finally {
             try {
-                _volumeService.revokeAccess(snapshotInfo, hostVO, srcDataStore);
+                // If we are using a back-end snapshot, then we should still have access to it from the hosts in the cluster that hostVO is in
+                // (because we passed in true as the third parameter to createVolumeFromSnapshot above).
+                if (usingBackendSnapshot == false) {
+                    _volumeService.grantAccess(snapshotInfo, hostVO, srcDataStore);
+                }
+
+                Map<String, String> srcDetails = getSnapshotDetails(snapshotInfo);
+
+                copyCommand.setOptions(srcDetails);
+
+                copyCmdAnswer = (CopyCmdAnswer)_agentMgr.send(hostVO.getId(), copyCommand);
             }
             catch (Exception ex) {
-                s_logger.debug(ex.getMessage(), ex);
+                throw new CloudRuntimeException(ex.getMessage());
             }
-
-            if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) {
-                if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) {
-                    errMsg = copyCmdAnswer.getDetails();
+            finally {
+                try {
+                    _volumeService.revokeAccess(snapshotInfo, hostVO, srcDataStore);
                 }
-                else {
-                    errMsg = "Unable to perform host-side operation";
+                catch (Exception ex) {
+                    s_logger.debug(ex.getMessage(), ex);
                 }
-            }
 
-            try {
-                if (errMsg == null) {
-                    snapshotInfo.processEvent(Event.OperationSuccessed);
+                if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) {
+                    if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.g

[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1403#discussion_r52108693
  
--- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java ---
@@ -255,99 +332,149 @@ private Void handleCreateVolumeFromSnapshotBothOnStorageSystem(SnapshotInfo snap
 
                 VolumeApiResult result = future.get();
 
+                if (volumeDetail != null) {
+                    _volumeDetailsDao.remove(volumeDetail.getId());
+                }
+
                 if (result.isFailed()) {
                     s_logger.debug("Failed to create a volume: " + result.getResult());
 
                     throw new CloudRuntimeException(result.getResult());
                 }
-            }
-            catch (Exception ex) {
-                throw new CloudRuntimeException(ex.getMessage());
-            }
-
-            volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            volumeInfo.processEvent(Event.MigrationRequested);
+                volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
+                volumeInfo.processEvent(Event.MigrationRequested);
 
-            HostVO hostVO = getHost(snapshotInfo.getDataStore().getId());
+                volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString());
-            int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue()));
-            CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), volumeInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value());
+                if (useCloning) {
+                    copyCmdAnswer = performResignature(volumeInfo, hostVO);
+                }
+                else {
+                    // asking for a XenServer host here so we don't always prefer to use XenServer hosts that support resigning
+                    // even when we don't need those hosts to do this kind of copy work
+                    hostVO = getXenServerHost(snapshotInfo.getDataCenterId(), false);
 
-            CopyCmdAnswer copyCmdAnswer = null;
+                    copyCmdAnswer = performCopyOfVdi(volumeInfo, snapshotInfo, hostVO);
+                }
 
-            try {
-                _volumeService.grantAccess(snapshotInfo, hostVO, snapshotInfo.getDataStore());
-                _volumeService.grantAccess(volumeInfo, hostVO, volumeInfo.getDataStore());
+                if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) {
+                    if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) {
+                        errMsg = copyCmdAnswer.getDetails();
+                    }
+                    else {
+                        errMsg = "Unable to perform host-side operation";
+                    }
+                }
+            }
+            catch (Exception ex) {
+                errMsg = ex.getMessage() != null ? ex.getMessage() : "Copy operation failed";
+            }
 
-                Map<String, String> srcDetails = getSnapshotDetails(_storagePoolDao.findById(snapshotInfo.getDataStore().getId()), snapshotInfo);
+            CopyCommandResult result = new CopyCommandResult(null, copyCmdAnswer);
 
-                copyCommand.setOptions(srcDetails);
+            result.setResult(errMsg);
 
-                Map<String, String> destDetails = getVolumeDetails(volumeInfo);
+            callback.complete(result);
+        }
 
-                copyCommand.setOptions2(destDetails);
+        // If the underlying storage system is making use of read-only snapshots, this gives the storage system the opportunity to
+        // create a volume from the snapshot so that we can copy the VHD file that should be inside of the snapshot to secondary storage.
+        //
+        // The resultant volume must be writable because we need to resign the SR and the VDI that should be inside of it before we copy
+        // the VHD file to secondary storage.
+        //
+        // If the storage system is using writable snapshots, then nothing need be done by that storage system here because we can just
+        // resign the SR and the VDI that should be inside of the snapshot before copying the VHD file to secondary storage.
+        private void createVolumeFromSnapshot(HostVO hostVO, SnapshotInfo snapshotInfo, boolean keepGrantedAccess) {
--- End diff --

@mike-tutkowski Could you please use Javadoc (`/**  */`) instead of
comments (`// `)? As you already wrote a good documenta

[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1403#discussion_r52108776
  
--- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java ---
@@ -255,99 +332,149 @@ private Void handleCreateVolumeFromSnapshotBothOnStorageSystem(SnapshotInfo snap
 
                 VolumeApiResult result = future.get();
 
+                if (volumeDetail != null) {
+                    _volumeDetailsDao.remove(volumeDetail.getId());
+                }
+
                 if (result.isFailed()) {
                     s_logger.debug("Failed to create a volume: " + result.getResult());
 
                     throw new CloudRuntimeException(result.getResult());
                 }
-            }
-            catch (Exception ex) {
-                throw new CloudRuntimeException(ex.getMessage());
-            }
-
-            volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            volumeInfo.processEvent(Event.MigrationRequested);
+                volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
+                volumeInfo.processEvent(Event.MigrationRequested);
 
-            HostVO hostVO = getHost(snapshotInfo.getDataStore().getId());
+                volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore());
 
-            String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString());
-            int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue()));
-            CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), volumeInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value());
+                if (useCloning) {
+                    copyCmdAnswer = performResignature(volumeInfo, hostVO);
+                }
+                else {
+                    // asking for a XenServer host here so we don't always prefer to use XenServer hosts that support resigning
+                    // even when we don't need those hosts to do this kind of copy work
+                    hostVO = getXenServerHost(snapshotInfo.getDataCenterId(), false);
 
-            CopyCmdAnswer copyCmdAnswer = null;
+                    copyCmdAnswer = performCopyOfVdi(volumeInfo, snapshotInfo, hostVO);
+                }
 
-            try {
-                _volumeService.grantAccess(snapshotInfo, hostVO, snapshotInfo.getDataStore());
-                _volumeService.grantAccess(volumeInfo, hostVO, volumeInfo.getDataStore());
+                if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) {
+                    if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) {
+                        errMsg = copyCmdAnswer.getDetails();
+                    }
+                    else {
+                        errMsg = "Unable to perform host-side operation";
+                    }
+                }
+            }
+            catch (Exception ex) {
+                errMsg = ex.getMessage() != null ? ex.getMessage() : "Copy operation failed";
+            }
 
-                Map<String, String> srcDetails = getSnapshotDetails(_storagePoolDao.findById(snapshotInfo.getDataStore().getId()), snapshotInfo);
+            CopyCommandResult result = new CopyCommandResult(null, copyCmdAnswer);
 
-                copyCommand.setOptions(srcDetails);
+            result.setResult(errMsg);
 
-                Map<String, String> destDetails = getVolumeDetails(volumeInfo);
+            callback.complete(result);
+        }
 
-                copyCommand.setOptions2(destDetails);
+        // If the underlying storage system is making use of read-only snapshots, this gives the storage system the opportunity to
+        // create a volume from the snapshot so that we can copy the VHD file that should be inside of the snapshot to secondary storage.
+        //
+        // The resultant volume must be writable because we need to resign the SR and the VDI that should be inside of it before we copy
+        // the VHD file to secondary storage.
+        //
+        // If the storage system is using writable snapshots, then nothing need be done by that storage system here because we can just
+        // resign the SR and the VDI that should be inside of the snapshot before copying the VHD file to secondary storage.
+        private void createVolumeFromSnapshot(HostVO hostVO, SnapshotInfo snapshotInfo, boolean keepGrantedAccess) {
+            SnapshotDetailsVO snapshotDetails = handleSnapshotDetails(snapshotInfo.getId(), "tempVolume", "create");
 
-            copyCmdAnswer =

[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1403#discussion_r52108928
  
--- Diff: plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/resource/CitrixResourceBase.java ---
@@ -168,7 +168,9 @@
 public abstract class CitrixResourceBase implements ServerResource, HypervisorResource, VirtualRouterDeployer {
 
     public enum SRType {
-        EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI, NFS;
+        // RELVMOISCSI = used for resigning metadata (like SR UUID and VDI UUID when a
+        // particular storage manager is installed on a XenServer host (for back-end snapshots to work))
+        EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI, RELVMOISCSI, NFS;
--- End diff --

@mike-tutkowski Sorry if I am being too repetitive.

As an idea, these comments might serve well as a Javadoc block documenting
the enum class. If you like this idea, the same could be done with the
**org.apache.cloudstack.engine.subsystem.api.storage.DataStoreCapabilities**
enum class.

Thanks.
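
As a sketch, the same suggestion applied to SRType could look like this
(wording taken from the comment in the quoted diff; the rest of the enum
body is elided):

    /**
     * SR types understood by this resource. RELVMOISCSI is used for resigning
     * metadata (like the SR UUID and VDI UUID) when a particular storage manager
     * is installed on a XenServer host (needed for back-end snapshots to work).
     */
    public enum SRType {
        EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI, RELVMOISCSI, NFS;
    }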


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack pull request: CLOUDSTACK-9120 READ.ME files describing ...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on the pull request:

https://github.com/apache/cloudstack/pull/1202#issuecomment-180890324
  
LGTM, based on the lack of code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] cloudstack pull request: Bug-ID: CLOUDSTACK-8870: Skip external de...

2016-02-06 Thread GabrielBrascher
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/846#discussion_r52110260
  
--- Diff: server/src/com/cloud/network/ExternalDeviceUsageManagerImpl.java ---
@@ -342,6 +342,12 @@ public ExternalDeviceNetworkUsageTask() {
 
 @Override
 protected void runInContext() {
+//Check if there are any external deivces
+//Skip external device usage collection if none exist
--- End diff --

@kishankavala Could you please change "deivces" to "devices"?

Also, you could use these commented lines as a Javadoc block describing the
runInContext() method (if you think it would improve your code).

Except for that typo, your code seems OK.
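
For instance, a Javadoc version of those comments might read (the method
body is elided here):

    /**
     * Checks whether any external devices exist and skips external device
     * usage collection if none do.
     */
    @Override
    protected void runInContext() {
        // usage-collection logic unchanged
    }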


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


best practices in usage server

2016-02-06 Thread Alireza Eskandari
Hi,
I have a 2-node cluster of CS with a separate MariaDB cluster based on
Galera as the database.
What is your recommendation for installing the CS usage server?
Should I install it on a separate server or on the CS nodes?
Is it necessary to install it on both CS nodes?
Regards