Re: KVM host agent disconnection
Hi Wido,

Good day to you, and thanks for your reply. Nice to hear from you again. :)

So is this caused by a bug in 4.2 which is resolved in newer versions of ACS? Any specific information on the bug, e.g. the bug ID and a description of how to fix it? Is there a way I can resolve the problem without having to upgrade?

Is there any documentation I can follow on how to upgrade from 4.2 to 4.8? Will this be quite straightforward, or will it involve many steps? We are running a production environment and we don't have a staging/test environment to play with.

Looking forward to your reply, thank you.

Cheers.

On Sat, Feb 6, 2016 at 3:48 PM, Wido den Hollander wrote:

> Hi,
>
> > Op 5 februari 2016 om 17:24 schreef Indra Pramana :
> >
> > Dear all,
> >
> > We are using CloudStack 4.2.0, KVM hypervisor and Ceph RBD for primary
> > storage. In the past one week, many of our KVM host agents would often be
> > disconnected from the management server, causing the VMs to go down
> > because of HA work. While we used to have host disconnection in the past,
> > normally it would only affect just one host, but this time round, when the
> > problem happens, it would happen on multiple hosts, up to 4-5 hosts at the
> > same time.
>
> Any reason to still run 4.2? I've seen this happen as well and I haven't
> seen this with recent versions of ACS.
>
> Could you maybe upgrade to 4.8?
>
> Wido
>
> > Nothing much I can find on both the management-server.log and agent.log,
> > with no significant warn, error or exceptions logged before the
> > disconnection. Here are the sample logs from the agent:
> >
> > ===
> > 2016-02-05 03:20:28,820 ERROR [cloud.agent.Agent] (UgentTask-7:null) Ping Interval has gone past 30. Attempting to reconnect.
> > 2016-02-05 03:20:28,825 DEBUG [cloud.agent.Agent] (UgentTask-7:null) Clearing watch list: 2
> > 2016-02-05 03:20:28,825 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing socket Socket[addr=/*.*.3.3,port=8250,localport=50489]
> > 2016-02-05 03:20:33,825 INFO [cloud.agent.Agent] (UgentTask-7:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:20:38,826 INFO [cloud.agent.Agent] (UgentTask-7:null) Reconnecting...
> > 2016-02-05 03:20:38,829 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to *.*.3.3:8250
> > 2016-02-05 03:20:38,925 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
> > 2016-02-05 03:20:38,926 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to *.*.3.3:8250
> > 2016-02-05 03:20:43,926 INFO [cloud.agent.Agent] (UgentTask-7:null) Connected to the server
> > ===
> >
> > Sometimes, the Cloudstack agent will not be able to re-connect unless if we
> > stop and start the agent again manually:
> >
> > ===
> > 2016-02-05 03:22:20,330 ERROR [cloud.agent.Agent] (UgentTask-6:null) Ping Interval has gone past 30. Attempting to reconnect.
> > 2016-02-05 03:22:20,331 DEBUG [cloud.agent.Agent] (UgentTask-6:null) Clearing watch list: 2
> > 2016-02-05 03:22:20,353 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing socket Socket[addr=/*.*.3.3,port=8250,localport=46231]
> > 2016-02-05 03:22:25,332 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:25,332 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:30,333 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:30,333 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:35,333 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:35,334 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:40,334 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:40,335 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:45,335 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:45,335 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:50,336 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
> > 2016-02-05 03:22:50,336 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
> > 2016-02-05 03:22:55,337 INFO [cloud.agent.Agent
[GitHub] cloudstack pull request: Check the existence of 'forceencap' param...
Github user bhaisaab commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1402#discussion_r52102132

--- Diff: systemvm/patches/debian/config/opt/cloud/bin/configure.py ---
@@ -531,6 +531,8 @@ def configure_ipsec(self, obj):
file.addeq(" pfs=%s" % CsHelper.bool_to_yn(obj['dpd']))
file.addeq(" keyingtries=2")
file.addeq(" auto=start")
+if not obj.has_key('encap'):
--- End diff --

Consider using: if 'encap' not in obj, instead of has_key(). More pythonic that way ;)
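A minimal sketch of the suggestion, for anyone following along. The diff only shows the added check, not the body of the branch, so the 'encap' default assigned below is purely illustrative and not taken from the PR:

# configure_ipsec() receives a dict-like ipsec settings object; 'obj' here is
# a stand-in with hypothetical contents.
obj = {'dpd': True}

# Form added in the PR (dict.has_key() exists on Python 2 only):
if not obj.has_key('encap'):
    obj['encap'] = False      # illustrative default, not from the PR

# Form suggested in the review (idiomatic, and also valid on Python 3):
if 'encap' not in obj:
    obj['encap'] = False      # illustrative default, not from the PR

Both forms test for the presence of the key; the membership test is simply the more Pythonic spelling.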
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
GitHub user mike-tutkowski opened a pull request:

https://github.com/apache/cloudstack/pull/1403

Taking fast and efficient volume snapshots with XenServer (and your storage provider)

A XenServer storage repository (SR) and virtual disk image (VDI) each have UUIDs that are immutable. This poses a problem for SAN snapshots, if you intend on mounting the underlying snapshot SR alongside the source SR (duplicate UUIDs). VMware has a solution for this called re-signaturing (so, in other words, the snapshot UUIDs can be changed).

This PR only deals with the CloudStack side of things, but it works in concert with a new XenServer storage manager created by CloudOps (this storage manager enables re-signaturing of XenServer SR and VDI UUIDs).

I have written Marvin integration tests to go along with this, but cannot yet check those into the CloudStack repo as they rely on SolidFire hardware. If anyone would like to see these integration tests, please let me know.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mike-tutkowski/cloudstack xs-snapshots

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1403.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1403

commit 34dc4d7ca9cdc66fc16301d0f8a8ffef790e7462
Author: Mike Tutkowski
Date: 2015-11-16T19:18:25Z

    Support for backend snapshots with XenServer

commit c0737879d7e0a5a8e991ab1f97c16db10fc79132
Author: Mike Tutkowski
Date: 2016-01-04T05:36:52Z

    Initial changes to make use of both SolidFire snapshots and SolidFire cloning

commit bae4c2780d30194da6149b98248e6fcd0e4faa84
Author: Mike Tutkowski
Date: 2015-11-02T23:58:09Z

    Refactoring and enhancements to SolidFire Integration-Testing API Plug-in

commit 8c551edca3fa7ebdc86e69fd747bb8a14d4ed178
Author: Mike Tutkowski
Date: 2016-01-05T08:32:39Z

    Only "permanently" make use of cloning a volume from a snapshot when creating a new CloudStack volume from a volume snapshot

commit db3b66dfd6b172d7dd1a731abe200ef563f1439d
Author: Mike Tutkowski
Date: 2016-01-07T19:13:53Z

    Access-related optimization for creating a template from a snapshot

commit a0a8da3b0044d1b388a8a0aa9e8f492b067eb807
Author: Mike Tutkowski
Date: 2016-01-07T19:16:48Z

    Correction to used-space calculation

commit 054f110d47543c966a83cf797fea9a0d2046e6af
Author: Mike Tutkowski
Date: 2016-01-08T01:27:48Z

    Do not check for remaining volume snapshots before deleting a volume snapshot that is supported by a back-end volume (only do this when deleting a volume snapshot that is supported by a back-end snapshot)

commit d8cf010117f664854c82b68a2d4a0d9b4b1e7c25
Author: Mike Tutkowski
Date: 2016-01-08T02:27:57Z

    When deleting a CloudStack volume that is tied to a SolidFire volume, only delete the SolidFire volume if the CloudStack volume's volume snapshots (if any) are all supported by SolidFire volumes (as opposed to any being supported by SolidFire snapshots)

commit 8655fb1aa3a1cc6ee5443607836a74b103814b02
Author: Mike Tutkowski
Date: 2016-01-08T20:22:43Z

    "=" should be "=="

commit 836c9a5b8ae6b2b8175166ba1418b9f59314cb4b
Author: Mike Tutkowski
Date: 2016-01-13T19:26:25Z

    For integration-test purposes: Get snapshot details

commit 9528ea8d23ac41b7f5d9735ec29572781ee16e27
Author: Mike Tutkowski
Date: 2016-01-20T02:07:42Z

    Enabling support for arbitrary key/value pairs to more easily be stored for SolidFire volumes

commit acf15ed6af5ff7090b2b71bef9f70a72f87cab48
Author: Mike Tutkowski
Date: 2016-01-20T22:57:32Z

    Enable the use of arbitrary key/value pairs when creating a SolidFire volume via cloning and when creating a SolidFire snapshot

commit 1c9516ab27923caa845bf99a3a2eab406a9d7a6f
Author: Mike Tutkowski
Date: 2016-01-20T23:57:12Z

    Improved exception handling

commit 7625e188d264633beaf30b1ca04779c1890d02f6
Author: Mike Tutkowski
Date: 2016-01-22T18:43:18Z

    The way resigning metadata is invoked has changed. Call SR.create with type RELVMOISCSI. An exception should be thrown when the time would otherwise have come for the create functionality to attach the SR. Check if "success" is returned. If so, invoke SR.introduce; else, re-throw the exception.

commit a0fdf10246aa0a6f123b8583862322d30ecf0f38
Author: Mike Tutkowski
Date: 2016-01-29T19:32:26Z

    Check if hostVO == null (not if hostId == null)

commit e851a40caf1392afc27a6583b7e1786cdf579af1
Author: Mike Tutkowski
Date: 2016-02-04T23:19:57Z

    If the volume snapshot is backed by a SolidFire snapshot (as opposed to a SolidFire volume), then add it to the list.

commit 454f005ea3701b3ae47b8e0584eab658a331c5c0
Author: Mike Tutkowski
Date: 2016-02-06T03:58:52Z

    Correcting an issue with a rebase
Re: [RESULT][VOTE] Apache CloudStack 4.7.0
> On Feb 5, 2016, at 3:08 AM, John Kinsella wrote:
>
> Did the announcements for 4.7/4.8 go out? I don’t see them on the mailing
> lists or elsewhere?
>
I don’t think it went out, nor do I think there were RN for them or an update to the website

>> On Dec 17, 2015, at 8:37 AM, Remi Bergsma wrote:
>>
>> Hi all,
>>
>> After 72 hours, the vote for CloudStack 4.7.0 [1] *passes* with 5 PMC + 1
>> non-PMC votes.
>>
>> +1 (PMC / binding)
>> * Wilder
>> * Wido
>> * Milamber
>> * Rohit
>> * Remi
>>
>> +1 (non binding)
>> * Boris
>>
>> 0
>> * Abhinandan
>> * Dag
>> * Glenn
>>
>> -1
>> Raja (has been discussed, seems local test configure issue)
>>
>> Thanks to everyone participating.
>>
>> I will now prepare the release announcement to go out after 24 hours to give
>> the mirrors time to catch up.
>>
>> [1] http://cloudstack.markmail.org/message/aahz3ajryvd7wzec
Re: KVM host agent disconnection
Hi Wido and all,

Good day to you.

In addition to my previous email, I noted that the latest released version of ACS is 4.7. May I know if the problem is resolved in 4.7? I don't think 4.8 is available yet from the ACS repository, unless we get the source and compile it ourselves.

https://cloudstack.apache.org/downloads.html

I also noted that the latest version of ACS, 4.7, only supports Ubuntu 14.04; we are using Ubuntu 12.04 for all our management servers and KVM host agents. Will ACS 4.7 work on Ubuntu 12.04?

I found the documentation below on how to upgrade from 4.2 to 4.7:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/4.7.0/upgrade/upgrade-4.2.html

It seems to be quite straightforward, but I noticed that the upgrade involves installing new system VM templates and restarting all the system VMs, which will cause downtime.

Has anyone performed an upgrade from ACS 4.2 to 4.7 before and is able to share their experience and give some advice and/or tips?

Thank you.

On Sat, Feb 6, 2016 at 8:01 PM, Indra Pramana wrote:

> Hi Wido,
>
> Good day to you, and thanks for your reply. Nice to hear from you again. :)
>
> So is this caused by a bug on 4.2 which is resolved on newer version of
> ACS? Any specific information on the bug, e.g. bug ID and description on
> how to fix it? Is there a way I can resolve the problem without having to
> upgrade?
>
> Is there any documentation I can follow on how to upgrade from 4.2 to 4.8?
> Will this be quite straight-forward or will this involve many steps? We are
> running a production environment and we don't have staging / test
> environment to play with.
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
> On Sat, Feb 6, 2016 at 3:48 PM, Wido den Hollander wrote:
>
>> Hi,
>>
>> > Op 5 februari 2016 om 17:24 schreef Indra Pramana :
>> >
>> > Dear all,
>> >
>> > We are using CloudStack 4.2.0, KVM hypervisor and Ceph RBD for primary
>> > storage. In the past one week, many of our KVM host agents would often be
>> > disconnected from the management server, causing the VMs to go down
>> > because of HA work. While we used to have host disconnection in the past,
>> > normally it would only affect just one host, but this time round, when the
>> > problem happens, it would happen on multiple hosts, up to 4-5 hosts at the
>> > same time.
>>
>> Any reason to still run 4.2? I've seen this happen as well and I haven't
>> seen this with recent versions of ACS.
>>
>> Could you maybe upgrade to 4.8?
>>
>> Wido
>>
>> > Nothing much I can find on both the management-server.log and agent.log,
>> > with no significant warn, error or exceptions logged before the
>> > disconnection. Here are the sample logs from the agent:
>> >
>> > ===
>> > 2016-02-05 03:20:28,820 ERROR [cloud.agent.Agent] (UgentTask-7:null) Ping Interval has gone past 30. Attempting to reconnect.
>> > 2016-02-05 03:20:28,825 DEBUG [cloud.agent.Agent] (UgentTask-7:null) Clearing watch list: 2
>> > 2016-02-05 03:20:28,825 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing socket Socket[addr=/*.*.3.3,port=8250,localport=50489]
>> > 2016-02-05 03:20:33,825 INFO [cloud.agent.Agent] (UgentTask-7:null) Lost connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:20:38,826 INFO [cloud.agent.Agent] (UgentTask-7:null) Reconnecting...
>> > 2016-02-05 03:20:38,829 INFO [utils.nio.NioClient] (Agent-Selector:null) Connecting to *.*.3.3:8250
>> > 2016-02-05 03:20:38,925 INFO [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
>> > 2016-02-05 03:20:38,926 INFO [utils.nio.NioClient] (Agent-Selector:null) Connected to *.*.3.3:8250
>> > 2016-02-05 03:20:43,926 INFO [cloud.agent.Agent] (UgentTask-7:null) Connected to the server
>> > ===
>> >
>> > Sometimes, the Cloudstack agent will not be able to re-connect unless if we
>> > stop and start the agent again manually:
>> >
>> > ===
>> > 2016-02-05 03:22:20,330 ERROR [cloud.agent.Agent] (UgentTask-6:null) Ping Interval has gone past 30. Attempting to reconnect.
>> > 2016-02-05 03:22:20,331 DEBUG [cloud.agent.Agent] (UgentTask-6:null) Clearing watch list: 2
>> > 2016-02-05 03:22:20,353 DEBUG [utils.nio.NioConnection] (Agent-Selector:null) Closing socket Socket[addr=/*.*.3.3,port=8250,localport=46231]
>> > 2016-02-05 03:22:25,332 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:22:25,332 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have 3 commands in progress.
>> > 2016-02-05 03:22:30,333 INFO [cloud.agent.Agent] (UgentTask-6:null) Lost connection to the server. Dealing with the remaining commands...
>> > 2016-02-05 03:22:30,333 INFO [cloud.agent.Agent] (UgentTask-6:null) Cannot connect because we still have
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
Github user mike-tutkowski commented on the pull request:

https://github.com/apache/cloudstack/pull/1403#issuecomment-180848498

Here's a copy of my Marvin integration tests (I added a .txt file type so that I could upload the file as it's not permitted to upload a file of type .py):

[TestSnapshots.py.txt](https://github.com/apache/cloudstack/files/120425/TestSnapshots.py.txt)

Here are the most recent test results:

[results.txt](https://github.com/apache/cloudstack/files/120426/results.txt)
[GitHub] cloudstack pull request: CLOUDSTACK-8968: UI icon over VM snapshot...
Github user rodrigo93 commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1320#discussion_r52106924

--- Diff: ui/scripts/instanceWizard.js ---
@@ -294,53 +295,67 @@
// Step 3: Service offering
function(args) {
-selectedTemplateObj = null; //reset
-if (args.currentData["select-template"] == "select-template") {
-if (featuredTemplateObjs != null && featuredTemplateObjs.length > 0) {
-for (var i = 0; i < featuredTemplateObjs.length; i++) {
-if (featuredTemplateObjs[i].id == args.currentData.templateid) {
-selectedTemplateObj = featuredTemplateObjs[i];
-break;
+snapshotObjs = null;
+selectedSnapshotObj = null;
+
+if (args.moreArgs && args.moreArgs.snapshot)
+{
+zoneObjs = args.moreArgs.zone;
+selectedZoneObj = zoneObjs[0];
+hypervisorObjs = args.moreArgs.hypervisor;
+selectedHypervisor = hypervisorObjs[0].name;
+snapshotObjs = args.moreArgs.snapshot;
+selectedSnapshotObj = snapshotObjs[0];
+}
+else {
+selectedTemplateObj = null; //reset
+if (args.currentData["select-template"] == "select-template") {
+if (featuredTemplateObjs != null && featuredTemplateObjs.length > 0) {
+for (var i = 0; i < featuredTemplateObjs.length; i++) {
+if (featuredTemplateObjs[i].id == args.currentData.templateid) {
+selectedTemplateObj = featuredTemplateObjs[i];
+break;
+}
}
}
-}
-if (selectedTemplateObj == null) {
-if (communityTemplateObjs != null && communityTemplateObjs.length > 0) {
-for (var i = 0; i < communityTemplateObjs.length; i++) {
-if (communityTemplateObjs[i].id == args.currentData.templateid) {
-selectedTemplateObj = communityTemplateObjs[i];
-break;
+if (selectedTemplateObj == null) {
--- End diff --

Hi @nitin-maharana

Couldn't this _if_ and the following _ifs_ be merged into one? Like:

> if ( selectedTemplateObj == null && communityTemplateObjs != null && communityTemplateObjs.length > 0)
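Roughly, the merge being proposed would look like the sketch below. The variable names and loop body are taken from the instanceWizard.js hunk above, trimmed to the community-template branch only; whether the remaining nesting in the wizard allows this exact collapse is for the PR author to confirm:

// Merged condition as suggested in the review. Behaviour should be unchanged,
// since the inner block only ran when no featured template had matched.
if (selectedTemplateObj == null && communityTemplateObjs != null && communityTemplateObjs.length > 0) {
    for (var i = 0; i < communityTemplateObjs.length; i++) {
        if (communityTemplateObjs[i].id == args.currentData.templateid) {
            selectedTemplateObj = communityTemplateObjs[i];
            break;
        }
    }
}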
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
Github user GabrielBrascher commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1403#discussion_r52108568 --- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java --- @@ -180,70 +208,119 @@ private Void handleCreateTemplateFromSnapshot(SnapshotInfo snapshotInfo, Templat throw new CloudRuntimeException("This snapshot is not currently in a state where it can be used to create a template."); } -HostVO hostVO = getHost(snapshotInfo.getDataStore().getId()); -DataStore srcDataStore = snapshotInfo.getDataStore(); - -String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString()); -int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue())); -CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), templateInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value()); +HostVO hostVO = getXenServerHost(snapshotInfo); -String errMsg = null; +boolean usingBackendSnapshot = usingBackendSnapshotFor(snapshotInfo); +boolean computeClusterSupportsResign = computeClusterSupportsResign(hostVO.getClusterId()); -CopyCmdAnswer copyCmdAnswer = null; +if (usingBackendSnapshot && !computeClusterSupportsResign) { +throw new CloudRuntimeException("Unable to locate an applicable host with which to perform a resignature operation"); +} try { -_volumeService.grantAccess(snapshotInfo, hostVO, srcDataStore); +if (usingBackendSnapshot) { +createVolumeFromSnapshot(hostVO, snapshotInfo, true); +} -Map srcDetails = getSnapshotDetails(_storagePoolDao.findById(srcDataStore.getId()), snapshotInfo); +DataStore srcDataStore = snapshotInfo.getDataStore(); -copyCommand.setOptions(srcDetails); +String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString()); +int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue())); +CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), templateInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value()); + +String errMsg = null; + +CopyCmdAnswer copyCmdAnswer = null; -copyCmdAnswer = (CopyCmdAnswer)_agentMgr.send(hostVO.getId(), copyCommand); -} -catch (Exception ex) { -throw new CloudRuntimeException(ex.getMessage()); -} -finally { try { -_volumeService.revokeAccess(snapshotInfo, hostVO, srcDataStore); +// If we are using a back-end snapshot, then we should still have access to it from the hosts in the cluster that hostVO is in +// (because we passed in true as the third parameter to createVolumeFromSnapshot above). 
+if (usingBackendSnapshot == false) { +_volumeService.grantAccess(snapshotInfo, hostVO, srcDataStore); +} + +Map srcDetails = getSnapshotDetails(snapshotInfo); + +copyCommand.setOptions(srcDetails); + +copyCmdAnswer = (CopyCmdAnswer)_agentMgr.send(hostVO.getId(), copyCommand); } catch (Exception ex) { -s_logger.debug(ex.getMessage(), ex); +throw new CloudRuntimeException(ex.getMessage()); } - -if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) { -if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) { -errMsg = copyCmdAnswer.getDetails(); +finally { +try { +_volumeService.revokeAccess(snapshotInfo, hostVO, srcDataStore); } -else { -errMsg = "Unable to perform host-side operation"; +catch (Exception ex) { +s_logger.debug(ex.getMessage(), ex); } -} -try { -if (errMsg == null) { -snapshotInfo.processEvent(Event.OperationSuccessed); +if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) { +if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.g
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
Github user GabrielBrascher commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1403#discussion_r52108693 --- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java --- @@ -255,99 +332,149 @@ private Void handleCreateVolumeFromSnapshotBothOnStorageSystem(SnapshotInfo snap VolumeApiResult result = future.get(); +if (volumeDetail != null) { +_volumeDetailsDao.remove(volumeDetail.getId()); +} + if (result.isFailed()) { s_logger.debug("Failed to create a volume: " + result.getResult()); throw new CloudRuntimeException(result.getResult()); } -} -catch (Exception ex) { -throw new CloudRuntimeException(ex.getMessage()); -} - -volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -volumeInfo.processEvent(Event.MigrationRequested); +volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); +volumeInfo.processEvent(Event.MigrationRequested); -HostVO hostVO = getHost(snapshotInfo.getDataStore().getId()); +volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString()); -int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue())); -CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), volumeInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value()); +if (useCloning) { +copyCmdAnswer = performResignature(volumeInfo, hostVO); +} +else { +// asking for a XenServer host here so we don't always prefer to use XenServer hosts that support resigning +// even when we don't need those hosts to do this kind of copy work +hostVO = getXenServerHost(snapshotInfo.getDataCenterId(), false); -CopyCmdAnswer copyCmdAnswer = null; +copyCmdAnswer = performCopyOfVdi(volumeInfo, snapshotInfo, hostVO); +} -try { -_volumeService.grantAccess(snapshotInfo, hostVO, snapshotInfo.getDataStore()); -_volumeService.grantAccess(volumeInfo, hostVO, volumeInfo.getDataStore()); +if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) { +if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) { +errMsg = copyCmdAnswer.getDetails(); +} +else { +errMsg = "Unable to perform host-side operation"; +} +} +} +catch (Exception ex) { +errMsg = ex.getMessage() != null ? ex.getMessage() : "Copy operation failed"; +} -Map srcDetails = getSnapshotDetails(_storagePoolDao.findById(snapshotInfo.getDataStore().getId()), snapshotInfo); +CopyCommandResult result = new CopyCommandResult(null, copyCmdAnswer); -copyCommand.setOptions(srcDetails); +result.setResult(errMsg); -Map destDetails = getVolumeDetails(volumeInfo); +callback.complete(result); +} -copyCommand.setOptions2(destDetails); +// If the underlying storage system is making use of read-only snapshots, this gives the storage system the opportunity to +// create a volume from the snapshot so that we can copy the VHD file that should be inside of the snapshot to secondary storage. +// +// The resultant volume must be writable because we need to resign the SR and the VDI that should be inside of it before we copy +// the VHD file to secondary storage. 
+// +// If the storage system is using writable snapshots, then nothing need be done by that storage system here because we can just +// resign the SR and the VDI that should be inside of the snapshot before copying the VHD file to secondary storage. +private void createVolumeFromSnapshot(HostVO hostVO, SnapshotInfo snapshotInfo, boolean keepGrantedAccess) { --- End diff -- @mike-tutkowski Could you please use Javadoc (`/** */`) instead of comments (`// `)? As you already wrote a good documenta
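As a rough illustration of the Javadoc change being requested here, the existing // comments could be lifted onto the method like this. The text and the method signature are taken verbatim from the diff; the method body is omitted and left unchanged:

/**
 * If the underlying storage system is making use of read-only snapshots, this gives the storage system the
 * opportunity to create a volume from the snapshot so that we can copy the VHD file that should be inside of
 * the snapshot to secondary storage.
 * <p>
 * The resultant volume must be writable because we need to resign the SR and the VDI that should be inside of
 * it before we copy the VHD file to secondary storage.
 * <p>
 * If the storage system is using writable snapshots, then nothing need be done by that storage system here
 * because we can just resign the SR and the VDI that should be inside of the snapshot before copying the VHD
 * file to secondary storage.
 */
private void createVolumeFromSnapshot(HostVO hostVO, SnapshotInfo snapshotInfo, boolean keepGrantedAccess) {
    // existing implementation unchanged
}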
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
Github user GabrielBrascher commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1403#discussion_r52108776 --- Diff: engine/storage/datamotion/src/org/apache/cloudstack/storage/motion/StorageSystemDataMotionStrategy.java --- @@ -255,99 +332,149 @@ private Void handleCreateVolumeFromSnapshotBothOnStorageSystem(SnapshotInfo snap VolumeApiResult result = future.get(); +if (volumeDetail != null) { +_volumeDetailsDao.remove(volumeDetail.getId()); +} + if (result.isFailed()) { s_logger.debug("Failed to create a volume: " + result.getResult()); throw new CloudRuntimeException(result.getResult()); } -} -catch (Exception ex) { -throw new CloudRuntimeException(ex.getMessage()); -} - -volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -volumeInfo.processEvent(Event.MigrationRequested); +volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); +volumeInfo.processEvent(Event.MigrationRequested); -HostVO hostVO = getHost(snapshotInfo.getDataStore().getId()); +volumeInfo = _volumeDataFactory.getVolume(volumeInfo.getId(), volumeInfo.getDataStore()); -String value = _configDao.getValue(Config.PrimaryStorageDownloadWait.toString()); -int primaryStorageDownloadWait = NumbersUtil.parseInt(value, Integer.parseInt(Config.PrimaryStorageDownloadWait.getDefaultValue())); -CopyCommand copyCommand = new CopyCommand(snapshotInfo.getTO(), volumeInfo.getTO(), primaryStorageDownloadWait, VirtualMachineManager.ExecuteInSequence.value()); +if (useCloning) { +copyCmdAnswer = performResignature(volumeInfo, hostVO); +} +else { +// asking for a XenServer host here so we don't always prefer to use XenServer hosts that support resigning +// even when we don't need those hosts to do this kind of copy work +hostVO = getXenServerHost(snapshotInfo.getDataCenterId(), false); -CopyCmdAnswer copyCmdAnswer = null; +copyCmdAnswer = performCopyOfVdi(volumeInfo, snapshotInfo, hostVO); +} -try { -_volumeService.grantAccess(snapshotInfo, hostVO, snapshotInfo.getDataStore()); -_volumeService.grantAccess(volumeInfo, hostVO, volumeInfo.getDataStore()); +if (copyCmdAnswer == null || !copyCmdAnswer.getResult()) { +if (copyCmdAnswer != null && copyCmdAnswer.getDetails() != null && !copyCmdAnswer.getDetails().isEmpty()) { +errMsg = copyCmdAnswer.getDetails(); +} +else { +errMsg = "Unable to perform host-side operation"; +} +} +} +catch (Exception ex) { +errMsg = ex.getMessage() != null ? ex.getMessage() : "Copy operation failed"; +} -Map srcDetails = getSnapshotDetails(_storagePoolDao.findById(snapshotInfo.getDataStore().getId()), snapshotInfo); +CopyCommandResult result = new CopyCommandResult(null, copyCmdAnswer); -copyCommand.setOptions(srcDetails); +result.setResult(errMsg); -Map destDetails = getVolumeDetails(volumeInfo); +callback.complete(result); +} -copyCommand.setOptions2(destDetails); +// If the underlying storage system is making use of read-only snapshots, this gives the storage system the opportunity to +// create a volume from the snapshot so that we can copy the VHD file that should be inside of the snapshot to secondary storage. +// +// The resultant volume must be writable because we need to resign the SR and the VDI that should be inside of it before we copy +// the VHD file to secondary storage. 
+// +// If the storage system is using writable snapshots, then nothing need be done by that storage system here because we can just +// resign the SR and the VDI that should be inside of the snapshot before copying the VHD file to secondary storage. +private void createVolumeFromSnapshot(HostVO hostVO, SnapshotInfo snapshotInfo, boolean keepGrantedAccess) { +SnapshotDetailsVO snapshotDetails = handleSnapshotDetails(snapshotInfo.getId(), "tempVolume", "create"); -copyCmdAnswer =
[GitHub] cloudstack pull request: Taking fast and efficient volume snapshot...
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/1403#discussion_r52108928

--- Diff: plugins/hypervisors/xenserver/src/com/cloud/hypervisor/xenserver/resource/CitrixResourceBase.java ---
@@ -168,7 +168,9 @@ public abstract class CitrixResourceBase implements ServerResource, HypervisorResource, VirtualRouterDeployer {
public enum SRType {
-EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI, NFS;
+// RELVMOISCSI = used for resigning metadata (like SR UUID and VDI UUID when a
+// particular storage manager is installed on a XenServer host (for back-end snapshots to work))
+EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI, RELVMOISCSI, NFS;
--- End diff --

@mike-tutkowski Sorry if I am being too repetitive. As an idea, these comments might serve well as a Javadoc block documenting the enum class. If you are friendly to this idea, the same could be done with the **org.apache.cloudstack.engine.subsystem.api.storage.DataStoreCapabilities** enum class. Thanks.
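A sketch of how that could look if the comment were promoted to Javadoc on the new constant. The enum values and the wording come from the diff above; the rest of SRType's members and methods in CitrixResourceBase are left out for brevity:

public enum SRType {
    EXT, FILE, ISCSI, ISO, LVM, LVMOHBA, LVMOISCSI,
    /**
     * Used for resigning metadata (like SR UUID and VDI UUID) when a particular storage manager is
     * installed on a XenServer host, so that back-end snapshots work.
     */
    RELVMOISCSI,
    NFS;
}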
[GitHub] cloudstack pull request: CLOUDSTACK-9120 READ.ME files describing ...
Github user GabrielBrascher commented on the pull request: https://github.com/apache/cloudstack/pull/1202#issuecomment-180890324 LGTM. Based on the lack of code.
[GitHub] cloudstack pull request: Bug-ID: CLOUDSTACK-8870: Skip external de...
Github user GabrielBrascher commented on a diff in the pull request:

https://github.com/apache/cloudstack/pull/846#discussion_r52110260

--- Diff: server/src/com/cloud/network/ExternalDeviceUsageManagerImpl.java ---
@@ -342,6 +342,12 @@ public ExternalDeviceNetworkUsageTask() {
@Override
protected void runInContext() {
+//Check if there are any external deivces
+//Skip external device usage collection if none exist
--- End diff --

@kishankavala Could you please change "deivces" to "devices"? Also, you could use these commented lines as a Javadoc block describing the runInContext() method (if you think it would improve your code). Except for that typo, your code seems ok.
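One possible shape for that change, with the typo fixed and the comment promoted to Javadoc as suggested. The @Override method and its enclosing task class come from the diff; the usage-collection logic itself is elided here:

/**
 * Checks whether any external devices exist and skips external device usage
 * collection if none do.
 */
@Override
protected void runInContext() {
    // existing usage-collection logic unchanged
}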
best practices in usage server
Hi,

I have a 2-node CS cluster with a separate MariaDB cluster based on Galera as the database.

What is your recommendation for installing the CS usage server? Should I install it on a separate server or on the CS nodes? Is it necessary to install it on both CS nodes?

Regards