Ok, I see what happened there with the cross-cluster migration. When I re-did the code, I did not have this case in mind, and therefore I did not notice that the old code was handling it (convoluted code, lack of documentation, and so on; we all know the story). I will fix it.
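Just so we are on the same page about the cross-cluster part, what I have in mind is roughly the sketch below. It is untested and only illustrative; the names mirror the snippets in your message (quoted below), and the diskOffering parameter is only there for the sketch (in the real code it would be looked up from the volume). The idea is to keep the current pool when the target host can still reach it, and otherwise fall back to the storage pool allocators, as the old getPoolListForVolumesForMigration did:

    // Rough sketch only: keep the current pool when it is usable from the target
    // cluster; otherwise ask the allocators for a pool in the destination cluster,
    // as the old getPoolListForVolumesForMigration did.
    private void createVolumeToStoragePoolMappingIfNeeded(VirtualMachineProfile profile, Host targetHost,
            Map<Volume, StoragePool> volumeToPoolObjectMap, VolumeVO volume, StoragePoolVO currentPool,
            DiskOfferingVO diskOffering) {
        // A cluster-scoped pool in another cluster is not visible to the target host.
        boolean currentPoolUsable = !ScopeType.CLUSTER.equals(currentPool.getScope())
                || targetHost.getClusterId().equals(currentPool.getClusterId());
        if (currentPoolUsable) {
            volumeToPoolObjectMap.put(volume, currentPool);
            return;
        }
        final DiskProfile diskProfile = new DiskProfile(volume, diskOffering, profile.getHypervisorType());
        final DataCenterDeployment plan = new DataCenterDeployment(targetHost.getDataCenterId(), targetHost.getPodId(),
                targetHost.getClusterId(), targetHost.getId(), null, null);
        final ExcludeList avoid = new ExcludeList();
        for (final StoragePoolAllocator allocator : _storagePoolAllocators) {
            final List<StoragePool> poolList = allocator.allocateToPool(diskProfile, profile, plan, avoid,
                    StoragePoolAllocator.RETURN_UPTO_ALL);
            if (poolList != null && !poolList.isEmpty()) {
                // Pick one of the pools that the hosts in the target cluster can see.
                volumeToPoolObjectMap.put(volume, poolList.get(0));
                return;
            }
        }
        throw new CloudRuntimeException("Unable to find a suitable storage pool in the target cluster for volume " + volume);
    }

That way the default mapping keeps today's behavior for intra-cluster migrations and only calls the allocators for the cross-cluster case.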
Regarding the managed storage issue, can you describe the “special handling” you need? Are you talking about this?

> if (destPool.getId() == currentPool.getId()) {
>     volumeToPoolObjectMap.put(volume, currentPool);
> } else {
>     throw new CloudRuntimeException("Currently, a volume on managed storage can only be 'migrated' to itself.");
> }

That is a simple validation, right? A validation that throws an exception if the user tries to migrate the volume to some other storage pool. Is that it? If that is the case, the default method “getDefaultMappingOfVolumesAndStoragePoolForMigration” already takes care of this, meaning it will not try to move the volume to another storage pool. On the other hand, we then need to add a validation in the “createMappingVolumeAndStoragePoolEnteredByUser” method. I will wait for your feedback before starting to code. Thanks for spotting this issue.

On Sun, Jul 15, 2018 at 9:11 PM, Tutkowski, Mike <mike.tutkow...@netapp.com> wrote:

> Hi Rafael,
>
> Thanks for your time on this.
>
> Here is an example where the new code deviates from the old code in a
> critical fashion (code right below is new):
>
>     private Map<Volume, StoragePool> getDefaultMappingOfVolumesAndStoragePoolForMigration(VirtualMachineProfile profile, Host targetHost) {
>         Map<Volume, StoragePool> volumeToPoolObjectMap = new HashMap<Volume, StoragePool>();
>         List<VolumeVO> allVolumes = _volsDao.findUsableVolumesForInstance(profile.getId());
>         for (VolumeVO volume : allVolumes) {
>             StoragePoolVO currentPool = _storagePoolDao.findById(volume.getPoolId());
>             if (ScopeType.HOST.equals(currentPool.getScope())) {
>                 createVolumeToStoragePoolMappingIfNeeded(profile, targetHost, volumeToPoolObjectMap, volume, currentPool);
>             } else {
>                 volumeToPoolObjectMap.put(volume, currentPool);
>             }
>         }
>         return volumeToPoolObjectMap;
>     }
>
> What happens in the new code (above) is that if the user didn’t pass in a
> storage pool to migrate the virtual disk to (but the VM is being migrated
> to a new cluster), this code just assigns the virtual disk to its current
> storage pool (which is not going to be visible to any of the hosts in the
> new compute cluster).
>
> In the old code (I’m looking at 4.11.3 here), you could look around line
> 2337 for the following code (in the
> VirtualMachineManagerImpl.getPoolListForVolumesForMigration method):
>
>     // Find a suitable pool for the volume. Call the storage pool allocator to find the list of pools.
>     final DiskProfile diskProfile = new DiskProfile(volume, diskOffering, profile.getHypervisorType());
>     final DataCenterDeployment plan = new DataCenterDeployment(host.getDataCenterId(), host.getPodId(), host.getClusterId(),
>             host.getId(), null, null);
>
>     final List<StoragePool> poolList = new ArrayList<>();
>     final ExcludeList avoid = new ExcludeList();
>
>     for (final StoragePoolAllocator allocator : _storagePoolAllocators) {
>         final List<StoragePool> poolListFromAllocator = allocator.allocateToPool(diskProfile, profile, plan, avoid, StoragePoolAllocator.RETURN_UPTO_ALL);
>
>         if (poolListFromAllocator != null && !poolListFromAllocator.isEmpty()) {
>             poolList.addAll(poolListFromAllocator);
>         }
>     }
>
> This old code would find an applicable storage pool in the destination
> cluster (one that can be seen by the hosts in that compute cluster).
>
> I think the main error in the new logic is the assumption that a VM can
> only be migrated to a host in the same compute cluster.
> For XenServer (perhaps for other hypervisor types?), we support
> cross-cluster VM migration.
>
> The other issue I noticed is that there is no logic in the new code that
> checks for managed-storage use cases. If you look in the
> VirtualMachineManagerImpl.getPoolListForVolumesForMigration method in the
> old code, there is special handling for managed storage. I don’t see this
> reproduced in the new logic.
>
> I sympathize with your point that all tests passed yet this issue was not
> uncovered. Unfortunately, I suspect we have a fairly low % coverage of
> automated tests on CloudStack. If we ever did get to a high % of automated
> test coverage, we might be able to spin up new releases more frequently.
> As the case stands today, however, there are probably many untested use
> cases when it comes to our automated suite of tests.
>
> Thanks again!
> Mike
>
> On 7/15/18, 4:19 PM, "Rafael Weingärtner" <rafaelweingart...@gmail.com> wrote:
>
> Mike, are you able to pin-point in the old/replaced code the bit that was
> handling your use case? I took the most care not to break anything.
> Also, your test case, isn't it in the ACS integration test suite? In
> theory, all tests passed when we merged the PR.
>
> I sure can take a look at it. Can you detail your use case? I mean, the
> high-level execution flow: what API methods you call, what you expected to
> happen, and what is happening today.
>
> On Sun, Jul 15, 2018 at 3:25 AM, Tutkowski, Mike <mike.tutkow...@netapp.com> wrote:
>
> > It looks like this is the problematic PR:
> >
> > https://github.com/apache/cloudstack/pull/2425/
> >
> > On 7/15/18, 12:20 AM, "Tutkowski, Mike" <mike.tutkow...@netapp.com> wrote:
> >
> > Hi,
> >
> > While running managed-storage regression tests tonight, I noticed a
> > problem that is not related to managed storage.
> >
> > CLOUDSTACK-10240 is a ticket asking that we allow the migration of a
> > virtual disk that’s on local storage to shared storage. In the process of
> > enabling this feature, the
> > VirtualMachineManagerImpl.getPoolListForVolumesForMigration method was
> > re-written in a way that completely breaks at least one use case:
> > migrating a VM across compute clusters (at least supported in XenServer).
> > If, say, a virtual disk resides on shared storage in the source compute
> > cluster, we must be able to copy this virtual disk to shared storage in
> > the destination compute cluster.
> >
> > As the code is currently written, this is no longer possible. It also
> > seems that the managed-storage logic has been dropped for some reason in
> > the new implementation.
> >
> > Rafael – It seems that you worked on this feature. Would you be able
> > to look into this and create a PR?
> >
> > Thanks,
> > Mike
>
> --
> Rafael Weingärtner


--
Rafael Weingärtner