Hey Jorge,
Nice to see a fellow user around!
Neither method worked with a volume of 1.1 TB. Do they do the same
thing?
Both methods have different validations; however, essentially they do
the same thing: while the VM is stopped, the volume is copied to the
secondary storage and then to the primary storage. On the other hand,
when the VM is running, ACS copies the volume directly to the
destination pool.

Could you try migrating these volumes while the VM is still running
(using the API *migrateVirtualMachineWithVolume*)? In this scenario,
the migration would not copy the volumes to the secondary storage;
thus, it would be faster and reduce the stress/load on your network
and storage systems. Let me know if this option works for you or if
you have any doubts about how to use live migration with KVM.
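For reference, here is a minimal sketch of that call with CloudMonkey
(assuming cmk is installed and configured; all UUIDs below are
placeholders for your VM, a destination host, and pool3):

    # live-migrate the VM and send its volume(s) to pool3 in one call
    cmk migrate virtualmachinewithvolume \
        virtualmachineid=<vm-uuid> \
        hostid=<destination-host-uuid> \
        migrateto[0].volume=<volume-uuid> \
        migrateto[0].pool=<pool3-uuid>

The migrateto map tells ACS which destination pool each volume should
land on.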
Besides that, we have seen some problems when this migration process
does not finish properly: it leaves leftover files in the storage
pool, consuming valuable storage resources, and it can create database
inconsistencies. It is worth taking a look at the storage pool for
these files and also validating the database, to see if
inconsistencies were created there.
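As a starting point for that check, here is a rough sketch (pool_id =
8 comes from the error you posted; the mount path is a placeholder)
comparing what the database lists for the pool with what is actually
on disk:

    -- volumes CloudStack still considers active on pool 8 (pool3)
    SELECT id, name, path, state
      FROM cloud.volumes
     WHERE pool_id = 8 AND removed IS NULL;

    # qcow2 files actually present on the pool3 NFS mount
    ls -lh /mnt/<pool3-mount-uuid>/*.qcow2

Any file on disk whose name does not appear in the path column is a
candidate leftover.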
Best regards,
Bryan
On 26/04/2023 16:24, Jorge Luiz Correa wrote:
Has anyone had problems when migrating "big" volumes between different
pools? I have 3 storage pools. The overprovisioning factor was
configured as 2.0 (default) and pool2 got full. So, I configured the
factor as 1.0 and then had to move some volumes from pool2 to pool3.
CS 4.17.2.0, Ubuntu 22.04 LTS. I'm using KVM with NFS. Same zone, same pod,
same cluster. All hosts (hypervisors) had all 3 pools mounted. I've tried
two ways:
1) From the instance details page, with the instance stopped, using
the option "Migrate instance to another primary storage" (when the
instance is running, this option is named "Migrate instance to another
host"). Then, I marked "Migrate all volume(s) of the instance to a
single primary storage" and chose the destination primary storage
pool3.
2) From the volume details page, with the instance stopped, using the
option "Migrate volume" and then selecting the destination primary
storage pool3.
Neither method worked with a volume of 1.1 TB. Do they do the same
thing?
Looking at the host that executes the action, I can see that it mounts
the secondary storage and starts a "qemu-img convert" process to
generate a new volume. After some time (about 3 hours) and copying
1.1 TB, the process fails with:
com.cloud.utils.exception.CloudRuntimeException: Resource [StoragePool:8]
is unreachable: Migrate volume failed:
com.cloud.utils.exception.CloudRuntimeException: Failed to copy
/mnt/4be0a812-1d87-376f-9e72-db79206a796c/565fa2dd-ff14-4b28-a5d0-dbe88b860ee9
to d3d5a858-285c-452b-b33f-c152c294711b.qcow2
I checked in the database that StoragePool:8 is pool3, the
destination. After failing, the async job finishes, but the new qcow2
file remains on the secondary storage, orphaned.
So, the host is saying it can't access pool3. BUT, this pool is
mounted! There are other VMs running that use pool3. And I've
successfully migrated many other VMs using 1) or 2), but those VMs had
up to 100 GB.
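A quick way to double-check reachability from that host (the mount
path is a placeholder for the pool3 mount point):

    # confirm the pool3 NFS export is mounted, has free space, and is
    # writable from this host
    mount | grep nfs
    df -h /mnt/<pool3-mount-uuid>
    touch /mnt/<pool3-mount-uuid>/.probe && \
        rm /mnt/<pool3-mount-uuid>/.probe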
I'm using:
job.cancel.threshold.minutes: 480
migratewait: 28800
storage.pool.max.waitseconds: 28800
wait: 28800
so there are no log messages about timeouts.
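These are global settings; the effective value of each can be
confirmed with CloudMonkey, e.g.:

    # verify one of the timeout settings
    cmk list configurations name=migratewait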
Any help?
Thank you :)