Kunalbehbud opened a new pull request, #13084:
URL: https://github.com/apache/cloudstack/pull/13084
### Description
This PR fixes live storage migration for KVM instances whose migrated
volumes are backed by direct-download templates.
The problematic path is linked-clone live storage migration. For that mode,
libvirt expects the rest of the backing chain to already exist on the
destination and to match the source. That assumption is not safe for
direct-download backed volumes, where the template may have bypassed
secondary storage and been staged directly on primary storage. When such a
volume is part of the actual migration set, this PR forces the KVM migration
request to use full-clone storage migration instead.
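The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `VolumeSketch`, `MigrationModeSketch`, and the `skipped` flag are hypothetical stand-ins for the real volume objects and for whatever check excludes a volume from the migration request.

```java
import java.util.List;

/** Hypothetical, simplified view of one volume in a migration request. */
final class VolumeSketch {
    final String name;
    final boolean directDownload; // backing template is flagged direct-download
    final boolean skipped;        // excluded from the request, e.g. a same-pool volume

    VolumeSketch(String name, boolean directDownload, boolean skipped) {
        this.name = name;
        this.directDownload = directDownload;
        this.skipped = skipped;
    }
}

final class MigrationModeSketch {
    enum Mode { LINKED_CLONE, FULL_CLONE }

    /**
     * Picks one mode for the whole request: full clone as soon as any volume
     * that is actually migrated is direct-download backed, otherwise the
     * mode that was requested in the first place.
     */
    static Mode choose(Mode requested, List<VolumeSketch> volumes) {
        for (VolumeSketch v : volumes) {
            if (!v.skipped && v.directDownload) {
                return Mode.FULL_CLONE;
            }
        }
        return requested;
    }
}
```

Under these assumptions, a request with one direct-download root disk and one normal data disk resolves to `FULL_CLONE` for both disks, while a request whose only direct-download volume is skipped keeps the requested mode.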
A few related edge cases are handled in the same path:
- the full-clone decision ignores volumes that are excluded from the
migration request, such as same-pool PowerFlex volumes or volumes rejected by
`shouldMigrateVolume`
- `VolumeDataFactoryImpl` now carries the template `directDownload` flag
consistently across its `getVolume(...)` variants, including volumes whose
template was removed later
- copied template references are updated instead of blindly persisted, and
missing source references now fail with a useful error instead of a later NPE
- `ModifyTargetsCommand` failures preserve the agent error details, and empty
connected-path answers fail with a clearer exception
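As an illustration of the last item, the answer handling might look something like the sketch below. The classes here are self-contained, hypothetical stand-ins rather than the real CloudStack `Answer` and `ModifyTargetsAnswer` types, and `IllegalStateException` stands in for whatever exception type the PR actually throws.

```java
import java.util.List;

/** Hypothetical stand-in for a generic agent answer. */
class AnswerSketch {
    final boolean result;
    final String details;

    AnswerSketch(boolean result, String details) {
        this.result = result;
        this.details = details;
    }
}

/** Hypothetical stand-in for an answer that also carries connected paths. */
final class TargetsAnswerSketch extends AnswerSketch {
    final List<String> connectedPaths;

    TargetsAnswerSketch(boolean result, String details, List<String> connectedPaths) {
        super(result, details);
        this.connectedPaths = connectedPaths;
    }
}

final class TargetsAnswerCheck {
    /** Returns the connected paths, or fails with a message that names the cause. */
    static List<String> connectedPathsOrFail(AnswerSketch answer) {
        if (answer == null) {
            throw new IllegalStateException("No answer received for the modify-targets request");
        }
        if (!answer.result) {
            // Keep the agent-side details instead of replacing them with a generic message.
            throw new IllegalStateException("Modify-targets request failed: " + answer.details);
        }
        if (!(answer instanceof TargetsAnswerSketch)) {
            throw new IllegalStateException("Unexpected answer type: " + answer.getClass().getSimpleName());
        }
        List<String> paths = ((TargetsAnswerSketch) answer).connectedPaths;
        if (paths == null || paths.isEmpty()) {
            throw new IllegalStateException("Modify-targets request returned no connected paths");
        }
        return paths;
    }
}
```

The point of the sketch is the ordering: the failed-result check runs before the type check, so a generic failed answer still surfaces the agent's own error text.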
This keeps the change scoped to the KVM storage migration bug. It does not
try to mix linked-clone and full-clone per disk, since the current KVM/libvirt
migration command chooses the storage migration mode for the VM migration
request as a whole.
### Types of changes
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] Build/CI
- [ ] Test (unit or integration test code)
### Feature/Enhancement Scale or Bug Severity
#### Feature/Enhancement Scale
- [ ] Major
- [ ] Minor
#### Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [x] Major
- [ ] Minor
- [ ] Trivial
### Screenshots (if appropriate):
N/A. This is a backend KVM storage migration fix.
### How Has This Been Tested?
Targeted unit tests:
```bash
mvn -pl engine/storage/datamotion,engine/storage/volume -am \
-Dtest=StorageSystemDataMotionStrategyTest,KvmNonManagedStorageSystemDataMotionTest,VolumeDataFactoryImplTest \
-DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false test
```
Result: 63 tests run, 0 failures, build success.
Manual verification was also done in a two-host KVM 4.22 test environment:
- registered a direct-download Rocky Linux template
- deployed an instance from that template
- live migrated the instance with storage from host A to host B
- live migrated it back from host B to host A
- confirmed the VM stayed running after each migration
- confirmed management logs showed the expected full-clone fallback for the
direct-download backed volume
#### How did you try to break this feature and the system with this change?
Covered the boundary cases that are most likely to regress this path:
- mixed migration requests with a direct-download volume and a normal volume
- direct-download volumes that are skipped and should not force the rest of
the request to full clone
- existing and missing copied template references
- destination template references whose backing path differs from the source
backing path
- `ModifyTargetsCommand` returning a generic failed `Answer`
- `ModifyTargetsAnswer` returning no connected paths
- volumes with no template ID, to avoid unnecessary template lookups